Version 63, Scholarly Electronic Publishing Bibliography

Version 63 of the Scholarly Electronic Publishing Bibliography is now available. This selective bibliography presents over 2,730 articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet.

The PDF version of SEPB is now produced annually. The 2005 PDF file is available (Version 60, published 12/9/2005).

The Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals, by the same author, provides much more in-depth coverage of the open access movement and related topics (e.g., disciplinary archives, e-prints, institutional repositories, open access journals, and the Open Archives Initiative) than SEPB does.

The "Open Access Webliography" (with Ho) complements the OAB, providing access to a number of Websites related to open access topics.

Changes in This Version

The bibliography has the following sections (revised sections are in italics):

Table of Contents

1 Economic Issues
2 Electronic Books and Texts
2.1 Case Studies and History
2.2 General Works
2.3 Library Issues
3 Electronic Serials
3.1 Case Studies and History
3.2 Critiques
3.3 Electronic Distribution of Printed Journals
3.4 General Works
3.5 Library Issues
3.6 Research
4 General Works
5 Legal Issues
5.1 Intellectual Property Rights
5.2 License Agreements
5.3 Other Legal Issues
6 Library Issues
6.1 Cataloging, Identifiers, Linking, and Metadata
6.2 Digital Libraries
6.3 General Works
6.4 Information Integrity and Preservation
7 New Publishing Models
8 Publisher Issues
8.1 Digital Rights Management
9 Repositories, E-Prints, and OAI
Appendix A. Related Bibliographies
Appendix B. About the Author
Appendix C. SEPB Use Statistics

Scholarly Electronic Publishing Resources includes the following sections:

Cataloging, Identifiers, Linking, and Metadata
Digital Libraries
Electronic Books and Texts
Electronic Serials
General Electronic Publishing
Images
Legal
Preservation
Publishers
Repositories, E-Prints, and OAI
SGML and Related Standards

Further Information about SEPB

The HTML version of SEPB is designed for interactive use. Each major section is a separate file. There are links to sources that are freely available on the Internet. It can be searched using Boolean operators.

The HTML document includes three sections not found in the Acrobat file:

  1. Scholarly Electronic Publishing Weblog (biweekly list of new resources; also available by mailing list and RSS feed)
  2. Scholarly Electronic Publishing Resources (directory of over 270 related Web sites)
  3. Archive (prior versions of the bibliography)

The 2005 annual PDF file is designed for printing. The printed bibliography is over 210 pages long. The PDF file is over 560 KB.

Related Article

An article about the bibliography has been published in The Journal of Electronic Publishing.

More on ALA and Open Access

Peter Suber has clarified ALA’s stance on open access in the following excerpt from an Open Access News posting:

Comment. This is the most detailed discussion I’ve seen of this question. You should read the whole thing, as I’ve had to omit most of the detail on which Charles’ conclusion rests. I’d only add that (1) the ALA Washington office has a page on OA, (2) the ALA Council adopted a resolution in support of FRPAA at its June 2006 annual meeting, and (3) the ALA has signed on to several public statements in support of OA, most recently a July 12 letter in support of FRPAA and a May 31 letter in support of the EC report on OA.

To further clarify this matter, FRPAA (Federal Research Public Access Act of 2006) and the European Commission’s Study on the Economic and Technical Evolution of the Scientific Publication Markets in Europe both deal with open access to publicly-funded research. This is certainly a major open access issue; however, ALA journals are unlikely to publish a high percentage of papers that result from such publicly-funded research. Consequently, the direct impact of FRPAA or, especially, the EC report on ALA’s journal publishing operations is likely to be minimal.

In contrast to this support for FRPAA and the EC report, ALA has not signed the "Budapest Open Access Initiative" (as other library organizations such as the Association of Academic Health Sciences Libraries, ALA’s Association of College and Research Libraries Division, the Association of Research Libraries, and the Canadian Association of Research Libraries have), the "Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities," or the "Washington DC Principles for Free Access to Science" (as many association publishers have).

The path from the ALA home page to the Washington Office page is: Home –> Washington Office –> Issues –> Copyright Issues –> Open Access to Research. The ALA Web site is quite large and deep, and one would not expect an OA page to be on the top level. The question is: Can this page be found by someone who doesn’t know that open access is a Washington Office concern? It appears that issues of primary concern to ALA are under the home page heading "Issues & Advocacy" (Home –> Issues & Advocacy).

Whether ALA provides more active support for the open access movement and its reform strategies is, of course, up to its officers and members. These two postings on the matter have been descriptive, not prescriptive. Further clarifications to ALA’s stance on open access or discussion of it are welcome, and can be submitted as comments to either posting.

The American Library Association and Open Access

Does the American Library Association (ALA) support open access and, if so, are its journal publishing practices congruent with open access journal publishing and self-archiving?

What is the American Library Association?

For non-librarians, a brief overview of the American Library Association (ALA) from its Web site may be helpful before considering its open access policies and practices:

The American Library Association is the oldest and largest library association in the world, with more than 64,000 members. Its mission is to promote the highest quality library and information services and public access to information.

ALA’s Mission and Strategic Plans

Several key documents outline ALA’s mission and strategic goals:

Although the ALA’s mission and goals have become less library-centric over time, there is no explicit statement of support for open access in any of these documents.

ALA Memberships in Organizations That Support Open Access Initiatives

ALA is a member of at least two organizations that support open access initiatives: (1) the Alliance for Taxpayer Access (ATA), and (2) the SPARC Open Access Working Group. (ALA is not a member of SPARC, but its Association of College and Research Libraries division, known as ACRL, is.)

Information about these ALA memberships certainly exists on the ALA Web site, but it is deeply buried and difficult to find by navigating the site’s menu structure (see the Google site searches for ATA and the SPARC Open Access Working Group).

ALA’s Journal Copyright Agreements

ALA has two copyright agreements: (1) Copyright Assignment Agreement and (2) Copyright License Agreement.

In the Copyright Assignment Agreement, the author: "hereby grants to Publisher all right, title and interest in and to the Work, including copyright in all means of expression by any method now known or hereafter developed, including electronic format." ALA then grants back to the author one broad self-archiving right: "The right to use and distribute the Work on the Author’s Web site." It also grants a narrow right: "The right to use and distribute the Work internally at the Author’s place of employment, and for promotional and any other non-commercial purposes." Authors who use this agreement cannot self-archive in public sections of institutional repositories or in disciplinary archives.

In the Copyright License Agreement, the author retains copyright and then grants to ALA the rights needed to publish the article, with the only restriction on the author being that: "Author agrees not to publish the Work in print form prior to the publication of the Work by the Publisher." Authors who can choose this option can self-archive wherever they want.

ALA’s Journals

ALA publishes a number of serials. This section only considers its major journals. Since it is impossible to determine from their Web sites if some ALA journals are peer-reviewed, there has been no effort to distinguish peer-reviewed journals from those with other editorial policies.

  1. Children and Libraries: The Journal of the Association for Library Service to Children: The Web site provides no table of contents information or online access at all, although there is a link that says: "Click here to subscribe to Children and Libraries online now!" The Policies and Procedures page says: "All material in CAL is subject to copyright by ALA and may be reprinted or photocopied and distributed for the noncommercial purpose of educational or scientific advancement." There are no links to the ALA copyright forms. Verdict: Not an open access journal and, since it is unclear whether the Copyright License Agreement is accepted, it may only support limited self-archiving.
  2. College & Research Libraries: There is a six-month embargo period, after which issues are freely available at the Web site. Volume 57 (1996) through volume 66 (2005) are freely available. The journal page solely links to the Copyright License Agreement. Verdict: Not an open access journal, but fully supports self-archiving.
  3. Information Technology and Libraries: Recently, the journal’s access policy changed. There will be a six-month embargo period, after which issues will be freely available at the Web site. Selected articles from volume 20 (2001) through volume 23 (2004) are freely available. The home page links to both ALA copyright agreements. Verdict: Not an open access journal, but fully supports self-archiving. (Disclosure: Since ALA Annual, I have been on the Editorial Board.)
  4. Library Administration and Management: Web site only provides access to table of contents information. No discussion of copyright agreements in Author Instructions. Verdict: Not an open access journal and, since it is unclear whether the Copyright License Agreement is accepted, it may only support limited self-archiving.
  5. Library Resources & Technical Services: Web site provides access to table of contents information and the full-text of volumes 44 (2000) through 46 (2002). Instructions to Authors page links to both ALA copyright agreements. Verdict: Not an open access journal, but fully supports self-archiving.
  6. Public Libraries: Web site provides free access to volume 42 (2002) through volume 44 (2005). There is no discussion of copyright in the Public Libraries Editorial Guidelines page, and there are no links to the ALA copyright forms. Verdict: Not an open access journal and, since it is unclear whether the Copyright License Agreement is accepted, it may only support limited self-archiving.
  7. RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage: Volume 13 (1998) (of the prior journal) through volume 6 (2005) are freely available. No links to ALA copyright agreements. Guidelines for Submission of Articles to RBM page states: "Articles published in RBM are copyrighted by the American Library Association, and subsequent inquiries for reprinting articles are referred to the ALA Office of Rights and Permissions." Verdict: Not an open access journal. Since it is unclear whether the Copyright License Agreement is accepted, it may only support limited self-archiving.
  8. Reference & User Services Quarterly: Web site only provides access to table of contents information and article abstracts. Information for Authors, Advertisers, and Subscriptions page links to both ALA copyright agreements. Verdict: Not an open access journal, but fully supports self-archiving.
  9. School Library Media Research: Web site provides free access to all issues. Manuscript Policy page links to both ALA copyright agreements. Verdict: An open access journal under the most liberal definition of that term (i.e., free, immediate access without using a Creative Commons Attribution License or similar license) that fully supports self-archiving.
  10. Young Adult Library Services: Web site only provides access to table of contents information. Author Guidelines page states: "A manuscript published in the journal is subject to copyright by ALA for Young Adult Library Services." There are no links to the ALA copyright forms. Verdict: Not an open access journal and, since it is unclear whether the Copyright License Agreement is accepted, it may only support limited self-archiving.

It should be noted that the Science and Technology Section of ACRL publishes Issues in Science & Technology Librarianship, a freely available e-journal whose Instructions for Authors page does not discuss copyright at all; however, ALA does not list this journal on its American Library Association Periodicals page, which "is a list of the various newsletters, magazines, and journals published within the American Library Association, including those which are only available over the Internet."

Summary

This brief investigation has not attempted to determine whether the divisions of ALA more vigorously support and enact open access principles than the parent organization. The Association of College and Research Libraries is certainly known for its general support (e.g., see ACRL Taking Action, Principles and Strategies for the Reform of Scholarly Communication, and Scholarly Communication Toolkit).

A user starting at the ALA home page would be hard pressed to find any information that suggests that ALA is an advocate of open access without using the search function. Yet, there are a number of pages on the site that deal with it, although many are ACRL Web site pages or serial articles.

ALA’s mission statements and plans reveal no explicit support for open access; however, ALA belongs to at least two organizations that support it: (1) the Alliance for Taxpayer Access (ATA), and (2) the SPARC Open Access Working Group.

Out of ten major journals that it publishes, ALA only publishes one open access journal: School Library Media Research. Two journals (College & Research Libraries and Information Technology and Libraries) have a clear six-month embargo policy. Two more (Public Libraries and RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage) may also be operating under an embargo policy. One provides free access to a subset of older back volumes (Library Resources & Technical Services). The rest only provide table of contents information, some with abstracts, or, in one case, no information at all.

Five journals (College & Research Libraries, Information Technology and Libraries, Library Resources & Technical Services, Reference & User Services Quarterly, School Library Media Research) clearly offer authors the option of the Copyright License Agreement, which fully supports all types of self-archiving. For the rest, it is unclear from the journals’ Web sites if this option is permitted, and only the Copyright Assignment Agreement may be available, which only permits self-archiving on the author’s Web site or on internal systems at the author’s place of employment (presumably including an access-restricted part of an institutional repository). It may be the case that all ALA journals permit the use of the Copyright License Agreement; however, this is impossible to determine from some of their Web sites, a subset of which have language that appears to indicate otherwise.
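The tallies in this summary can be reproduced from the ten journal entries above. The following is a minimal Python sketch; the status labels ("open access", "embargo", and so on) are shorthand invented for this illustration and are not ALA terminology, and the True/False flag records only whether the entry above says the journal's pages link to the Copyright License Agreement.

```python
from collections import Counter

# (journal, free-access status per the list above, License Agreement clearly offered?)
# Status labels are hypothetical shorthand for this sketch.
JOURNALS = [
    ("Children and Libraries", "no information", False),
    ("College & Research Libraries", "embargo", True),
    ("Information Technology and Libraries", "embargo", True),
    ("Library Administration and Management", "contents only", False),
    ("Library Resources & Technical Services", "older back volumes", True),
    ("Public Libraries", "possible embargo", False),
    ("RBM", "possible embargo", False),
    ("Reference & User Services Quarterly", "contents only", True),
    ("School Library Media Research", "open access", True),
    ("Young Adult Library Services", "contents only", False),
]

# Count journals by access status and list those that clearly offer the license option.
access_counts = Counter(status for _, status, _ in JOURNALS)
license_offered = [name for name, _, offered in JOURNALS if offered]

for status, count in access_counts.most_common():
    print(f"{status}: {count}")
print(f"License Agreement clearly offered: {len(license_offered)} of {len(JOURNALS)}")
```

Encoding the entries as data makes it easy to revise the counts if a journal changes its access policy or adds a link to the license form.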

As a whole, the American Library Association appears to support the open access movement to a limited extent. If this is incorrect and its support is strong, ALA appears to be having difficulty making its commitment visible and "walking the talk."

Open Access to Books: The Case of the Open Access Bibliography

In March 2005, the Association of Research Libraries (ARL) published my book the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals under a Creative Commons Attribution-NonCommercial 2.0 License. At the same time, a PDF version of the book was made freely available at the University of Houston Libraries Web site, and a PDF of the frontmatter, "Preface," and "Key Open Access Concepts" sections of the book was made freely available at the ARL Web site. The complete OAB PDF was moved to my new escholarlypub.com Web site in June, and an HTML version of "Key Open Access Concepts" was made available as well. In February 2006, author and title indexes for the OAB were made available in HTML form, and, in March 2006, the entire OAB was made available in HTML form.

The OAB deals with a topic that is of keen interest to a relatively small segment of the reading public. Moreover, it’s primarily a very detailed bibliography. The question is: Was it worth putting up all of these free digital versions of the book and creating these auxiliary digital materials?

From March through May 2005, there were 29,255 requests for the OAB PDF. From June 2005 through June 2006, there were another 15,272 requests for the OAB PDF; 17,952 requests for chapters or sections of the HTML version of the OAB; 11,610 requests for the HTML version of "Key Open Access Concepts"; 3,183 requests for the author index; and 2,918 requests for the title index. I don’t have use statistics for the ARL PDF of the first few sections of the book. (The June 2005 through June 2006 statistics are from Urchin; when I analyze the log files in analog, they may vary slightly.)

Print runs for scholarly books are notoriously short, often in the hundreds. I suspect most scholarly publishers would be delighted to sell 500 copies of a specialized bibliography, many of which would end up on library shelves. However, by making the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals freely available in digital form, over 44,500 copies of the complete book, over 29,500 chapters (or other book sections), and over 6,100 author or title indexes have been distributed to users worldwide. Thanks to ARL, the OAB has had greater visibility and impact than it would have had under the conventional publishing model.
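As a quick check on the totals above, here is a minimal Python sketch that sums the request counts quoted in this posting. The grouping is an assumption: it treats the "chapters (or other book sections)" figure as the HTML chapter requests plus the "Key Open Access Concepts" requests, which is the combination that matches the rounded number given.

```python
# Request counts quoted in the posting (March 2005 through June 2006).
pdf_requests = [29_255, 15_272]        # complete OAB PDF, two reporting periods
section_requests = [17_952, 11_610]    # HTML chapters/sections, "Key Open Access Concepts"
index_requests = [3_183, 2_918]        # author index, title index

complete_copies = sum(pdf_requests)
chapters_or_sections = sum(section_requests)
indexes = sum(index_requests)

print(f"Complete copies:      {complete_copies:,}")
print(f"Chapters or sections: {chapters_or_sections:,}")
print(f"Indexes:              {indexes:,}")
```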

More on How Can Scholars Retain Copyright Rights?

Peter Suber has made the following comment on Open Access News about "How Can Scholars Retain Copyright Rights?":

This is a good introduction to the options. I’d only make two additions.

  1. Authors needn’t retain full copyright in order to provide OA to their own work. They only need to retain the right of OA archiving—which, BTW, about 70% of journals already give to authors in the copyright transfer agreement.
  2. Charles mentions the author addenda from SPARC and Science Commons, but there’s also one from MIT.

Peter is right on both points; however, my document has a broader rights retention focus than providing OA to scholars’ work, although that is an important aspect of it.

For example, there is a difference between simply making an article available on the Internet and making it available under a Creative Commons Attribution-NonCommercial 2.5 License. The former allows the user to freely read, download, and print the article for personal use. The latter allows the user to make any noncommercial use of the article without permission as long as proper attribution is made, including creating derivative works. So professor X could print professor Y’s article and distribute it in class without permission and without worrying about fair use considerations. (Peter, of course, understands these distinctions, and he is just trying to make sure that authors understand that they don’t have to do anything but sign agreements that grant them appropriate self-archiving rights in order to provide OA to their articles.)

I considered the MIT addendum, but thought it might be too institution-specific. On closer reading, it could be used without alteration.

How Can Scholars Retain Copyright Rights?

Scholars are often exhorted to retain the copyright rights to their journal articles to ensure that they can freely use their own work and to permit others to freely read and use it as well. The question for scholars who are convinced to do so is: "How do I do that?"

The first thing to understand is that copyright is not one right. Rather, it is a bundle of rights that can be individually granted or withheld. The second thing to understand is that rights can either be granted exclusively to one party or nonexclusively to multiple parties.

What are these rights? Here’s what the U.S. Copyright Office says:

  • To reproduce the work in copies or phonorecords;

  • To prepare derivative works based upon the work;

  • To distribute copies or phonorecords of the work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

  • To perform the work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works;

  • To display the copyrighted work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work; and

  • In the case of sound recordings, to perform the work publicly by means of a digital audio transmission.

A legal document, typically called a copyright transfer agreement, governs the copyright arrangements between you and the publisher and determines what rights you retain and what rights you transfer or grant to the publisher. The publisher may offer a single standard agreement or may have more than one agreement.

Whereas the publisher has had its agreement(s) written by copyright lawyers, you are not likely to be a copyright lawyer. This puts you at a disadvantage in terms of understanding, modifying, or replacing the publisher’s agreement. Therefore, it is very helpful to have documents written by copyright lawyers that you can use to modify or replace the publisher’s agreement, even if the organization providing such documents does so under a disclaimer that it is not providing "legal advice."

Ordered by increasing level of difficulty in getting publisher acceptance, here are the basic strategies for dealing with copyright transfer agreements:

  • If the publisher has multiple agreements, choose the one that has the author assigning and/or granting specific rights to the publisher (e.g., ALA Copyright License Agreement). Don’t choose the agreement where the author assigns, conveys, grants, or transfers all rights, copyright interest, copyright ownership, and/or title exclusively to the publisher (e.g., ALA Copyright Assignment Agreement).
  • If the publisher has a single agreement that assigns, conveys, grants, or transfers all rights, copyright interest, copyright ownership, and/or title exclusively to the publisher:

Of course, other strategies are possible. For example, you could use another type of open content license instead of the Science Commons Publication Agreement and Copyright License. However, you might want to keep it simple to start.

For more information on copyright transfer agreements, see Copyright Resources for Authors and Scholars Have Lost Control of the Process.

For a directory of publisher copyright and self-archiving policies, see Publisher Copyright Policies & Self-Archiving.

By the way, DigitalKoans doesn’t provide legal advice and the author is not a lawyer.

Open Access: Key Strategic, Technical and Economic Aspects Available on 7/17/06

Neil Jacobs has announced on several mailing lists that Open Access: Key Strategic, Technical and Economic Aspects, which he edited, will be available on July 17th. As you can see from the book’s contents below, the book’s contributors include many key figures in the open access movement. I’ve seen an early draft, and I believe this will be a very important book.

The book itself is not OA, but contributors retained their copyrights and they can individually make their papers available on the Internet. My contribution ("What Is Open Access?") is available in both HTML and PDF formats, and it is under a Creative Commons Attribution-NonCommercial 2.5 License.

So far, the US Amazon doesn’t list the book, but it is available from Amazon.co.uk in both paperback and hardback form.

The papers in the book are listed below.

  • "Overview of Scholarly Communication" by Alma Swan
  • "What Is Open Access?" by Charles W. Bailey, Jr.
  • "Open Access: A Symptom and a Promise" by Jean-Claude Guédon
  • "Economic Costs of Toll Access" by Andrew Odlyzko
  • "The Impact Loss to Authors and Research" by Michael Kurtz and Tim Brody
  • "The Technology of Open Access" by Chris Awre
  • "The Culture of Open Access: Researchers’ Views and Responses" by Alma Swan
  • "Opening Access By Overcoming Zeno’s Paralysis" by Stevan Harnad
  • "Researchers and Institutional Repositories" by Arthur Sale
  • "Open Access to the Research Literature: A Funder’s Perspective" by Robert Terry and Robert Kiley
  • "Business Models in Open Access Publishing" by Matthew Cockerill
  • "Learned Society Business Models and Open Access" by Mary Waltham
  • "Open All Hours? Institutional Models for Open Access" by Colin Steele
  • "DARE Also Means Dare: Institutional Repository Status in the Netherlands as of Early 2006" by Leo Waaijers
  • "Open Access in the USA" by Peter Suber
  • "Towards Open Access to UK Research" by Frederick J. Friend
  • "Open Access in Australia" by John Shipp
  • "Open Access in India" by D. K. Sahu and Ramesh C. Parmar
  • "Open Computation: Beyond Human Reader-Centric Views of Scholarly Literatures" by Clifford Lynch
  • "The Open Research Web" by Nigel Shadbolt, Tim Brody, Les Carr, and Stevan Harnad

Postscript:

The book is now available from the US Amazon in paperback and hardcover form.

Version 62, Scholarly Electronic Publishing Bibliography

Version 62 of the Scholarly Electronic Publishing Bibliography is now available. This selective bibliography presents over 2,680 articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet.

The Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals, by the same author, provides much more in-depth coverage of the open access movement and related topics (e.g., disciplinary archives, e-prints, institutional repositories, open access journals, and the Open Archives Initiative) than SEPB does.

The "Open Access Webliography" (with Ho) complements the OAB, providing access to a number of Websites related to open access topics.

Changes in This Version

The bibliography has the following sections (revised sections are marked with an asterisk):

Table of Contents

1 Economic Issues
2 Electronic Books and Texts
2.1 Case Studies and History*
2.2 General Works*
2.3 Library Issues
3 Electronic Serials
3.1 Case Studies and History*
3.2 Critiques
3.3 Electronic Distribution of Printed Journals*
3.4 General Works
3.5 Library Issues*
3.6 Research*
4 General Works*
5 Legal Issues
5.1 Intellectual Property Rights*
5.2 License Agreements
5.3 Other Legal Issues
6 Library Issues
6.1 Cataloging, Identifiers, Linking, and Metadata*
6.2 Digital Libraries*
6.3 General Works*
6.4 Information Integrity and Preservation*
7 New Publishing Models*
8 Publisher Issues*
8.1 Digital Rights Management*
9 Repositories, E-Prints, and OAI*
Appendix A. Related Bibliographies
Appendix B. About the Author
Appendix C. SEPB Use Statistics

Scholarly Electronic Publishing Resources includes the following sections:

Cataloging, Identifiers, Linking, and Metadata
Digital Libraries
Electronic Books and Texts
Electronic Serials*
General Electronic Publishing*
Images*
Legal
Preservation
Publishers
Repositories, E-Prints, and OAI*
SGML and Related Standards

Further Information about SEPB

The HTML version of SEPB is designed for interactive use. Each major section is a separate file. There are links to sources that are freely available on the Internet. It can be searched using Boolean operators.

The HTML document includes three sections not found in the Acrobat file:

  1. Scholarly Electronic Publishing Weblog (biweekly list of new resources; also available by mailing list and RSS feed)
  2. Scholarly Electronic Publishing Resources (directory of over 270 related Web sites)
  3. Archive (prior versions of the bibliography)

The Acrobat file is designed for printing. The printed bibliography is over 220 pages long. The Acrobat file is over 580 KB.

Related Article

An article about the bibliography has been published in The Journal of Electronic Publishing.

DigitalKoans Is One

DigitalKoans and the digital-scholarship.com domain (which is actually the same as escholarlypub.com) are one year old today.

Of course, my blogging career began on June 07, 2001, when the Scholarly Electronic Publishing Weblog was established. However, SEPW was designed as a supplement to the Scholarly Electronic Publishing Bibliography, whereas DigitalKoans was designed as a stand-alone publication, albeit one that is inevitably interwoven with my other publication efforts.

So, how did we do in year one? According to Urchin, the digital-scholarship.com domain, which includes DigitalKoans and my other digital works (excluding SEPB, SEPW, and SEPR), has had 251,033 visitor sessions, with an average of 686 sessions per day. There have been 540,054 page requests (pages typically being content-bearing HTML or PDF files), of which 377,640 were for DigitalKoans.
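The figures above are internally consistent with roughly one year of operation, as a minimal Python sketch can show. It uses only the numbers quoted in this posting; the variable names are illustrative, and the "implied days" value is derived here rather than reported by Urchin.

```python
# Preliminary Urchin figures quoted in the posting.
visitor_sessions = 251_033
sessions_per_day = 686
page_requests = 540_054
digitalkoans_requests = 377_640

# Number of days implied by the session total and the daily average.
implied_days = visitor_sessions / sessions_per_day

# Share of all page requests that were for DigitalKoans.
digitalkoans_share = digitalkoans_requests / page_requests

print(f"Implied reporting period: {implied_days:.1f} days")
print(f"DigitalKoans share of page requests: {digitalkoans_share:.1%}")
```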

These requests came from 131 top-level Internet domains (e.g., .com). In terms of domains representing identifiable countries, the top ten were: Canada, Italy, United Kingdom, Germany, Australia, France, Japan, Netherlands, Sweden, and Belgium. (Interestingly, India and China came in at 11th and 12th place.) Of course, most U.S. users are in the .com, .edu, .net, and .org domains, which dominated the rankings as a whole.

As I’ve noted previously, I use Urchin for first-cut use statistics, and analog for final ones, so I consider these figures to be preliminary.

Hear Luminaries Interviewed at Recent CNI Task Force Meetings

Matt Pasiewicz and CNI have made available digital audio interviews with a number of prominent attendees at the CNI Fall (2005) and Spring (2006) Task Force meetings.

A Simple Search Hit Comparison for Google Scholar, OAIster, and Windows Live Academic Search

Given that Windows Live Academic Search’s content is limited to computer science, electrical engineering, and physics journals and conferences, a direct comparison of it with other search engines is somewhat difficult.

Although its limitations should be clearly recognized, the following simple experiment in comparing the number of hits for Google Scholar, OAIster (a search engine that indexes open access literature, such as e-prints), and Windows Live Academic Search may help to shed some light on their differences. (Note that OAIster does not typically include content directly provided by commercial publishers, although it does include e-prints for a large number of articles published in academic journals.)

The search is for: "OAI-PMH" (entered without quotes).

"OAI-PMH" being, of course, the Open Archives Initiative Protocol for Metadata Harvesting. This is a highly specific search, where many, but not all, hits should fall within the subjects covered by Windows Live Academic Search. A major area that might not be covered is library and information science literature.

To get a better feel for the baseline published literature about OAI-PMH, let’s first do some searching for that term in specialized commercial databases.

  • ACM Digital Library: 51 hits.
  • Engineering Village 2: 66 hits.
  • Information Science & Technology Abstracts: 36 hits.
  • Library Literature & Information Science Index/Full Text: 13 hits.

Now, the search engines in question:

  • Google Scholar: 542 hits.
  • OAIster: 180 hits.
  • Windows Live Academic Search: 74 hits.

So, what have we learned? Windows Live Academic Search has a somewhat higher number of hits than the selected commercial databases and, if adjusted downward for publisher versions only (see below), is on the high end. This suggests that it covers the toll-based published literature very well. However, it has a significantly lower number of hits than OAIster and Google Scholar, suggesting that its coverage of open access literature may be weaker than Google Scholar’s and is quite likely weaker than OAIster’s.

Of the 74 hits for the "OAI-PMH" search in Windows Live Academic Search, 54 (73%) were "published versions" (i.e., publisher-supplied works); 20 (27%) were not (i.e., e-prints). Scanning the "Results by Institution" sidebar, it appears that 100% of OAIster’s 180 hits were from open access sources; I didn’t check them all. I didn’t try to break down the 542-hit Google Scholar search result, which has a mix of toll-based and open access materials, although it would be quite interesting to do so. It should be clear that a sample of one search term is a very crude measure (and that this posting won’t grace the pages of JASIST anytime soon).
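The percentages above are simple arithmetic on the reported hit counts. Here is a minimal sketch of that calculation; the variable and function names are my own, and the counts are just the ones reported in this posting.

```python
# Hit counts for the "OAI-PMH" search, as reported in this posting.
hits = {
    "Google Scholar": 542,
    "OAIster": 180,
    "Windows Live Academic Search": 74,
}

# Breakdown of the Windows Live Academic Search result set.
wlas_total = hits["Windows Live Academic Search"]
published_versions = 54
eprints = wlas_total - published_versions


def share(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


published_pct = share(published_versions, wlas_total)
eprint_pct = share(eprints, wlas_total)

print(f"Published versions: {published_versions} ({published_pct}%)")
print(f"E-prints: {eprints} ({eprint_pct}%)")
```

The same share() helper could be reused if someone did break down the Google Scholar result set by hand.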

Of course, this simple experiment tells us nothing about the presence of duplicate entries for the same work in search result sets, which could be important for a meaningful open access comparison. Consider, for example, this group of 11 hits for "A Scalable Architecture for Harvest-Based Digital Libraries—The ODU/Southampton Experiments" from the Google Scholar "OAI-PMH" search.

Nor does it tell us the number of items that are not journal articles (or e-prints for them) or conference papers.

An apples-to-apples comparison would adjust for useless duplicates and non-journal/conference literature. (But, of course, it would be quite useful if Windows Live Academic Search had non-journal/conference literature such as technical reports in it.)
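For anyone inclined to try the duplicate adjustment, one rough approach is to collapse hits whose titles match after normalization. The sketch below is only an illustration under my own assumptions: matching on case, punctuation, and whitespace is a simplification (not a method used by any of these search engines), the title variants are made up to show the idea, and the last title is invented.

```python
import re


def normalize_title(title):
    """Lowercase a title and reduce punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def count_distinct_works(titles):
    """Count the hits that remain after collapsing titles that normalize alike."""
    return len({normalize_title(title) for title in titles})


# Three illustrative variants of the title cited above, plus one invented title.
sample_hits = [
    "A Scalable Architecture for Harvest-Based Digital Libraries\u2014The ODU/Southampton Experiments",
    "A scalable architecture for harvest-based digital libraries: the ODU/Southampton experiments",
    "A Scalable Architecture for Harvest-Based Digital Libraries - The ODU/Southampton Experiments",
    "A Hypothetical Second Paper about OAI-PMH",  # invented for illustration
]

distinct = count_distinct_works(sample_hits)
print(f"{len(sample_hits)} hits, {distinct} distinct works")
```

Title matching alone would miss retitled e-prints and could wrongly merge different works that share a title, so a real analysis would need to check authors and dates as well.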

However, given the small hit sets, it would not be impossible for someone else to do a deeper analysis on the duplicate entry question and some other tractable questions.

Windows Live Academic Search Is Up

The beta version of Windows Live Academic Search is now available: http://academic.live.com/. It appears to me that the system is under a very heavy load, so you may want to wait a bit before giving it a test drive.

The Windows Live Academic Search development team now has a Weblog. The official press release is now available. A list of participants in Microsoft’s MSN Search Champs V4, some of whom gave Microsoft detailed feedback about Windows Live Academic Search, is also available.

Windows Live Academic Search Overview

The home page provides a search box, brief overview, an explanation of search results, and an FAQ.

The system’s indexed content is limited to Computer Science, Electrical Engineering, and Physics journals and conferences, including: "6 million records from approximately 4300 journals and 2000 conferences." Here is a list of works indexed. Relevance is determined by: (1) "quality of match of the search term with the content of the paper", and (2) "authoritativeness of the paper." Citation count is not being used in the ranking algorithm at this time.

Interface

The interface has the following key features:

  • Search box: My understanding is that all MSN search commands work, but I have not tested this.
  • Slider bar: Expands or restricts the amount of information shown for each hit in the search results (left side of screen).
  • Search results:
    • Author and title information in hits are linked (blue links).
    • Other hit links include search the Web for the item, CiteSeer citations (if available), show abstract, and hide abstract (grey links).
  • Sort by: You can sort search results by relevance, date (oldest), date (newest), author, journal, and conference. The last three sorts provide a header that precedes the listed search results: for example, John Doe (2).
  • "+add to Live.com": Adds the search to your Windows Live page. Three clickable buttons appear above the stored search on that page: Web, News, Feeds. When one is clicked, the search is repeated in the appropriate information source (e.g. RSS feeds).
  • Preview pane: The right side of the screen is used to display the fielded abstract, BibTeX formatted abstract, or EndNote formatted abstract.

Highlights from the Home Page FAQ

  • More content?: "We are not ready to provide a detailed timeline on when we will have a comprehensive index by subject."
  • OpenURL: "You will be able to click on the link to your library Open URL resolver to determine the availability of full text access."
  • Preferences: "While the version of Academic search does not have a preference page, future versions will have that functionality."

Highlights from Windows Live Academic Information: Librarians

  • OpenURL: "If Academic search can identify that a user is affiliated with your institution, appropriate search results will be accompanied by a link to your OpenURL resolver vendor. We request that you work with your link resolver company and give them permission to provide the necessary information about your institution to us."
  • RSS feeds for searches: "When a new article related to that search is posted, they [researchers] are alerted instantly via an RSS feed."
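To make the OpenURL item a bit more concrete, here is a minimal sketch of what a link to a library resolver can look like, using key/value pairs from the OpenURL 1.0 (Z39.88-2004) journal format. I don't know how Academic search actually builds its links; the resolver address and the citation details below are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; each library has its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(base, atitle, jtitle, year, volume, spage):
    """Build an OpenURL 1.0 key/value link for a journal article."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": atitle,   # article title
        "rft.jtitle": jtitle,   # journal title
        "rft.date": year,
        "rft.volume": volume,
        "rft.spage": spage,     # starting page
    }
    return f"{base}?{urlencode(params)}"


# Invented citation, for illustration only.
link = build_openurl(
    RESOLVER_BASE,
    atitle="An Example Article",
    jtitle="Journal of Examples",
    year="2006",
    volume="1",
    spage="10",
)
print(link)
```

The resolver, not the search engine, decides what the user is entitled to see, which is why the search engine only needs to know each institution's resolver address.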

Highlights from Windows Live Academic Information: Publishers

  • Participation: "Talk to Crossref about the Crossref/Academic search partnership to receive information on the program and instructions on how to initiate participation."
  • Search Results: "Therefore, search results from journals that are indexed from publishers and are published articles will always be marked ‘Published Version’. This ensures that users know which result is the official version. If there are many instances of the same article (from other sources such as a web-crawl), we will always link the search result heading to the version on the publisher’s site."
  • Abstract information: "We require that non-subscribers at least see an abstract of the paper when they land on your site from our search results page." However, it also says:

    "Academic search provides for three levels of display in the preview pane:

    • Full abstract
    • First 140 characters from abstract
    • Nothing from the abstract

    Publishers can choose any of these options for their content."

The Caravan Project: One Book, Five Distribution Formats

BusinessWeek reports that Peter Osnos, founder and Editor-at-Large of Public Affairs, is working with Borders, selected independent bookstores, six nonprofit publishers, and Ingram Industries to experiment with a new book publishing model. The idea is this: publish the book in five formats (audio, chapter, hardcover, digital, and print-on-demand) and let customers decide which one(s) they want. Larger publishers have reservations about the Caravan Project’s experiment. The article states that "going this far this fast unnerves publishers," and it quotes Al Greco (of the Book Industry Study Group): "they are terrified of being Napsterized."

Source: Lowry, Tom. "Getting Out Of A Bind." BusinessWeek, 10 April 2006, 79-80.

Microsoft’s Windows Live Academic Search

Microsoft will be releasing Windows Live Academic Search shortly (I was recently told Wednesday; the blog buzz is saying tomorrow).

As is typical with such software projects, the team is doing some last minute tweaking before release. So, I won’t try to describe the system in any detail at this point, except to say that it integrates access to published articles with e-prints and other open access materials, it provides a reference export capability, there’s a cool optional two-pane view (short bibliographic information on the left; full bibliographic information and abstract on the right), and it supports search "macros" (user-written search programs).

What I will say is this: Microsoft made a real effort to get significant, honest input from the librarian and publisher communities during the development process. I know, because, now that the nondisclosure agreement has been lifted, I can say that I was one of the librarians who provided such input on an unpaid basis. I was very impressed by how carefully the development team listened to what we had to say, how sharp and energetic they were, how they really got the Web 2.0 concept, and how deeply committed they were to creating the best product possible. Having read Microserfs, I had a very different mental picture of Microsoft than the reality I encountered.

Needless to say, there were lively exchanges of views between librarians and publishers when open access issues were touched upon. My impression is that the team listened to both sides and tried to find the happy middle ground.

When it’s released, Windows Live Academic Search won’t be the perfect answer to your open access search engine dreams (what system is?), and Microsoft knows that there are missing pieces. But I think it will give Google Scholar a run for its money. I, for one, heartily welcome it, and I think it’s a good base to build upon, especially if Microsoft continues to solicit and seriously consider candid feedback from the library and publisher communities (and it appears that it will).

Open Source Software for Publishing E-Journals

Want to publish an open access journal, but don’t want to license a commercial journal management system, develop your own system, or do it all by tedious HTML hand-coding? Here’s summary information about two existing open source e-journal management systems (and one emerging system) that may do the trick.

HyperJournal

  • "HyperJournal is a software application that facilitates the administration of academic journals on the Web. Conceived for researchers in the Humanities and designed according to an intuitive and elegant layout, it permits the installation, personalization, and administration of a dedicated Web site at extremely low cost and without the need for special IT-competence. HyperJournal can be used not only to establish an online version of an existing paper periodical, but also to create an entirely new, solely electronic journal."
  • Overview
  • Documentation
  • Download

Open Journal Systems, Public Knowledge Project

  • "Open Journal Systems (OJS) is a journal management and publishing system that has been developed by the Public Knowledge Project through its federally funded efforts to expand and improve access to research. OJS assists with every stage of the refereed publishing process, from submissions through to online publication and indexing. Through its management systems, its finely grained indexing of research, and the context it provides for research, OJS seeks to improve both the scholarly and public quality of referred research."
  • Open Journal Systems (Overview)
  • FAQ
  • OJS Technical Reference
  • Download

DPubS (Digital Publishing System), Cornell University Library (In development)

  • "DPubS’ ground-breaking software system will enable publishers to cost-effectively organize, deliver, present and publish scholarly journals, monographs, conference proceedings, and other common and evolving means of academic discourse."
  • About DPubS
  • FAQ

Postscript: Peter Suber suggests adding several other software packages, including:

  1. ePublishing Toolkit
  2. SciX Open Publishing Services (SOPS)

Scholarly Communication Web Sites and Weblogs at ARL Libraries (Version 2)

This posting updates and considerably expands my earlier "Scholarly Communication Web Sites at ARL Libraries" posting.

It presents a list of scholarly communication Web sites and Weblogs at the academic member libraries of the Association of Research Libraries. Web sites and Weblogs were identified using separate Google "site:" searches for the exact phrases "scholarly communication" and "open access." Search results were then scanned to identify Web sites or Weblogs that appeared to be intended as the main library’s primary means of communicating to the university community about scholarly communication and/or open access issues. Conferences, presentations, newsletter articles, symposiums, and similar materials were excluded, as were Web sites or Weblogs at branch libraries. Searching was limited to the first few pages of search results.
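For those who want to repeat the searches, the queries are easy to generate. This is a small sketch of the query strings only; the domain shown is a placeholder, not an actual ARL library site, and the function name is my own.

```python
# The two exact phrases used for the searches described above.
PHRASES = ["scholarly communication", "open access"]


def build_site_queries(domain, phrases=PHRASES):
    """Return one exact-phrase "site:" query per phrase for the given domain."""
    return [f'site:{domain} "{phrase}"' for phrase in phrases]


# Placeholder domain for illustration.
for query in build_site_queries("library.example.edu"):
    print(query)
```

Each query is run separately, and the first few pages of results are then scanned by hand as described above.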

Additions and corrections are welcome. Use the "Leave a Comment" function for this.

All Sections and Subsections of the Open Access Bibliography Now Linked

There is now a link to each section and subsection of the HTML version of the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals.

For example:

4 Open Access Journals

4.1 General Works
4.2 Economic Issues
4.2.1 General Works
4.2.2 BMJ Rapid Responses about "Author Pays" May Be the New Science Publishing Model
4.3 Open Access Journal Change Agents
4.3.1 SPARC
4.4 Open Access Journal Publishers and Distributors
4.4.1 BioMed Central
4.4.2 Public Library of Science
4.4.3 PubMed Central
4.4.3.1 General Works
4.4.3.2 Science Magazine dEbate on "Building a GenBank of the Published Literature"
4.4.3.3 Science Magazine dEbate on "Is a Government Archive the Best Option?"
4.4.3.4 Science Magazine dEbate on "Just a Minute, Please"
4.4.3.5 Other
4.5 Specific Open Access Journals
4.5.1 Journals in the Directory of Open Access Journals
4.5.2 Pioneering Free E-Journals Not in the DOAJ
4.5.3 Other
4.6 Research Studies

The table of contents in the home page of the bibliography has a complete set of links for all sections and subsections of the document.

The Web page for each major section of the bibliography has links to the subsections (if present) at the start of the page.

HTML Version of the Open Access Bibliography

An HTML version of the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals (OAB) is now available.

The HTML version of the book was created from the final draft using a complex set of digital transformations. Consequently, there may be minor variations between it and the print and Acrobat versions, which are the definitive versions of the book.

The OAB provides an overview of open access concepts, and it presents over 1,300 selected English-language books, conference papers (including some digital video presentations), debates, editorials, e-prints, journal and magazine articles, news articles, technical reports, and other printed and electronic sources that are useful in understanding the open access movement’s efforts to provide free access to and unfettered use of scholarly literature. Most sources have been published between 1999 and August 31, 2004; however, a limited number of key sources published prior to 1999 are also included. Where possible, links are provided to sources that are freely available on the Internet (approximately 78 percent of the bibliography’s references have such links).

dLIST E-Print Archive Adds Use Statistics

Authors who deposit e-prints in dLIST (Digital Library of Information Science and Technology) can now see use statistics for their works (archive users can see use statistics as well). For example, at the record for the "Indian Digital Library in Engineering Science and Technology (INDEST) Consortium: Consortia-Based Subscription to Electronic Resources for Technical Education System in India: A Government of India Initiative," you would click on "View statistics for this eprint" to get the use statistics for this work. You can view use statistics for the past four weeks, this year, last year, or all years.

Archive-wide use statistics are also available from either an e-print record or the dLIST Statistics page. From either one, you can rank all e-prints by use for the same time periods as individual e-prints and show overall archive use by year/month or country.

Disclosure: I am now the Scholarly Communication subject editor for dLIST.

Version 61, Scholarly Electronic Publishing Bibliography

Version 61 of the Scholarly Electronic Publishing Bibliography is now available. This selective bibliography presents over 2,610 articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet.

The Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals, by the same author, provides much more in-depth coverage of the open access movement and related topics (e.g., disciplinary archives, e-prints, institutional repositories, open access journals, and the Open Archives Initiative) than SEPB does.

The "Open Access Webliography" (with Ho) complements the OAB, providing access to a number of Websites related to open access topics.

Changes in This Version

The bibliography has the following sections (revised sections are marked with an asterisk):

Table of Contents

1 Economic Issues
2 Electronic Books and Texts
2.1 Case Studies and History*
2.2 General Works*
2.3 Library Issues
3 Electronic Serials
3.1 Case Studies and History*
3.2 Critiques
3.3 Electronic Distribution of Printed Journals*
3.4 General Works*
3.5 Library Issues*
3.6 Research*
4 General Works*
5 Legal Issues
5.1 Intellectual Property Rights*
5.2 License Agreements*
5.3 Other Legal Issues
6 Library Issues
6.1 Cataloging, Identifiers, Linking, and Metadata*
6.2 Digital Libraries*
6.3 General Works*
6.4 Information Integrity and Preservation*
7 New Publishing Models*
8 Publisher Issues*
8.1 Digital Rights Management*
9 Repositories, E-Prints, and OAI*
Appendix A. Related Bibliographies
Appendix B. About the Author*
Appendix C. SEPB Use Statistics*

Scholarly Electronic Publishing Resources includes the following sections:

Cataloging, Identifiers, Linking, and Metadata*
Digital Libraries*
Electronic Books and Texts*
Electronic Serials*
General Electronic Publishing*
Images*
Legal*
Preservation*
Publishers
Repositories, E-Prints, and OAI*
SGML and Related Standards*

Further Information about SEPB

The HTML version of SEPB is designed for interactive use. Each major section is a separate file. There are links to sources that are freely available on the Internet. It can be searched using Boolean operators.

The HTML document includes three sections not found in the Acrobat file:

  1. Scholarly Electronic Publishing Weblog (biweekly list of new resources; also available by mailing list and RSS feed)
  2. Scholarly Electronic Publishing Resources (directory of over 270 related Web sites)
  3. Archive (prior versions of the bibliography)

The Acrobat file is designed for printing. The printed bibliography is over 215 pages long. The Acrobat file is over 570 KB.

Related Article

An article about the bibliography has been published in The Journal of Electronic Publishing.

HTML Version of "What Is Open Access?"

An HTML version of my "What Is Open Access?" preprint is now available. This version includes additional links in the body of the document that make it easier to quickly access related information about OA concepts, documents, or systems. While it makes many footnote links (as well as some new links) available in the body of the document, it does not attempt to replicate every footnote link.

This paper presents a more nuanced, contemporary view of open access than my "Key Open Access Concepts" excerpt from the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals; however, it had to be very compact to meet the publisher’s needs, and it omits some topics discussed in the earlier document.

Those wanting a more in-depth recent treatment might want to try the first half of my "Open Access and Libraries" preprint, which covers much of this material more fully as a preliminary to discussing the relationship between open access and library functions and operations. However, the "What Is Open Access?" paper reflects some changes in my thinking about OA not found in "Open Access and Libraries."

A PDF version of "What Is Open Access?" is also available, which is more suitable for printing and reading offline.

"What Is Open Access?" will appear in: Jacobs, Neil, ed. Open Access: Key Strategic, Technical and Economic Aspects. Oxford: Chandos Publishing, 2006. It is under a Creative Commons Attribution-NonCommercial 2.5 License.

Open Access Bibliography Author and Title Indexes Are Now Available

Author and title indexes for the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals are now available.

These indexes, which include complete references, were initially generated in EndNote, then refined through a lengthy production process using several text editing programs to produce the final HTML files.

Scholarly Electronic Publishing Bibliography 2005 Use Statistics

There were 1,327,703 successful SEPB file requests in 2005, of which 1,034,745 were page requests. 115,029 host computers were served in 160 domains (excluding unknown domains). From October 1996 through December 2005, there have been 5,564,636 successful requests for SEPB files. See the details below.

SEPB Use Statistics

Requests By Year (October 1996-December 2005)

Year Number of File Requests Average Daily File Requests Number of Page Requests Average Daily Page Requests
1996 (October to December) 19,801 281 14,616 207
1997 156,139 428 109,638 300
1998 230,143 630 150,422 412
1999 254,411 697 170,517 467
2000 317,220 867 215,113 588
2001 405,037 1,109 280,547 768
2002 622,311 1,705 393,251 1,077
2003 1,023,619 2,827 634,607 1,752
2004 1,208,252 3,301 796,953 2,177
2005 1,327,703 3,637 1,034,745 2,834

Total File Requests (October 1996-December 2005)

Year Number of File Requests
1996-2005 5,564,636
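The cumulative total can be checked by summing the yearly file request figures from the "Requests By Year" table. A quick sketch follows (the variable names are mine; the page-request share for 2005 is an extra calculation that the tables don't report directly).

```python
# Yearly SEPB file requests, copied from the "Requests By Year" table.
file_requests_by_year = {
    1996: 19_801,  # October to December only
    1997: 156_139,
    1998: 230_143,
    1999: 254_411,
    2000: 317_220,
    2001: 405_037,
    2002: 622_311,
    2003: 1_023_619,
    2004: 1_208_252,
    2005: 1_327_703,
}

total_file_requests = sum(file_requests_by_year.values())

# Share of 2005 file requests that were page requests.
page_requests_2005 = 1_034_745
page_share_2005 = round(100 * page_requests_2005 / file_requests_by_year[2005], 1)

print(f"Total file requests, 1996-2005: {total_file_requests:,}")
print(f"2005 page requests as a share of file requests: {page_share_2005}%")
```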

Number of Host Computers Served (October 1996-December 2005)

Year Distinct Hosts Served
1996 (October to December) 4,276
1997 29,160
1998 39,145
1999 43,114
2000 51,809
2001 68,391
2002 94,464
2003 117,777
2004 128,218
2005 115,029

"Open Access and Libraries" Preprint

A preprint of my forthcoming book chapter "Open Access and Libraries" is now available.

The preprint takes an in-depth look at the open access movement with special attention to the perceived meaning of the term “open access” within it, the use of Creative Commons Licenses, and real-world access distinctions between different types of open access materials. After a brief consideration of some major general benefits of open access, it examines OA’s benefits for libraries and discusses a number of ways that libraries can potentially support the movement, with a consideration of funding issues.

It will appear in: Jacobs, Mark, ed. Electronic Resources Librarians: The Human Element of the Digital Information Age. Binghamton, NY: Haworth Press, 2006.

Postscript: A new preprint is available. I have added more content specific to the impact of OA on electronic resources librarians’ jobs and an appendix on the Creative Commons. Also, I have added another way that OA can save libraries money. I’ve changed the above link to the new preprint; the old one is still available; however, I would recommend reading the new one instead.

Post-Postscript: Having two versions of the preprint available has caused some confusion, so I have taken down the earlier version.

Open Access Bibliography and The Access Principle Discount at Amazon

Amazon is offering the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals and John Willinsky’s insightful The Access Principle: The Case for Open Access to Research and Scholarship together for a discounted price of $68.07 (vs. the normal $79.95). See the OAB Amazon record for the link. (Note: By my request, I do not profit from sales of the print version of the OAB; all proceeds go to ARL to subsidize the print version.)