Open Access to Books: The Case of the Open Access Bibliography

In March 2005, the Association of Research Libraries (ARL) published my book, the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals, under a Creative Commons Attribution-NonCommercial 2.0 License. At the same time, a PDF version of the book was made freely available at the University of Houston Libraries Web site, and a PDF of the front matter, "Preface," and "Key Open Access Concepts" sections of the book was made freely available at the ARL Web site. The complete OAB PDF was moved to my new escholarlypub.com Web site in June 2005, and an HTML version of "Key Open Access Concepts" was made available as well. In February 2006, author and title indexes for the OAB were made available in HTML form, and, in March 2006, the entire OAB was made available in HTML form.

The OAB deals with a topic that is of keen interest to a relatively small segment of the reading public. Moreover, it’s primarily a very detailed bibliography. The question is: Was it worth putting up all of these free digital versions of the book and creating these auxiliary digital materials?

From March through May 2005, there were 29,255 requests for the OAB PDF. From June 2005 through June 2006, there were another 15,272 requests for the OAB PDF; 17,952 requests for chapters or sections of the HTML version of the OAB; 11,610 requests for the HTML version of "Key Open Access Concepts"; 3,183 requests for the author index; and 2,918 requests for the title index. I don’t have use statistics for the ARL PDF of the first few sections of the book. (The June 2005 through June 2006 statistics are from Urchin; when I analyze the log files in Analog, the numbers may vary slightly.)

Print runs for scholarly books are notoriously short, often in the hundreds. I suspect most scholarly publishers would be delighted to sell 500 copies of a specialized bibliography, many of which would end up on library shelves. However, because the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals was made freely available in digital form, over 44,500 copies of the complete book, over 29,500 chapters (or other book sections), and over 6,100 author or title indexes have been distributed to users worldwide. Thanks to ARL, the OAB has had greater visibility and impact than it would have had under the conventional publishing model.
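As a quick check on those totals, the request counts quoted in this posting can be tallied in a few lines of Python. The variable names are only illustrative, and grouping the HTML "Key Open Access Concepts" requests with the HTML chapter requests is my assumption about how the "chapters (or other book sections)" figure is formed.

```python
# Request counts as quoted in this posting (variable names are illustrative).
pdf_mar_may_2005 = 29_255
pdf_jun_2005_jun_2006 = 15_272
html_chapters = 17_952
html_key_concepts = 11_610
author_index = 3_183
title_index = 2_918

# Complete-book PDF requests across both reporting periods.
complete_book = pdf_mar_may_2005 + pdf_jun_2005_jun_2006

# Assumption: "chapters (or other book sections)" combines both HTML counts.
chapters_or_sections = html_chapters + html_key_concepts

# Author and title index requests.
indexes = author_index + title_index

print(complete_book, chapters_or_sections, indexes)
```

Each sum lands just above the rounded figure given in the text.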

More on How Can Scholars Retain Copyright Rights?

Peter Suber has made the following comment on Open Access News about "How Can Scholars Retain Copyright Rights?":

This is a good introduction to the options. I’d only make two additions.

  1. Authors needn’t retain full copyright in order to provide OA to their own work. They only need to retain the right of OA archiving—which, BTW, about 70% of journals already give to authors in the copyright transfer agreement.
  2. Charles mentions the author addenda from SPARC and Science Commons, but there’s also one from MIT.

Peter is right on both points; however, my document has a broader rights retention focus than providing OA to scholars’ work, although that is an important aspect of it.

For example, there is a difference between simply making an article available on the Internet and making it available under a Creative Commons Attribution-NonCommercial 2.5 License. The former allows the user to freely read, download, and print the article for personal use. The latter allows the user to make any noncommercial use of the article without permission as long as proper attribution is made, including creating derivative works. So Professor X could print Professor Y’s article and distribute it in class without permission and without worrying about fair use considerations. (Peter, of course, understands these distinctions, and he is just trying to make sure that authors understand that they don’t have to do anything but sign agreements that grant them appropriate self-archiving rights in order to provide OA to their articles.)

I considered the MIT addendum, but thought it might be too institution-specific. On closer reading, it could be used without alteration.

How Can Scholars Retain Copyright Rights?

Scholars are often exhorted to retain the copyright rights to their journal articles to ensure that they can freely use their own work and to permit others to freely read and use it as well. The question for scholars who are convinced to do so is: "How do I do that?"

The first thing to understand is that copyright is not one right. Rather, it is a bundle of rights that can be individually granted or withheld. The second thing to understand is that rights can either be granted exclusively to one party or nonexclusively to multiple parties.

What are these rights? Here’s what the U.S. Copyright Office says:

  • To reproduce the work in copies or phonorecords;

  • To prepare derivative works based upon the work;

  • To distribute copies or phonorecords of the work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

  • To perform the work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works;

  • To display the copyrighted work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work; and

  • In the case of sound recordings, to perform the work publicly by means of a digital audio transmission.

A legal document, typically called a copyright transfer agreement, governs the copyright arrangements between you and the publisher and determines what rights you retain and what rights you transfer or grant to the publisher. The publisher may offer a single standard agreement or may have more than one agreement.

Whereas the publisher has had its agreement(s) written by copyright lawyers, you are not likely to be a copyright lawyer. This puts you at a disadvantage in terms of understanding, modifying, or replacing the publisher’s agreement. Therefore, it is very helpful to have documents written by copyright lawyers that you can use to modify or replace the publisher’s agreement, even if the organization providing such documents does so under a disclaimer that it is not providing "legal advice."

Ordered by increasing level of difficulty in getting publisher acceptance, here are the basic strategies for dealing with copyright transfer agreements:

  • If the publisher has multiple agreements, choose the one that has the author assigning and/or granting specific rights to the publisher (e.g., ALA Copyright License Agreement). Don’t choose the agreement where the author assigns, conveys, grants, or transfers all rights, copyright interest, copyright ownership, and/or title exclusively to the publisher (e.g., ALA Copyright Assignment Agreement).
  • If the publisher has a single agreement that assigns, conveys, grants, or transfers all rights, copyright interest, copyright ownership, and/or title exclusively to the publisher:

Of course, other strategies are possible. For example, you could use another type of open content license instead of the Science Commons Publication Agreement and Copyright License. However, you might want to keep it simple to start.

For more information on copyright transfer agreements, see Copyright Resources for Authors and Scholars Have Lost Control of the Process.

For a directory of publisher copyright and self-archiving policies, see Publisher Copyright Policies & Self-Archiving.

By the way, DigitalKoans doesn’t provide legal advice and the author is not a lawyer.

Open Access: Key Strategic, Technical and Economic Aspects Available on 7/17/06

Neil Jacobs has announced on several mailing lists that Open Access: Key Strategic, Technical and Economic Aspects, which he edited, will be available on July 17th. As you can see from the book’s contents below, the book’s contributors include many key figures in the open access movement. I’ve seen an early draft, and I believe this will be a very important book.

The book itself is not OA, but contributors retained their copyrights and they can individually make their papers available on the Internet. My contribution ("What Is Open Access?") is available in both HTML and PDF formats, and it is under a Creative Commons Attribution-NonCommercial 2.5 License.

So far, the US Amazon doesn’t list the book, but it is available from Amazon.co.uk in both paperback and hardback form.

The papers in the book are listed below.

  • "Overview of Scholarly Communication" by Alma Swan
  • "What Is Open Access?" by Charles W. Bailey, Jr.
  • "Open Access: A Symptom and a Promise" by Jean-Claude Guédon
  • "Economic Costs of Toll Access" by Andrew Odlyzko
  • "The Impact Loss to Authors and Research" by Michael Kurtz and Tim Brody
  • "The Technology of Open Access" by Chris Awre
  • "The Culture of Open Access: Researchers’ Views and Responses" by Alma Swan
  • "Opening Access By Overcoming Zeno’s Paralysis" by Stevan Harnad
  • "Researchers and Institutional Repositories" by Arthur Sale
  • "Open Access to the Research Literature: A Funder’s Perspective" by Robert Terry and Robert Kiley
  • "Business Models in Open Access Publishing" by Matthew Cockerill
  • "Learned Society Business Models and Open Access" by Mary Waltham
  • "Open All Hours? Institutional Models for Open Access" by Colin Steele
  • "DARE Also Means Dare: Institutional Repository Status in the Netherlands as of Early 2006" by Leo Waaijers
  • "Open Access in the USA" by Peter Suber
  • "Towards Open Access to UK Research" by Frederick J. Friend
  • "Open Access in Australia" by John Shipp
  • "Open Access in India" by D. K. Sahu and Ramesh C. Parmar
  • "Open Competition: Beyond Human Reader-Centric Views of Scholarly Literatures" by Clifford Lynch
  • "The Open Research Web" by Nigel Shadbolt, Tim Brody, Les Carr, and Stevan Harnad

Postscript:

The book is now available from the US Amazon in paperback and hardcover form.

Top Five Technology Trends

As usual, the LITA top 10 technology trends session at ALA produced some thought-provoking results. And, as usual, I have a somewhat different take on this question.

I’ll whittle my list down to five.

  • Digital Copyright Wars: Big media and publishers are far from finished changing copyright laws to broaden, strengthen, and lengthen the rights of copyright holders. And they are not yet done protecting their digital turf with punitive lawsuits either. One big copyright impact on libraries is digitization: you can only safely digitize what’s in the public domain or what you have permission for (and the permission process can be difficult or impossible). There’s always fair use of course, if you have the deep pockets and institutional backing needed to defend yourself (like Google does) or if your efforts are tolerated (like e-reserves has been so far, except for a few sub rosa publisher objections). In opposition to this trend is a movement by the Creative Commons and others to persuade authors, musicians, and other copyright holders to license their works in ways that permit liberal use and reuse of them.
  • DRM: The Sony BMG rootkit fiasco was a blow, but think again if you believe that this will stop DRM from controlling your digital content in the future. The trick is to get DRM embedded in your operating system, and to have every piece of computer hardware and every consumer digital device that can access and/or manipulate content to support it (or to refuse access to material protected by unsupported DRM schemes). That’s a tall order, but incremental progress is likely to continue to be made towards this goal. Big media will continue to try to pass laws that mandate certain types of DRM and, like the DMCA, protect its use.
  • Internet Privacy: If you believe this still exists on the Internet, you are either using anonymous surfing services or you haven’t been paying attention. Net monitoring will become far more effective if ISPs can be persuaded or required to retain user-specific Internet activity logs. Would you be upset if every licensed e-document that your library users read could be traced back to them? Unless you still offer unauthenticated Internet access in your library, that may depend upon your retention of login records and whether you are legally compelled to reveal them.
  • Net Neutrality: If ISPs can create Internet speed lanes, you don’t want your library or digital content provider to be in the slow one. Hope you (or they) can pay for the fast one. But Net neutrality issues don’t end there: there are issues of content/service blockage and differential service based on fees as well.
  • Open Access: If there is a glimmer of hope on the horizon for the scholarly communication crisis, it’s open access. Efforts to produce alternative low-cost journals are important and deserve full support, but the open access movement’s impact is far greater, and it offers global access to scholars whose institutions may not be able to pay even modest subscription fees and to unaffiliated individuals.

"Strong Copyright + DRM + Weak Net Neutrality = Digital Dystopia?" Preprint

A preprint of my "Strong Copyright + DRM + Weak Net Neutrality = Digital Dystopia?" paper is now available.

It will appear in Information Technology and Libraries 25, no. 3 (2006).

This quote from the paper’s conclusion sums it up:

What this paper has said is simply this: three issues—a dramatic expansion of the scope, duration, and punitive nature of copyright laws; the ability of DRM to lock-down content in an unprecedented fashion; and the erosion of Net neutrality—bear careful scrutiny by those who believe that the Internet has fostered (and will continue to foster) a digital revolution that has resulted in an extraordinary explosion of innovation, creativity, and information dissemination. These issues may well determine whether the much-touted "information superhighway" lives up to its promise or simply becomes the "information toll road" of the future, ironically resembling the pre-Internet online services of the past.

For those who want a longer preview of the paper, here’s the introduction:

Blogs. Digital photo and video sharing. Podcasts. Rip/Mix/Burn. Tagging. Vlogs. Wikis. These buzzwords point to a fundamental social change fueled by cheap PCs and servers, the Internet and its local wired/wireless feeder networks, and powerful, low-cost software: citizens have morphed from passive media consumers to digital media producers and publishers.

Libraries and scholars have their own set of buzzwords: digital libraries, digital presses, e-prints, institutional repositories, and open access journals, to name a few. They connote the same kind of change: a democratization of publishing and media production using digital technology.

It appears that we are on the brink of an exciting new era of Internet innovation: a kind of digital utopia. Dr. Gary Flake of Microsoft has provided one striking vision of what could be (with a commercial twist) in a presentation entitled "How I Learned to Stop Worrying and Love the Imminent Internet Singularity," and there are many other visions of possible future Internet advances.

When did this metamorphosis begin? It depends on who you ask. Let’s say the late 1980s, when the Internet began to get serious traction and an early flowering of noncommercial digital publishing occurred.

In the subsequent twenty-odd years, publishing and media production went from being highly centralized, capital-intensive analog activities with limited and well-defined distribution channels to being diffuse, relatively low-cost digital activities with the global Internet as their distribution medium. Not to say that print and conventional media are dead, of course, but it is clear that their era of dominance is waning. The future is digital.

Nor is it to say that entertainment companies (e.g., film, music, radio, and television companies) and information companies (e.g., book, database, and serial publishers) have ceded the digital content battlefield to the upstarts. Quite the contrary.

High-quality thousand-page-per-volume scientific journals and Hollywood blockbusters cannot be produced for pennies, even with digital wizardry. Information and entertainment companies still have an important role to play, and, even if they didn’t, they hold the copyrights to a significant chunk of our cultural heritage.

Entertainment and information companies have understood for some time that they must adapt to the digital environment or die, but this change has not always been easy, especially when it involves concocting and embracing new business models. Nonetheless, they intend to thrive and prosper, and to do whatever it takes to succeed. As they should, since they have an obligation to their shareholders to do so.

The thing about the future is that it is rooted in the past. Culture, even digital culture, builds on what has gone before. Unconstrained access to past works helps determine the richness of future works. Inversely, when past works are inaccessible except to a privileged minority, it impoverishes future works.

This brings us to a second trend that stands in opposition to the first. Put simply, it is the view that intellectual works are "property"; that this property should be protected with the full force of civil and criminal law; that creators have perpetual, transferable property rights; and that contracts, rather than copyright law, should govern the use of intellectual works.

A third trend is also at play: the growing use of Digital Rights Management (DRM) technologies. When intellectual works were in paper form (or other tangible forms), they could only be controlled at the object-ownership or object-access levels (a library controlling the circulation of a copy of a book is an example of the second case). Physical possession of a work, such as a book, meant that the user had full use of it (e.g., the user could read the entire book and photocopy pages from it). When works are in digital form and they are protected by some types of DRM, this may no longer be true. For example, a user may only be able to view a single chapter from a DRM-protected e-book and may not be able to print it.

The fourth and final trend deals with how the Internet functions at its most fundamental level. The Internet was designed to be content, application, and hardware "neutral." As long as certain standards were met, the network did not discriminate. One type of content was not given preferential delivery speed over another. One type of content was not charged for delivery while another wasn’t. One type of content was not blocked (at least by the network) while another wasn’t. In recent years, "network neutrality" has come under attack.

The collision of these trends has begun in courts, legislatures, and the marketplace. It is far from over. As we shall see, its outcome will determine what the future of digital culture looks like.

A Simple Search Hit Comparison for Google Scholar, OAIster, and Windows Live Academic Search

Given that Windows Live Academic Search’s content is limited to computer science, electrical engineering, and physics journals and conferences, a direct comparison of it with other search engines is somewhat difficult.

Although its limitations should be clearly recognized, the following simple experiment in comparing the number of hits for Google Scholar, OAIster (a search engine that indexes open access literature, such as e-prints), and Windows Live Academic Search may help to shed some light on their differences. (Note that OAIster does not typically include content directly provided by commercial publishers, although it does include e-prints for a large number of articles published in academic journals.)

The search is for: "OAI-PMH" (entered without quotes).

"OAI-PMH" being, of course, the Open Archives Initiative Protocol for Metadata Harvesting. This is a highly specific search, where many, but not all, hits should fall within the subjects covered by Windows Live Academic Search. A major area that might not be covered is library and information science literature.

To get a better feel for the baseline published literature about OAI-PMH, let’s first do some searching for that term in specialized commercial databases.

  • ACM Digital Library (description): 51 hits.
  • Engineering Village 2 (description): 66 hits.
  • Information Science & Technology Abstracts (description): 36 hits.
  • Library Literature & Information Science Index/Full Text (description): 13 hits.

Now, the search engines in question (the links for the below search engine names are for the search, not the search engine):

  • Google Scholar: 542 hits.
  • OAIster: 180 hits.
  • Windows Live Academic Search: 74 hits.

So, what have we learned? Windows Live Academic Search has a somewhat higher number of hits than the selected commercial databases and, if adjusted downward to count publisher versions only (see below), is on the high end of their range. This suggests that it covers the toll-based published literature very well. However, it has a significantly lower number of hits than OAIster and Google Scholar, suggesting that its coverage of open access literature may be weaker than Google Scholar’s and is quite likely weaker than OAIster’s.

Of the 74 hits for the "OAI-PMH" search in Windows Live Academic Search, 54 (73%) were "published versions" (i.e., publisher-supplied works); 20 (27%) were not (i.e., e-prints). Scanning the "Results by Institution" sidebar, it appears that 100% of OAIster’s 180 hits were from open access sources; I didn’t check them all. I didn’t try to break down the 542-hit Google Scholar search result, which has a mix of toll-based and open access materials, although it would be quite interesting to do so. It should be clear that a sample of one search term is a very crude measure (and that this posting won’t grace the pages of JASIST anytime soon).
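For anyone who wants to rerun the arithmetic, here is a minimal Python sketch that uses only the hit counts quoted in this posting. The variable names are mine, and rounding to whole percentages is an assumption about how the 73% and 27% figures were produced.

```python
# Hit counts quoted in this posting (names are illustrative).
commercial_databases = {
    "ACM Digital Library": 51,
    "Engineering Village 2": 66,
    "Information Science & Technology Abstracts": 36,
    "Library Literature & Information Science Index/Full Text": 13,
}
wlas_total = 74        # Windows Live Academic Search hits
wlas_published = 54    # hits marked as "published versions"
wlas_eprints = wlas_total - wlas_published

# Shares of the Windows Live Academic Search result set, in whole percent.
published_share = round(100 * wlas_published / wlas_total)
eprint_share = round(100 * wlas_eprints / wlas_total)

# Commercial databases that returned more hits than the publisher-only count.
databases_above = [
    name for name, hits in commercial_databases.items() if hits > wlas_published
]

print(published_share, eprint_share, databases_above)
```

Only one of the four commercial databases exceeds the publisher-only count, which is what "on the high end" amounts to here.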

Of course, this simple experiment tells us nothing about the presence of duplicate entries for the same work in search result sets, which could be important for a meaningful open access comparison. Consider, for example, this group of 11 hits for "A Scalable Architecture for Harvest-Based Digital Libraries—The ODU/Southampton Experiments" from the Google "OAI-PMH" search.

Nor does it tell us the number of items that are not journal articles (or e-prints for them) or conference papers.

An apples-to-apples comparison would adjust for useless duplicates and non-journal/conference literature. (But, of course, it would be quite useful if Windows Live Academic Search had non-journal/conference literature such as technical reports in it.)
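To make the duplicate-entry adjustment concrete, here is a rough Python sketch of one way it could be done: group hits whose titles match after normalization and count the groups. Matching on normalized titles is my assumption, not a method any of these search engines is known to use, and the sample hits and field names are made up for illustration.

```python
import re
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def group_duplicates(hits: list[dict]) -> dict[str, list[dict]]:
    """Group hits that share the same normalized title."""
    groups = defaultdict(list)
    for hit in hits:
        groups[normalize_title(hit["title"])].append(hit)
    return dict(groups)


# Made-up sample hits; "title" and "source" are hypothetical field names.
hits = [
    {"title": "A Scalable Architecture for Harvest-Based Digital Libraries", "source": "publisher"},
    {"title": "A scalable architecture for harvest-based digital libraries.", "source": "e-print"},
    {"title": "Another Paper About OAI-PMH", "source": "e-print"},
]

groups = group_duplicates(hits)
unique_count = len(groups)
print(len(hits), "hits,", unique_count, "distinct works")
```

A real analysis would also need to handle variant titles and to decide whether a publisher version and its e-print count as one work or two.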

However, given the small hit sets, it would not be impossible for someone else to do a deeper analysis on the duplicate entry question and some other tractable questions.

Windows Live Academic Search Is Up

The beta version of Windows Live Academic Search is now available: http://academic.live.com/. It appears to me that the system is under a very heavy load, so you may want to wait a bit before giving it a test drive.

The Windows Live Academic Search development team now has a Weblog. The official press release is now available. A list of participants in Microsoft’s MSN Search Champs V4, some of whom gave Microsoft detailed feedback about Windows Live Academic Search, is also available.

Windows Live Academic Search Overview

The home page provides a search box, brief overview, an explanation of search results, and an FAQ.

The system’s indexed content is limited to Computer Science, Electrical Engineering, and Physics journals and conferences, including: "6 million records from approximately 4300 journals and 2000 conferences." Here is a list of works indexed. Relevance is determined by: (1) "quality of match of the search term with the content of the paper", and (2) "authoritativeness of the paper." Citation count is not being used in the ranking algorithm at this time.

Interface

The interface has the following key features:

  • Search box: My understanding is that all MSN search commands work, but I have not tested this.
  • Slider bar: Expands or restricts the amount of information shown for each hit in the search results (left side of screen).
  • Search results:
    • Author and title information in hits are linked (blue links).
    • Other hit links include search the Web for the item, CiteSeer citations (if available), show abstract, and hide abstract (grey links).
  • Sort by: You can sort search results by relevance, date (oldest), date (newest), author, journal, and conference. The last three sorts provide a header that precedes the listed search results: for example, John Doe (2).
  • "+add to Live.com": Adds the search to your Windows Live page. Three clickable buttons appear above the stored search on that page: Web, News, Feeds. When one is clicked, the search is repeated in the appropriate information source (e.g. RSS feeds).
  • Preview pane: The right side of the screen is used to display the fielded abstract, BibTeX-formatted abstract, or EndNote-formatted abstract.
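
For readers who haven’t seen a BibTeX-formatted record, here is a small Python sketch of what such an export looks like. The record, the citation key, and the `to_bibtex` helper are all hypothetical; this is not how Windows Live Academic Search produces its output, which I haven’t examined.

```python
def to_bibtex(key: str, fields: dict[str, str]) -> str:
    """Format a citation key and field values as a BibTeX @article entry."""
    lines = [f"@article{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


# Made-up record for illustration only.
record = {
    "author": "Doe, John",
    "title": "An Example Article",
    "journal": "Journal of Examples",
    "year": "2006",
}

entry = to_bibtex("doe2006example", record)
print(entry)
```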

Highlights from the Home Page FAQ

  • More content?: "We are not ready to provide a detailed timeline on when we will have a comprehensive index by subject."
  • OpenUrl: "You will be able to click on the link to your library Open URL resolver to determine the availability of full text access."
  • Preferences: "While the version of Academic search does not have a preference page, future versions will have that functionality."

Highlights from Windows Live Academic Information: Librarians

  • OpenURL: "If Academic search can identify that a user is affiliated with your institution, appropriate search results will be accompanied by a link to your OpenURL resolver vendor. We request that you work with your link resolver company and give them permission to provide the necessary information about your institution to us."
  • RSS feeds for searches: "When a new article related to that search is posted, they [researchers] are alerted instantly via an RSS feed."
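
To show roughly what a link to an OpenURL resolver involves, here is a Python sketch that builds one from citation metadata. It assumes the older key/value style of OpenURL (keys such as genre, atitle, title, volume, spage, and date); the resolver address and the citation are made up, and how Windows Live Academic Search actually constructs its links is not stated in the material quoted above.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; a library would use its own link resolver's URL.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(resolver_base: str, citation: dict[str, str]) -> str:
    """Append URL-encoded citation metadata to a link resolver's base URL."""
    return f"{resolver_base}?{urlencode(citation)}"


# Made-up citation for illustration only.
citation = {
    "genre": "article",
    "atitle": "An Example Article",
    "title": "Journal of Examples",
    "volume": "1",
    "spage": "10",
    "date": "2006",
}

link = build_openurl(RESOLVER_BASE, citation)
print(link)
```

The resolver, not the search engine, then decides whether the user’s institution has full-text access to the item.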

Highlights from Windows Live Academic Information: Publishers

  • Participation: "Talk to Crossref about the Crossref/Academic search partnership to receive information on the program and instructions on how to initiate participation."
  • Search Results: "Therefore, search results from journals that are indexed from publishers and are published articles will always be marked ‘Published Version’. This ensures that users know which result is the official version. If there are many instances of the same article (from other sources such as a web-crawl), we will always link the search result heading to the version on the publisher’s site."
  • Abstract information: "We require that non-subscribers at least see an abstract of the paper when they land on your site from our search results page." However, it also says:

    "Academic search provides for three levels of display in the preview pane:

    • Full abstract
    • First 140 characters from abstract
    • Nothing from the abstract

    Publishers can choose any of these options for their content."

Microsoft’s Windows Live Academic Search

Microsoft will be releasing Windows Live Academic Search shortly (I was recently told Wednesday; the blog buzz is saying tomorrow).

As is typical with such software projects, the team is doing some last minute tweaking before release. So, I won’t try to describe the system in any detail at this point, except to say that it integrates access to published articles with e-prints and other open access materials, it provides a reference export capability, there’s a cool optional two-pane view (short bibliographic information on the left; full bibliographic information and abstract on the right), and it supports search "macros" (user-written search programs).

What I will say is this: Microsoft made a real effort to get significant, honest input from the librarian and publisher communities during the development process. I know, because, now that the nondisclosure agreement has been lifted, I can say that I was one of the librarians who provided such input on an unpaid basis. I was very impressed by how carefully the development team listened to what we had to say, how sharp and energetic they were, how they really got the Web 2.0 concept, and how deeply committed they were to creating the best product possible. Having read Microserfs, I had a very different mental picture of Microsoft than the reality I encountered.

Needless to say, there were lively exchanges of views between librarians and publishers when open access issues were touched upon. My impression is that the team listened to both sides and tried to find the happy middle ground.

When it’s released, Windows Live Academic Search won’t be the perfect answer to your open access search engine dreams (what system is?), and Microsoft knows that there are missing pieces. But I think it will give Google Scholar a run for its money. I, for one, heartily welcome it, and I think it’s a good base to build upon, especially if Microsoft continues to solicit and seriously consider candid feedback from the library and publisher communities (and it appears that it will).

Open Source Software for Publishing E-Journals

Want to publish an open access journal, but you don’t want to license a commercial journal management system, develop your own system, or do it all by tedious HTML hand-coding? Here’s summary information about two existing open source e-journal management systems (and one emerging system) that may do the trick.

HyperJournal

  • "HyperJournal is a software application that facilitates the administration of academic journals on the Web. Conceived for researchers in the Humanities and designed according to an intuitive and elegant layout, it permits the installation, personalization, and administration of a dedicated Web site at extremely low cost and without the need for special IT-competence. HyperJournal can be used not only to establish an online version of an existing paper periodical, but also to create an entirely new, solely electronic journal."
  • Overview
  • Documentation
  • Download

Open Journal Systems, Public Knowledge Project

  • "Open Journal Systems (OJS) is a journal management and publishing system that has been developed by the Public Knowledge Project through its federally funded efforts to expand and improve access to research. OJS assists with every stage of the refereed publishing process, from submissions through to online publication and indexing. Through its management systems, its finely grained indexing of research, and the context it provides for research, OJS seeks to improve both the scholarly and public quality of refereed research."
  • Open Journal Systems (Overview)
  • FAQ
  • OJS Technical Reference
  • Download

DPubS (Digital Publishing System), Cornell University Library (In development)

  • "DPubS’ ground-breaking software system will enable publishers to cost-effectively organize, deliver, present and publish scholarly journals, monographs, conference proceedings, and other common and evolving means of academic discourse."
  • About DPubS
  • FAQ

Postscript: Peter Suber suggests adding several other software packages, including:

  1. ePublishing Toolkit
  2. SciX Open Publishing Services (SOPS)

Scholarly Communication Web Sites and Weblogs at ARL Libraries (Version 2)

This posting updates and considerably expands my earlier "Scholarly Communication Web Sites at ARL Libraries" posting.

It presents a list of scholarly communication Web sites and Weblogs at the academic member libraries of the Association of Research Libraries. Web sites and Weblogs were identified using separate Google "site:" searches for the exact phrases "scholarly communication" and "open access." Search results were then scanned to identify Web sites or Weblogs that appeared to be intended as the main library’s primary means of communicating to the university community about scholarly communication and/or open access issues. Conferences, presentations, newsletter articles, symposiums, and similar materials were excluded, as were Web sites or Weblogs at branch libraries. Searching was limited to the first few pages of search results.
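For anyone who wants to repeat the searches, here is a small Python sketch that generates the two query strings for a library’s domain. The domain shown is hypothetical, and the function name is mine.

```python
# The two exact phrases used in the searches described above.
PHRASES = ["scholarly communication", "open access"]


def site_queries(domain: str) -> list[str]:
    """Return one Google "site:" query per exact phrase for the given domain."""
    return [f'site:{domain} "{phrase}"' for phrase in PHRASES]


# Hypothetical library domain, for illustration only.
for query in site_queries("lib.example.edu"):
    print(query)
```

Each query would be pasted into Google as-is; the double quotes force the exact-phrase match.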

Additions and corrections are welcome. Use the "Leave a Comment" function for this.

All Sections and Subsections of the Open Access Bibliography Now Linked

There is now a link to each section and subsection of the HTML version of the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals.

For example:

4 Open Access Journals

4.1 General Works
4.2 Economic Issues
4.2.1 General Works
4.2.2 BMJ Rapid Responses about "Author Pays" May Be the New Science Publishing Model
4.3 Open Access Journal Change Agents
4.3.1 SPARC
4.4 Open Access Journal Publishers and Distributors
4.4.1 BioMed Central
4.4.2 Public Library of Science
4.4.3 PubMed Central
4.4.3.1 General Works
4.4.3.2 Science Magazine dEbate on "Building a GenBank of the Published Literature"
4.4.3.3 Science Magazine dEbate on "Is a Government Archive the Best Option?"
4.4.3.4 Science Magazine dEbate on "Just a Minute, Please"
4.4.3.5 Other
4.5 Specific Open Access Journals
4.5.1 Journals in the Directory of Open Access Journals
4.5.2 Pioneering Free E-Journals Not in the DOAJ
4.5.3 Other
4.6 Research Studies

The table of contents in the home page of the bibliography has a complete set of links for all sections and subsections of the document.

The Web page for each major section of the bibliography has links to the subsections (if present) at the start of the page.

HTML Version of the Open Access Bibliography

An HTML version of the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals (OAB) is now available.

The HTML version of the book was created from the final draft using a complex set of digital transformations. Consequently, there may be minor variations between it and the print and Acrobat versions, which are the definitive versions of the book.

The OAB provides an overview of open access concepts, and it presents over 1,300 selected English-language books, conference papers (including some digital video presentations), debates, editorials, e-prints, journal and magazine articles, news articles, technical reports, and other printed and electronic sources that are useful in understanding the open access movement’s efforts to provide free access to and unfettered use of scholarly literature. Most sources have been published between 1999 and August 31, 2004; however, a limited number of key sources published prior to 1999 are also included. Where possible, links are provided to sources that are freely available on the Internet (approximately 78 percent of the bibliography’s references have such links).

HTML Version of "What Is Open Access?"

An HTML version of my "What Is Open Access?" preprint is now available. This version includes additional links in the body of the document that make it easier to quickly access related information about OA concepts, documents, or systems. While it makes many footnote links available in the body of the document (as well as new ones), it is not an attempt to replicate all footnote links in it.

This paper presents a more nuanced, contemporary view of open access than my "Key Open Access Concepts" excerpt from the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals; however, it had to be very compact to meet the publisher’s needs, and it omits some topics discussed in the earlier document.

Those wanting a more in-depth recent treatment might want to try the first half of my "Open Access and Libraries" preprint, which covers much of this material more fully as a preliminary to discussing the relationship between open access and library functions and operations. However, the "What Is Open Access?" paper reflects some changes in my thinking about OA not found in "Open Access and Libraries."

A PDF version of "What Is Open Access?" is also available, which is more suitable for printing and reading offline.

"What Is Open Access?" will appear in: Jacobs, Neil, ed. Open Access: Key Strategic, Technical and Economic Aspects. Oxford: Chandos Publishing, 2006. It is under a Creative Commons Attribution-NonCommercial 2.5 License.

"What Is Open Access?" Preprint

A preprint of my book chapter "What Is Open Access?" is now available. This chapter provides a brief overview of open access (around 4,800 words). It examines the three base definitions of open access; notes other key OA statements; defines and discusses self-archiving, self-archiving strategies (author Websites, disciplinary archives, institutional-unit archives, and institutional repositories), and self-archiving copyright practices; and defines and discusses open access journals and the major types of OA publishers (born-OA publishers, conventional publishers, and non-traditional publishers). It will appear in: Jacobs, Neil, ed. Open Access: Key Strategic, Technical and Economic Aspects. Oxford: Chandos Publishing, 2006. It is under a Creative Commons Attribution-NonCommercial 2.5 License.

Open Access Bibliography Author and Title Indexes Are Now Available

Author and title indexes for the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals are now available.

These indexes, which include complete references, were initially generated in EndNote, then refined through a lengthy production process using several text editing programs to produce the final HTML files.

"Open Access and Libraries" Preprint

A preprint of my forthcoming book chapter "Open Access and Libraries" is now available.

The preprint takes an in-depth look at the open access movement with special attention to the perceived meaning of the term “open access” within it, the use of Creative Commons Licenses, and real-world access distinctions between different types of open access materials. After a brief consideration of some major general benefits of open access, it examines OA’s benefits for libraries and discusses a number of ways that libraries can potentially support the movement, with a consideration of funding issues.

It will appear in: Jacobs, Mark, ed. Electronic Resources Librarians: The Human Element of the Digital Information Age. Binghamton, NY: Haworth Press, 2006.

Postscript: A new preprint is available. I have added more content specific to the impact of OA on electronic resources librarians’ jobs and an appendix on the Creative Commons. Also, I have added another way that OA can save libraries money. I’ve changed the above link to the new preprint; the old one is still available; however, I would recommend reading the new one instead.

Post-Postscript: Having two versions of the preprint available has caused some confusion, so I have taken down the earlier version.

Open Access Bibliography and The Access Principle Discount at Amazon

Amazon is offering the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals and John Willinsky’s insightful The Access Principle: The Case for Open Access to Research and Scholarship together for a discounted price of $68.07 (vs. the normal $79.95). See the OAB Amazon record for the link. (Note: By my request, I do not profit from sales of the print version of the OAB; all proceeds go to ARL to subsidize the print version.)

The E-Print Deposit Conundrum

How can scholars be motivated to deposit e-prints in disciplinary archives, institutional repositories, and other digital archives?

In "A Key-Stroke Koan for Our Open-Access Times," Stevan Harnad says:

Researchers themselves have hinted at the resolution to this koan: Yes, they need and want OA. But there are many other demands on their time too, and they will only perform the requisite keystrokes if their employers and/or funders require them to do it, just as it is already their employers and funders who require them to do the keystrokes to publish (or perish) in the first place. It is employers and funders who set researchers’ priorities, because it is employers and funders who reward researchers’ performance. Today, about 15% of research is self-archived spontaneously but 95% of researchers sampled report that they would self-archive if required to do so by their employers and/or funders: 81% of them willingly, 14% reluctantly; only 5% would not comply with the requirement. And in the two objective tests to date of this self-reported prediction, both have fully confirmed it, with over 90% self-archiving in the two cases where it was made a requirement (Southampton-ECS and CERN).

This is a very cogent point, but, if the solution to the problem is to have scholars’ employers compel them to deposit e-prints, the next logical question is: how can university administrators and other key decision makers be convinced to mandate this activity?

In the UK, a debate is raging between OA advocates and publishers about the UK Research Funding Councils’ (RCUK) self-archiving proposal, which would "mandate the web self-archiving of authors’ final drafts of all journal articles resulting from RCUK-funded research." The fact that this national policy debate is occurring at all is an enormous advance for open access. If RCUK mandates e-print deposit, UK university administrators will need no convincing.

In the US, we are a long way from reaching that point, although the NIH’s voluntary e-print deposit policy provides some faint glimmer of hope that key government agencies can be moved to take some kind of action. However, the US does not have an equivalent to RCUK that can make dramatic e-print policy changes that affect research universities in one fell swoop. It does have government agencies, such as NSF, that control federal grant funds, private foundations that control their own grant funds, and thousands of universities and colleges that, in theory, could establish policies. This is a diffuse and varied audience for the OA message to reach and convince, and the message will need to be tailored to the audience to be effective.

While that plays out, we should not forget scholars themselves, however dim we judge the prospects of changing their behavior to be. University librarians and IT staff know their institutions’ scholars and can work with them one-on-one or in groups to gradually influence change. True, it’s "a journey of a thousand miles" approach, but the number of librarians and IT staff who will be effective on a national stage is small, while the number who may be incrementally effective on the local level is large. The efforts are complementary, not mutually exclusive.

I would urge you to read Nancy Fried Foster and Susan Gibbons’ excellent article "Understanding Faculty to Improve Content Recruitment for Institutional Repositories" for a good example of how an IR can be personalized so that faculty have a greater sense of connection to it and how IR staff can change the way they talk about the IR to better match scholars’ world view.

Here are a few brief final thoughts.

First, as is often said, scholars care about the impact of their work. If they could easily see detailed use statistics for their works (e.g., number of requests and domain breakdowns), they might be more inclined to deposit items, especially if those statistics exceed their expectations. So, the challenge here is to incorporate this capability into commonly used archiving software programs if it is absent.
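To make the idea concrete, here is a minimal Python sketch of the kind of per-work report an archive might show an author: a request count plus a breakdown by top-level domain. The request records, item identifiers, and function names are all invented for illustration; a real archive would derive the records from its own server logs.

```python
from collections import Counter

# Hypothetical request records: (item identifier, requesting host name).
REQUESTS = [
    ("eprint-101", "lib.example.edu"),
    ("eprint-101", "proxy.example.ac.uk"),
    ("eprint-101", "host.example.com"),
    ("eprint-202", "cs.example.edu"),
]


def top_level_domain(host):
    """Return the last dot-separated label of a host name, lowercased."""
    return host.rsplit(".", 1)[-1].lower()


def use_statistics(records, item_id):
    """Count requests for one item and break them down by top-level domain."""
    hosts = [host for item, host in records if item == item_id]
    return {
        "requests": len(hosts),
        "domains": Counter(top_level_domain(host) for host in hosts),
    }


stats = use_statistics(REQUESTS, "eprint-101")
print(stats["requests"], dict(stats["domains"]))
```

Even a report this simple would let an author see at a glance how often a deposited work is requested and from what kinds of sites.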

Second, scholars are unlikely to stumble when entering bibliographic data about their works (although it might not be quite as fully descriptive as purists might like), but entering subject keywords is another matter. Sure, they know what the work is about, but are they using terms that others would use and that group their work with similar works in retrieval results? Yes, a controlled vocabulary would help, although such vocabularies have their own challenges. But I wonder if user-generated "tags," such as those used in Technorati, might be another approach. The trick here is to make the tags and the frequency of their use visible to both authors and searchers. For authors, this helps them put their works where they will be found. For searchers, it helps them find the works.
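As a rough sketch of what "making the tags and the frequency of their use visible" could mean in practice, the Python below counts how many works carry each tag and suggests existing tags that match what an author starts to type. The works, tags, and function names are made up for the example, and the simple case-folding shown is an assumption, not a recommendation for how tags should be normalized.

```python
from collections import Counter

# Hypothetical works and the tags their authors assigned.
TAGGED_WORKS = {
    "work-1": ["Open Access", "e-prints"],
    "work-2": ["open access", "repositories"],
    "work-3": ["OA", "open access journals"],
}


def tag_frequencies(tagged_works):
    """Count how many works use each tag, ignoring case and stray spaces."""
    counts = Counter()
    for tags in tagged_works.values():
        # A set ensures a tag repeated on one work is counted only once.
        counts.update({tag.strip().lower() for tag in tags})
    return counts


def suggest_tags(counts, prefix, limit=5):
    """List existing tags starting with the prefix, most used first."""
    prefix = prefix.strip().lower()
    matches = [(tag, n) for tag, n in counts.items() if tag.startswith(prefix)]
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))[:limit]


counts = tag_frequencies(TAGGED_WORKS)
print(suggest_tags(counts, "open"))
```

The same counts could drive both a deposit form (showing authors which tags are already in use) and a search interface (showing searchers which tags lead to the most works).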

Third, it might be helpful if an author could fill out a bibliographic template for a work once and, with a single keystroke, submit it to multiple designated digital archives and repositories. So, for example, a library author might choose to submit a work to his or her institutional repository, DLIST, and E-LIS all at once. Of course, this would require a minimal level of standardization of template information between systems and the development of appropriate import capabilities. Some will say: "why bother?" True, OAI-PMH harvesting should, in theory, make duplicate deposit unnecessary given OAIster-like systems. But "lots of copies keep stuff safe," and users still take a single-archive searching approach in spite of OAI-PMH systems.
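Here is a minimal Python sketch of the "fill out once, submit to many" idea. It assumes each archive publishes a simple mapping from shared template fields to its own field names. Every field name and mapping below is hypothetical (none is taken from the actual DLIST, E-LIS, or any repository software), and the code only prepares the records; it does not contact any system.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Template:
    """A bibliographic template the author fills out once."""
    title: str
    creators: list
    date: str
    keywords: list = field(default_factory=list)


# Hypothetical field maps: template field -> field name an archive expects.
FIELD_MAPS = {
    "institutional-repository": {
        "title": "dc.title", "creators": "dc.creator",
        "date": "dc.date", "keywords": "dc.subject",
    },
    "DLIST": {
        "title": "title", "creators": "authors",
        "date": "year", "keywords": "keywords",
    },
    "E-LIS": {
        "title": "title", "creators": "creators",
        "date": "date_issued", "keywords": "subjects",
    },
}


def prepare_deposits(template, archives):
    """Translate one template into one record per designated archive."""
    record = asdict(template)
    deposits = {}
    for archive in archives:
        mapping = FIELD_MAPS[archive]  # raises KeyError for an unknown archive
        deposits[archive] = {mapping[name]: value for name, value in record.items()}
    return deposits


paper = Template("An Example Paper", ["A. Author"], "2005", ["open access"])
for archive, record in prepare_deposits(paper, list(FIELD_MAPS)).items():
    print(archive, record)
```

The per-archive field maps are where the "minimal level of standardization" mentioned above would live; each system would still need an import capability to accept the prepared record.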

Searchable Version of the Open Access Webliography

Jim Pitman, Professor of Statistics and Mathematics at the University of California, Berkeley, has created a derivative work from the Open Access Webliography, which is under a Creative Commons Attribution-NonCommercial License.

This version of the OAW utilizes the BibServer software, and it is searchable. There are four views of the entries:

  • Bookmark: A link to the resource.
  • Plain text: A field-oriented ASCII presentation of the resource with active links in the description field.
  • Linked text: A field-oriented HTML presentation of the resource with complete active links.
  • Descriptions: The resource name and description with active links.

Entries can be sorted by category, description, title, and URL.

Thanks, Jim.

The Role of Reference Librarians in Institutional Repositories

Reference Services Review 33, no. 3 (2005) is a special issue on "the role of the reference librarian in the development, management, dissemination, and sustainability of institutional repositories (IRs)." It includes the following articles (the links are to e-prints):

Open Access Webliography

A preprint of the article "Open Access Webliography" by Adrian K. Ho and Charles W. Bailey, Jr. is now available. This annotated webliography presents a wide range of electronic resources related to the open access movement that were freely available on the Internet as of April 2005.

This article appears in Reference Services Review 33, no. 3 (2005), which is a special issue about "the role of the reference librarian in the development, management, dissemination, and sustainability of institutional repositories."

A preprint of my "The Role of Reference Librarians in Institutional Repositories" article in this issue is also available.

Both preprints are under the Creative Commons Attribution-NonCommercial License.

Below is a list of the topics covered in the webliography:

  • Starting Points
  • Bibliographies
  • Debates
  • Directories—E-Prints, Institutional Repositories, and Technical Reports
  • Directories—Open Access and Free Journals
  • Directories and Guides—Copyright and Licensing
  • Directories and Guides—Open Access Publishing
  • Directories and Guides—Software
  • Disciplinary Archives
  • E-Serials about Open Access
  • Free E-Serials That Frequently Publish Open Access Articles
  • General Information
  • Mailing Lists
  • Organizations
  • Projects
  • Publishers and Distributors
  • Search Engines
  • Special Programs for Developing Countries
  • Statements
  • Weblogs

The Economics of Free, Scholar-Produced E-Journals

While highly visible, large-scale STM open access publishing ventures such as BioMed Central loom large in the free e-journal scene, small-scale scholar-produced e-journals continue to quietly publish new scholarly articles as they have done for at least 18 years now.

I won’t detour into a lengthy history lesson for those readers who weren’t there. The short version of the story is that New Horizons in Adult Education is typically seen as the first scholarly e-journal published on the Internet (it was established in Fall 1987); however, it’s important to recognize that those were primitive times Internet-wise, when distributing ASCII article files via list servers and FTP servers was a cutting-edge venture. So, as you would imagine, finding tools were informal and few and far between. ARL’s publication of the Directory of Electronic Journals, Newsletters, and Academic Discussion Lists in July 1991 was a landmark event that made the invisible visible.

For some reason, there was a mini-surge of activity in the 1989-1991 period, with the emergence of the Bryn Mawr Classical Review, EJournal, Electronic Journal of Communication, Journal of the International Academy of Hospitality Research, Postmodern Culture, Psycoloquy, The Public-Access Computer Systems Review, Surfaces, and other journals. Several editors (myself, Stevan Harnad, and John Unsworth) rocked the house at the Association of Research Libraries’ 1992 Symposium on Scholarly Publishing on the Electronic Networks to the dismay of the assembled conventional publishers, who thought we were mad as hatters because we thought that: (a) e-journals were viable, (b) we could anoint ourselves as publishers, and (c) we were giving it away for free. My recollection is that, after the last speech, there was a stunned silence followed by a smattering of applause and a frenzy of generally hostile, astonished questions.

And, as they say, the rest is history. Peter Suber’s Timeline of the Open Access Movement is a good way to get a handle on subsequent events. Someday, I’ll write more about the early e-volution of e-journals.

So, on to the topic at hand. What are the economics of free, scholar-produced e-journals?

Let’s delimit the field a bit. We are not talking about journals produced by university presses or professional associations. Scholar-produced e-journals are generally labors of love, supported by a small group of scholars who serve without pay as editors, editorial board members, and journal production staff.

They often leverage existing technical infrastructure (e.g., Web servers) at the editors’ institutions. The volume of published papers is typically fairly modest, and the papers themselves are frequently not graphically complex. Editors or other volunteers manage the peer review process (usually via electronic means) as well as copy edit and format articles. HTML and PDF are the usual distribution formats, requiring HTML editors, Word, Acrobat, or similar low-cost or free programs. Increasingly, electronic journal management systems are used to automate editorial functions and simplify journal site creation and maintenance (a prime example is the free Open Journal Systems software). "Marketing" is often done by free electronic means: journal mailing lists, table of contents messages sent to targeted subject-related mailing lists, RSS alerts, etc. Since the content is free and electronic, there is no overhead for subscription/licensing management. Since no one gets paid, human resources functions are not needed. If authors retain copyright or content is under a Creative Commons or similar license, no permissions support is needed. Since existing facilities are used (at work or at home), there is no need to rent or purchase office space. Since no money is changing hands in any form, accounting support is unnecessary.

So, what are the economics of free, scholar-produced e-journals? The glib answer is that there are none. But the real answer is that the costs are so low and the functions so integral to scholarship that they are easily absorbed into ongoing operational costs of universities. Even if they weren’t and scholars had to do it all on their own, server hosting solutions are so ubiquitous and cheap, free open source software is so functional and pervasive, and commercial PC software is so powerful and cheap (especially at academic discounts) that these minor costs would act as no real barrier to the production of scholar-produced e-journals.

Of course, this is not to say that there are not issues associated with the viability and sustainability of these journals, the perpetual preservation of their contents, and other difficulties, but these are topics for another day.

One-Page Open Access Resources Handout

Need a very short (one-page) handout that identifies a few key open access resources? My OA co-presenter (Sara Ranger) and I did, so we created one. It’s at:

http://www.escholarlypub.com/cwb/OAHandout.pdf

It’s available under a Creative Commons Attribution-NonCommercial License.

Obviously, a number of very valuable resources had to be omitted, but, hopefully, users can employ these core resources to discover them.

BMC’s Impact Factors: Elsevier’s Take and Reactions to It

A growing body of research suggests that open access may increase the impact of scholarly literature (see Steve Hitchcock’s "Effect of Open Access and Downloads ("Hits") on Citation Impact: A Bibliography of Studies"). Consequently, "impact factors" play an important part in the ongoing dialog about the desirability of the open access model.

On June 23, 2005, BioMed Central issued a press release entitled "Open Access Journals Get Impressive Impact Factors" that discussed the impact factors for their journals. You can consult the press release for the details, but the essence of it was expressed in this quote from Matthew Cockerill, Director of Operations at BioMed Central:

These latest impact factors show that BioMed Central’s Open Access journals have joined the mainstream of science publishing, and can compete with traditional journals on their own terms. The impact factors also demonstrate one of the key benefits that Open Access offers authors: high visibility and, as a result, a high rate of citation.

On July 8, 2005, Tony McSean, Director of Library Relations for Elsevier, sent an e-mail message to SPARC-OAForum@arl.org ("OA and Impressive Impact Factors—Non Propter Hoc") that presented Elsevier’s analysis of the BMC data, putting it "into context with those of the major subscription-based publishers." Again, I would encourage you to read this analysis. The gist of the argument is as follows:

This comparison with four major STM publishers demonstrates that BMC’s overall IF results are unremarkable, and that they certainly do not provide evidence to support the common assertion that the open access publishing model increases impact factor scores.

My reaction was as follows.

These interesting observations do not appear to account for one difference between BMC journals and the journals of other publishers: their age. Well-established, older journals are more likely to have attained the credibility required for high IFs than newer ones (if they ever will attain such credibility).

Moreover, there is another difference: BMC journals are primarily e-journals, not print journals with derivative electronic counterparts. Although true e-journals have gained significant ground, I suspect that they still start out with a steeper hill to climb credibility-wise than traditional print journals.

Third, since it involves paying a fee, the author-pays model requires a higher motivation on the part of the author to publish in such journals, likely leading to a smaller pool of potential authors. To obtain high journal IFs, these had better be good authors. And, for good authors to publish in such journals, they must hold them in high regard because they have other alternatives.

So, if this analysis is correct, for BMC journals to have attained "unremarkable" IFs is a notable accomplishment because they have attained parity with conventional journals that have some significant advantages.

Earlier in the day, Dr. David Goodman, Associate Professor of the Palmer School of Library and Information Science, commented (unbeknownst to me since I read the list in digest form):

1/ I doubt anyone is contending that at this point any of the BMC titles are better than the best titles from other publishers. The point is that they are at least as good as the average, and the best of them well above average. For a new publisher, that is a major accomplishment—and one that initially seemed rather doubtful. . . .

2/ Normally, publishing in a relative obscure and newly founded journal would come at some disadvantage to the author, regardless of how the journal was financed. . . .

3/ You can’t judge OA advantage from IF alone. IF refers to journals, OA advantage refers to individual articles. The most convincing studies on OA advantage are those with paired comparisons of articles, as Stevan Harnad has explained in detail.

4/ Most of the BMC titles, the ones beginning with the BMC journal of…, are OA completely. For the ones with Toll Access reviews etc., there is obviously much less availability of those portions than the OA primary research, so I doubt the usual review journal effect applies to the same extent as usual.

On July 9, 2005, Matt Cockerill sent a rebuttal to the SPARC-OAForum that said in part:

Firstly, the statistics you give are based on the set of journals that have ISI impact factors (in fact, they cover only journals which had 2003 Impact Factors). . . . Many of BioMed Central’s best journals are not yet tracked by ISI.

Secondly, comparing the percentage of Impact Factors going up or down does not seem a particularly meaningful metric. What is important, surely, is the actual value of the Impact Factor (relative to others in the field). In that regard, BioMed Central titles have done extremely well, and several are close to the top of their disciplines. . . .

Thirdly, you raise the point that review articles can boost a journal’s Impact Factor, and that many journals publish review articles specifically with the intention of improving their Impact Factor. This is certainly true, but of BioMed Central’s 130+ journals, all but six are online research journals, and publish virtually no review articles whatsoever. . . .

No reply yet from Elsevier, but, whether there is or not, I’m sure that we have not heard the last of the "impact factor" argument.

Stevan Harnad has made it clear that what he calls the "journal-affordability problem" is not the focus of open access (this is perhaps best expressed in Harnad et al.’s "The Access/Impact Problem and the Green and Gold Roads to Open Access"). The real issue is the "research article access/impact problem":

Merely to do the research and then put your findings in a desk drawer is no better than not doing the research at all. Researchers must submit their research to peer review and then "publish or perish," so others can use and apply their findings. But getting findings peer-reviewed and published is not enough either. Other researchers must find the findings useful, as proved by their actually using and citing them. And to be able to use and cite them, they must first be able to access them. That is the research article access/impact problem.

To see that the journal-affordability problem and the article access/impact problem are not the same one need only note that even if all 24,000 peer-reviewed research journals were sold to universities at cost (i.e., with not a penny of profit) it would still be true that almost no university has anywhere near enough money to afford all or even most of the 24,000 journals, even at minimal access-tolls (http://fisher.lib.virginia.edu/cgi-local/arlbin/arl.cgi?task=setuprank). Hence, it would remain true even then that not all would-be users could access all of the yearly 2.5 million articles, and hence that that potential research impact would continue to be lost.

So although the two problems are connected (lower journal prices would indeed generate somewhat more access), solving the journal-affordability problem does not solve the research access/impact problem.

Of course, there are different views of open access, but, for the moment, let’s say that this view is the prevailing one and that this is the most compelling argument to win the hearts and minds of scholars for open access. Open access will rise or fall based on its demonstrated ability to significantly boost impact factors, and the battle to prove or disprove this effect will be fierce indeed.