HTML Version of "What Is Open Access?"

An HTML version of my "What Is Open Access?" preprint is now available. This version includes additional links in the body of the document that make it easier to quickly access related information about OA concepts, documents, or systems. While it brings many footnote links (as well as new ones) into the body of the document, it does not attempt to replicate every footnote link there.

This paper presents a more nuanced, contemporary view of open access than my "Key Open Access Concepts" excerpt from the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals; however, it had to be very compact to meet the publisher’s needs, and it omits some topics discussed in the earlier document.

Those wanting a more in-depth recent treatment might want to try the first half of my "Open Access and Libraries" preprint, which covers much of this material more fully as a preliminary to discussing the relationship between open access and library functions and operations. However, the "What Is Open Access?" paper reflects some changes in my thinking about OA not found in "Open Access and Libraries."

A PDF version of "What Is Open Access?", which is more suitable for printing and reading offline, is also available.

"What Is Open Access?" will appear in: Jacobs, Neil, ed. Open Access: Key Strategic, Technical and Economic Aspects. Oxford: Chandos Publishing, 2006. It is under a Creative Commons Attribution-NonCommercial 2.5 License.

"What Is Open Access?" Preprint

A preprint of my book chapter "What Is Open Access?" is now available. This chapter provides a brief overview of open access (around 4,800 words). It examines the three base definitions of open access; notes other key OA statements; defines and discusses self-archiving, self-archiving strategies (author Websites, disciplinary archives, institutional-unit archives, and institutional repositories), and self-archiving copyright practices; and defines and discusses open access journals and the major types of OA publishers (born-OA publishers, conventional publishers, and non-traditional publishers). It will appear in: Jacobs, Neil, ed. Open Access: Key Strategic, Technical and Economic Aspects. Oxford: Chandos Publishing, 2006. It is under a Creative Commons Attribution-NonCommercial 2.5 License.

Open Access Bibliography Author and Title Indexes Are Now Available

Author and title indexes for the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals are now available.

These indexes, which include complete references, were initially generated in EndNote, then refined through a lengthy production process using several text editing programs to produce the final HTML files.

"Open Access and Libraries" Preprint

A preprint of my forthcoming book chapter "Open Access and Libraries" is now available.

The preprint takes an in-depth look at the open access movement with special attention to the perceived meaning of the term “open access” within it, the use of Creative Commons Licenses, and real-world access distinctions between different types of open access materials. After a brief consideration of some major general benefits of open access, it examines OA’s benefits for libraries and discusses a number of ways that libraries can potentially support the movement, with a consideration of funding issues.

It will appear in: Jacobs, Mark, ed. Electronic Resources Librarians: The Human Element of the Digital Information Age. Binghamton, NY: Haworth Press, 2006.

Postscript: A new preprint is available. I have added more content specific to the impact of OA on electronic resources librarians’ jobs and an appendix on the Creative Commons. Also, I have added another way that OA can save libraries money. I’ve changed the above link to the new preprint; the old one is still available; however, I would recommend reading the new one instead.

Post-Postscript: Having two versions of the preprint available has caused some confusion, so I have taken down the earlier version.

Open Access Bibliography and The Access Principle Discount at Amazon

Amazon is offering the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals and John Willinsky’s insightful The Access Principle: The Case for Open Access to Research and Scholarship together for a discounted price of $68.07 (vs. the normal $79.95). See the OAB Amazon record for the link. (Note: By my request, I do not profit from sales of the print version of the OAB; all proceeds go to ARL to subsidize the print version.)

The E-Print Deposit Conundrum

How can scholars be motivated to deposit e-prints in disciplinary archives, institutional repositories, and other digital archives?

In "A Key-Stroke Koan for Our Open-Access Times," Stevan Harnad says:

Researchers themselves have hinted at the resolution to this koan: Yes, they need and want OA. But there are many other demands on their time too, and they will only perform the requisite keystrokes if their employers and/or funders require them to do it, just as it is already their employers and funders who require them to do the keystrokes to publish (or perish) in the first place. It is employers and funders who set researchers’ priorities, because it is employers and funders who reward researchers’ performance. Today, about 15% of research is self-archived spontaneously but 95% of researchers sampled report that they would self-archive if required to do so by their employers and/or funders: 81% of them willingly, 14% reluctantly; only 5% would not comply with the requirement. And in the two objective tests to date of this self-reported prediction, both have fully confirmed it, with over 90% self-archiving in the two cases where it was made a requirement (Southampton-ECS and CERN).

This is a very cogent point, but, if the solution to the problem is to have scholars’ employers compel them to deposit e-prints, the next logical question is: how can university administrators and other key decision makers be convinced to mandate this activity?

In the UK, a debate is raging between OA advocates and publishers about the UK Research Funding Councils’ (RCUK) self-archiving proposal, which would "mandate the web self-archiving of authors’ final drafts of all journal articles resulting from RCUK-funded research." The fact that this national policy debate is occurring at all is an enormous advance for open access. If RCUK mandates e-print deposit, UK university administrators will need no convincing.

In the US, we are a long way from reaching that point, although the NIH’s voluntary e-print deposit policy provides some faint glimmer of hope that key government agencies can be moved to take some kind of action. However, the US does not have an equivalent to RCUK that can make dramatic e-print policy changes that affect research universities in one fell swoop. It does have government agencies, such as NSF, that control federal grant funds, private foundations that control their own grant funds, and thousands of universities and colleges that, in theory, could establish policies. This is a diffuse and varied audience for the OA message to reach and convince, and the message will need to be tailored to the audience to be effective.

While that plays out, we should not forget scholars themselves, however dimly we view the prospects of changing their behavior. University librarians and IT staff know their institutions’ scholars and can work with them one-on-one or in groups to gradually influence change. True, it’s "a journey of a thousand miles" approach, but the number of librarians and IT staff who will be effective on a national stage is small, while the number who may be incrementally effective at the local level is large. The efforts are complementary, not mutually exclusive.

I would urge you to read Nancy Fried Foster and Susan Gibbons’ excellent article "Understanding Faculty to Improve Content Recruitment for Institutional Repositories" for a good example of how an IR can be personalized so that faculty have a greater sense of connection to it and how IR staff can change the way they talk about the IR to better match scholars’ world view.

Here are a few brief final thoughts.

First, as is often said, scholars care about the impact of their work, and it is likely that, if scholars could easily see detailed use statistics for their works (e.g., number of requests and domain breakdowns), they might be more inclined to deposit items if those statistics exceed their expectations. So, the challenge here is to incorporate this capability into commonly used archiving software programs if it is absent.
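As a rough illustration of the kind of per-work statistics meant here, the sketch below tallies total requests and a top-level-domain breakdown from a list of requesting host names. The input format, the sample host names, and the function name are assumptions made up for this example; no particular archiving program's log format or API is implied.

```python
from collections import Counter

def usage_summary(requesting_hosts):
    """Return total requests and a count per top-level domain.

    `requesting_hosts` is a hypothetical list of resolved host names,
    one entry per request for a single deposited work.
    """
    # The text after the last dot is treated as the domain bucket.
    domains = Counter(host.rsplit(".", 1)[-1].lower() for host in requesting_hosts)
    return {"requests": len(requesting_hosts), "by_domain": dict(domains)}

# Made-up sample data for one e-print.
hosts = [
    "lib.example.edu",
    "proxy.example.ac.uk",
    "cache.example.com",
    "www.example.edu",
]
summary = usage_summary(hosts)
print(summary)
```

A real system would also need to filter out robots and repeated requests before showing such numbers to authors; that step is omitted here.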

Second, scholars are unlikely to stumble when entering bibliographic data about their works (although it might not be quite as fully descriptive as purists might like), but entering subject keywords is another matter. Sure, they know what the work is about, but are they using terms that others would use and that group their work with similar works in retrieval results? Yes, a controlled vocabulary would help, although such vocabularies have their own challenges. But I wonder if user-generated "tags," such as those used in Technorati, might be another approach. The trick here is to make the tags and the frequency of their use visible to both authors and searchers. For authors, this helps them put their works where they will be found. For searchers, it helps them find the works.
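To make the idea of visible tag frequencies a little more concrete, here is a minimal sketch that counts how many deposited works carry each tag and then suggests the most-used existing tags as an author types. The data structure, the sample tags, and both function names are hypothetical; this is not how Technorati or any archive actually implements tagging.

```python
from collections import Counter

def tag_frequencies(deposits):
    """Count how many deposited works carry each tag (case-insensitive)."""
    counts = Counter()
    for tags in deposits.values():
        # A set prevents one work from counting the same tag twice.
        counts.update({t.strip().lower() for t in tags})
    return counts

def suggest_tags(prefix, counts, limit=3):
    """Return the most-used existing tags that start with `prefix`."""
    prefix = prefix.lower()
    matches = [(tag, n) for tag, n in counts.items() if tag.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]

# Made-up deposits: work identifier -> tags entered by the author.
deposits = {
    "paper-1": ["open access", "self-archiving"],
    "paper-2": ["Open Access", "institutional repositories"],
    "paper-3": ["open archives", "open access"],
}
counts = tag_frequencies(deposits)
suggestions = suggest_tags("open", counts)
print(suggestions)
```

Showing the counts next to each suggestion lets an author see which term already groups the most works, and the same counts could be shown to searchers as a browse list.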

Third, it might be helpful if an author could fill out a bibliographic template for a work once and, with a single keystroke, submit it to multiple designated digital archives and repositories. So, for example, a library author might choose to submit a work to his or her institutional repository, DLIST, and E-LIS all at once. Of course, this would require a minimal level of standardization of template information between systems and the development of appropriate import capabilities. Some will say: "why bother?" True, OAI-PMH harvesting should, in theory, make duplicate deposit unnecessary given OAIster-like systems. But "lots of copies keep stuff safe," and users still take a single-archive searching approach in spite of OAI-PMH systems.
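The "minimal level of standardization" could be as simple as a shared set of field names. The sketch below fills out one template and prepares an identical metadata package for each designated archive, using unqualified Dublin Core element names as the common ground. The `WorkRecord` class, the function names, and the sample work are assumptions for illustration only; the actual transmission step is left out, since each archive would need its own import capability.

```python
from dataclasses import dataclass, field

@dataclass
class WorkRecord:
    """A hypothetical one-time bibliographic template for a work."""
    title: str
    authors: list
    year: int
    keywords: list = field(default_factory=list)

def to_dublin_core(record):
    """Map the template to unqualified Dublin Core element names."""
    return {
        "title": [record.title],
        "creator": list(record.authors),
        "date": [str(record.year)],
        "subject": list(record.keywords),
    }

def prepare_deposits(record, archives):
    """Build one metadata package per designated archive.

    Sending each package is deliberately omitted: it would depend on
    import interfaces that are not specified here.
    """
    package = to_dublin_core(record)
    return {name: dict(package) for name in archives}

# Made-up example work and the three destinations named in the text.
record = WorkRecord(
    title="An Example Paper",
    authors=["A. Librarian"],
    year=2005,
    keywords=["open access"],
)
deposits = prepare_deposits(record, ["Institutional repository", "DLIST", "E-LIS"])
for archive, package in deposits.items():
    print(archive, package)
```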

Searchable Version of the Open Access Webliography

Jim Pitman, Professor of Statistics and Mathematics at the University of California, Berkeley, has created a derivative work from the Open Access Webliography, which is under a Creative Commons Attribution-NonCommercial License.

This version of the OAW utilizes the BibServer software, and it is searchable. There are four views of the entries:

  • Bookmark: A link to the resource.
  • Plain text: A field-oriented ASCII presentation of the resource with active links in the description field.
  • Linked text: A field-oriented HTML presentation of the resource with complete active links.
  • Descriptions: The resource name and description with active links.

Entries can be sorted by category, description, title, and URL.

Thanks, Jim.

The Role of Reference Librarians in Institutional Repositories

Reference Services Review 33, no. 3 (2005) is a special issue on "the role of the reference librarian in the development, management, dissemination, and sustainability of institutional repositories (IRs)." It includes the following articles (the links are to e-prints):

Open Access Webliography

A preprint of the article "Open Access Webliography" by Adrian K. Ho and Charles W. Bailey, Jr. is now available. This annotated webliography presents a wide range of electronic resources related to the open access movement that were freely available on the Internet as of April 2005.

This article appears in "Reference Services Review" 33, no. 3 (2005), which is a special issue about "the role of the reference librarian in the development, management, dissemination, and sustainability of institutional repositories."

A preprint of my "The Role of Reference Librarians in Institutional Repositories" article in this issue is also available.

Both preprints are under the Creative Commons Attribution-NonCommercial License.

Below is a list of the topics covered in the webliography:

  • Starting Points
  • Bibliographies
  • Debates
  • Directories—E-Prints, Institutional Repositories, and
    Technical Reports
  • Directories—Open Access and Free Journals
  • Directories and Guides—Copyright and Licensing
  • Directories and Guides—Open Access Publishing
  • Directories and Guides—Software
  • Disciplinary Archives
  • E-Serials about Open Access
  • Free E-Serials That Frequently Publish Open Access
    Articles
  • General Information
  • Mailing Lists
  • Organizations
  • Projects
  • Publishers and Distributors
  • Search Engines
  • Special Programs for Developing Countries
  • Statements
  • Weblogs

The Economics of Free, Scholar-Produced E-Journals

While highly visible, large-scale STM open access publishing ventures such as BioMed Central loom large in the free e-journal scene, small-scale scholar-produced e-journals continue to quietly publish new scholarly articles as they have done for at least 18 years now.

I won’t detour into a lengthy history lesson for those readers who weren’t there. The short version of the story is that New Horizons in Adult Education is typically seen as the first scholarly e-journal published on the Internet (it was established in Fall 1987); however, it’s important to recognize that those were primitive times Internet-wise, when distributing ASCII article files via list servers and FTP servers was a cutting-edge venture. So, as you would imagine, finding tools were informal and few and far between. ARL’s publication of the Directory of Electronic Journals, Newsletters, and Academic Discussion Lists in July 1991 was a landmark event that made the invisible visible.

For some reason, there was a mini-surge of activity in the 1989-1991 period, with the emergence of the Bryn Mawr Classical Review, EJournal, Electronic Journal of Communication, Journal of the International Academy of Hospitality Research, Postmodern Culture, Psycoloquy, The Public-Access Computer Systems Review, Surfaces, and other journals. Several editors (myself, Stevan Harnad, and John Unsworth) rocked the house at the Association of Research Libraries’ 1992 Symposium on Scholarly Publishing on the Electronic Networks to the dismay of the assembled conventional publishers, who thought we were mad as hatters because we thought that: (a) e-journals were viable, (b) we could anoint ourselves as publishers, and (c) we were giving it away for free. My recollection is that, after the last speech, there was a stunned silence followed by a smattering of applause and a frenzy of generally hostile, astonished questions.

And, as they say, the rest is history. Peter Suber’s Timeline of the Open Access Movement is a good way to get a handle on subsequent events. Someday, I’ll write more about the early e-volution of e-journals.

So, onto the topic at hand. What are the economics of free, scholar-produced e-journals?

Let’s delimit the field a bit. We are not talking about journals produced by university presses or professional associations. Scholar-produced e-journals are generally labors of love, supported by a small group of scholars who serve without pay as editors, editorial board members, and journal production staff.

They often leverage existing technical infrastructure (e.g., Web servers) at the editors’ institutions. The volume of published papers is typically fairly modest, and the papers themselves are frequently not graphically complex. Editors or other volunteers manage the peer review process (usually via electronic means) as well as copy edit and format articles. HTML and PDF are the usual distribution formats, requiring HTML editors, Word, Acrobat, or similar low-cost or free programs. Increasingly, electronic journal management systems are used to automate editorial functions and simplify journal site creation and maintenance (a prime example is the free Open Journal Systems software). "Marketing" is often done by free electronic means: journal mailing lists, table of contents messages sent to targeted subject-related mailing lists, RSS alerts, etc. Since the content is free and electronic, there is no overhead for subscription/licensing management. Since no one gets paid, human resources functions are not needed. If authors retain copyright or content is under a Creative Commons or similar license, no permissions support is needed. Since existing facilities are used (at work or at home), there is no need to rent or purchase office space. Since no money is changing hands in any form, accounting support is unnecessary.

So, what are the economics of free, scholar-produced e-journals? The glib answer is that there are none. But the real answer is that the costs are so low and the functions so integral to scholarship that they are easily absorbed into ongoing operational costs of universities. Even if they weren’t and scholars had to do it all on their own, server hosting solutions are so ubiquitous and cheap, free open source software is so functional and pervasive, and commercial PC software is so powerful and cheap (especially at academic discounts) that these minor costs would act as no real barrier to the production of scholar-produced e-journals.

Of course, this is not to say that there are not issues associated with the viability and sustainability of these journals, the perpetual preservation of their contents, and other difficulties, but these are topics for another day.

One-Page Open Access Resources Handout

Need a very short (one-page) handout that identifies a few key open access resources? My OA co-presenter (Sara Ranger) and I did, so we created one. It’s at:

http://www.escholarlypub.com/cwb/OAHandout.pdf

It’s available under a Creative Commons Attribution-NonCommercial License.

Obviously, a number of very valuable resources had to be omitted, but, hopefully, users can employ these core resources to discover them.

BMC’s Impact Factors: Elsevier’s Take and Reactions to It

A growing body of research suggests that open access may increase the impact of scholarly literature (see Steve Hitchcock’s "Effect of Open Access and Downloads ("Hits") on Citation Impact: A Bibliography of Studies"). Consequently, "impact factors" play an important part in the ongoing dialog about the desirability of the open access model.

On June 23, 2005, BioMed Central issued a press release entitled "Open Access Journals Get Impressive Impact Factors" that discussed the impact factors for their journals. You can consult the press release for the details, but the essence of it was expressed in this quote from Matthew Cockerill, Director of Operations at BioMed Central:

These latest impact factors show that BioMed Central’s Open Access journals have joined the mainstream of science publishing, and can compete with traditional journals on their own terms. The impact factors also demonstrate one of the key benefits that Open Access offers authors: high visibility and, as a result, a high rate of citation.

On July 8, 2005, Tony McSean, Director of Library Relations for Elsevier, sent an e-mail message to SPARC-OAForum@arl.org ("OA and Impressive Impact Factors—Non Propter Hoc") that presented Elsevier’s analysis of the BMC data, putting it "into context with those of the major subscription-based publishers." Again, I would encourage you to read this analysis. The gist of the argument is as follows:

This comparison with four major STM publishers demonstrates that BMC’s overall IF results are unremarkable, and that they certainly do not provide evidence to support the common assertion that the open access publishing model increases impact factor scores.

My reaction was as follows.

These interesting observations do not appear to account for one difference between BMC journals and the journals of other publishers: their age. Well-established, older journals are more likely to have attained the credibility required for high IFs than newer ones (if they ever will attain such credibility).

Moreover, there is another difference: BMC journals are primarily e-journals, not print journals with derivative electronic counterparts. Although true e-journals have gained significant ground, I suspect that they still start out with a steeper hill to climb credibility-wise than traditional print journals.

Third, since it involves paying a fee, the author-pays model requires a higher motivation on the part of the author to publish in such journals, likely leading to a smaller pool of potential authors. To obtain high journal IFs, these had better be good authors. And, for good authors to publish in such journals, they must hold them in high regard because they have other alternatives.

So, if this analysis is correct, for BMC journals to have attained "unremarkable" IFs is a notable accomplishment because they have attained parity with conventional journals that have some significant advantages.

Earlier in the day, Dr. David Goodman, Associate Professor of the Palmer School of Library and Information Science, commented (unbeknownst to me since I read the list in digest form):

1/ I doubt anyone is contending that at this point any of the BMC titles are better than the best titles from other publishers. The point is that they are at least as good as the average, and the best of them well above average. For a new publisher, that is a major accomplishment—and one that initially seemed rather doubtful. . . .

2/ Normally, publishing in a relative obscure and newly founded journal would come at some disadvantage to the author, regardless of how the journal was financed. . . .

3/ You can’t judge OA advantage from IF alone. IF refers to journals, OA advantage refers to individual articles. The most convincing studies on OA advantage are those with paired comparisons of articles, as Stevan Harnad has explained in detail.

4/ Most of the BMC titles, the ones beginning with the BMC journal of…, are OA completely. For the ones with Toll Access reviews etc., there is obviously much less availability of those portions than the OA primary research, so I doubt the usual review journal effect applies to the same extent as usual.

On July 9, 2005, Matt Cockerill sent a rebuttal to the SPARC-OAForum that said in part:

Firstly, the statistics you give are based on the set of journals that have ISI impact factors (in fact, they cover only journals which had 2003 Impact Factors). . . . Many of BioMed Central’s best journals are not yet tracked by ISI.

Secondly, comparing the percentage of Impact Factors going up or down does not seem a particularly meaningful metric. What is important, surely, is the actual value of the Impact Factor (relative to others in the field). In that regard, BioMed Central titles have done extremely well, and several are close to the top of their disciplines. . . .

Thirdly, you raise the point that review articles can boost a journal’s Impact Factor, and that many journals publish review articles specifically with the intention of improving their Impact Factor. This is certainly true, but of BioMed Central’s 130+ journals, all but six are online research journals, and publish virtually no review articles whatsoever. . . .

No reply yet from Elsevier, but, whether there is or not, I’m sure that we have not heard the last of the "impact factor" argument.

Stevan Harnad has made it clear that what he calls the "journal-affordability problem" is not the focus of open access (this is perhaps best expressed in Harnad et al.’s "The Access/Impact Problem and the Green and Gold Roads to Open Access"). The real issue is the "research article access/impact problem":

Merely to do the research and then put your findings in a desk drawer is no better than not doing the research at all. Researchers must submit their research to peer review and then "publish or perish," so others can use and apply their findings. But getting findings peer-reviewed and published is not enough either. Other researchers must find the findings useful, as proved by their actually using and citing them. And to be able to use and cite them, they must first be able to access them. That is the research article access/impact problem.

To see that the journal-affordability problem and the article access/impact problem are not the same one need only note that even if all 24,000 peer-reviewed research journals were sold to universities at cost (i.e., with not a penny of profit) it would still be true that almost no university has anywhere near enough money to afford all or even most of the 24,000 journals, even at minimal access-tolls (http://fisher.lib.virginia.edu/cgi-local/arlbin/arl.cgi?task=setuprank). Hence, it would remain true even then that not all would-be users could access all of the yearly 2.5 million articles, and hence that that potential research impact would continue to be lost.

So although the two problems are connected (lower journal prices would indeed generate somewhat more access), solving the journal-affordability problem does not solve the research access/impact problem.

Of course, there are different views of open access, but, for the moment, let’s say that this view is the prevailing one and that this is the most compelling argument to win the hearts and minds of scholars for open access. Open access will rise or fall based on its demonstrated ability to significantly boost impact factors, and the battle to prove or disprove this effect will be fierce indeed.

Open Access News Update

From June 24, 2005 to June 30, 2005, Open Access News was down, and I posted Peter Suber’s e-mail updates here. OAN is now up, and Peter has updated it with the missing postings. My updates have been deleted from this posting.

Links to the OAN messages in question are below.

June 30 posting (2 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2063.html

June 30 posting (7 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2062.html

June 29 posting (1 item)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2061.html

June 29 posting (5 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2060.html

June 28 posting (4 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2059.html

June 28 posting (2 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2056.html

June 27 posting (2 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2055.html

June 27 posting (6 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2054.html

June 26 posting (5 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2053.html

June 25 posting (11 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2051.html

June 24 posting (2 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2048.html

June 24 posting (7 items)
https://mx2.arl.org/Lists/SPARC-OAForum/Message/2043.html

Key Open Access Concepts

An excerpt from the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals (OAB) that provides a brief overview of OA concepts is now available in HTML-tagged format. Additional links have been added, and old links checked and updated. As part of the OAB, it is under a Creative Commons Attribution-NonCommercial License.

http://www.escholarlypub.com/oab/keyoaconcepts.htm

Will You Only Harvest Some?

The Digital Library for Information Science and Technology has announced DL-Harvest, an OAI-PMH service provider that harvests and makes searchable metadata about information science materials from the following archives and repositories:

  • ALIA e-prints
  • arXiv
  • Caltech Library System Papers and Publications
  • DLIST
  • Documentation Research and Training Centre
  • DSpace at UNC SILS
  • E-LIS
  • Metadata of LIS Journals
  • OCLC Research Publications
  • OpenMED@NIC
  • WWW Conferences Archive

DL-Harvest is a much needed, innovative discipline-based search service. Big kudos to all involved.

DLIST also just announced the formation of an advisory board.

The following musings, inspired by the DL-Harvest announcement, are not intended to detract from the fine work that DLIST is doing or from the very welcome addition of DL-Harvest to their service offerings.

Discipline-focused metadata can be relatively easily harvested from OAI-PMH-compliant systems that are organized along disciplinary lines (e.g., the entire archive/repository is discipline-based or an organized subset is discipline-based). No doubt these are very rich, primary veins of discipline-specific information, but how about the smaller veins and nuggets that are hard to identify and harvest because they are in systems or subsets that focus on another discipline?

Here’s an example. An economist, who is not part of a research center or other group that might have its own archive, writes extensively about the economics of the scholarly publishing business. This individual’s papers end up in the economics department section of his or her institutional repository and in EconWPA. They are highly relevant to librarians and information scientists, but will their metadata records be harvested for use in services like DL-Harvest using OAI-PMH since they are in the wrong conceptual bins (e.g., set in the case of the IR)?
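The "wrong conceptual bin" problem can be sketched in code. An OAI-PMH harvester typically asks for records with the `ListRecords` verb and can narrow the request with the protocol's optional `set` argument; a discipline-based service that harvests only its own sets never sees the economist's papers. One crude workaround, assumed here purely for illustration, is to harvest the other set as well and keep only the records whose title or subject mentions terms of interest. The base URL, set name, sample records, and function names below are all hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

def is_relevant(record, terms):
    """Crude relevance test: does any term occur in the title or subject?"""
    text = " ".join(record.get("title", []) + record.get("subject", [])).lower()
    return any(term in text for term in terms)

# Hypothetical repository endpoint and set name.
url = list_records_url("https://repository.example.edu/oai", set_spec="economics")
print(url)

# Made-up records standing in for parsed oai_dc metadata from that set.
harvested = [
    {"title": ["The Economics of Scholarly Publishing"], "subject": ["economics"]},
    {"title": ["Labor Markets in Transition"], "subject": ["economics"]},
]
terms = ("scholarly publishing", "open access")
nuggets = [r for r in harvested if is_relevant(r, terms)]
print(nuggets)
```

Simple term matching like this will miss relevant works and admit irrelevant ones, which is exactly why the accuracy question matters.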

Coleman et al. point to one solution in their intriguing "Integration of Non-OAI Resources for Federated Searching in DLIST, an Eprints Repository" paper. But (lots of hand waving here), if using automatic metadata extraction was an easy and simple way to supplement conventional OAI-PMH harvesting, the bottom line question is: how good is good enough? In other words, what’s an acceptable level of accuracy for the automatic metadata extraction? (I won’t even bring up the dreaded "controlled vocabulary" notion.)
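One way to put a number on "good enough" is to compare automatically extracted metadata against a small hand-checked sample, field by field. The sketch below does only that; the field names, the sample records, and the exact-match rule are assumptions for illustration and are not taken from the Coleman et al. paper.

```python
def field_accuracy(extracted, reference, fields=("title", "creator", "date")):
    """Return, per field, the share of records where extraction matched.

    `extracted` and `reference` are parallel lists of dicts; a field counts
    as correct only if it matches exactly after trimming and lowercasing.
    """
    total = len(reference)
    if total == 0:
        return {f: 0.0 for f in fields}
    correct = {f: 0 for f in fields}
    for got, want in zip(extracted, reference):
        for f in fields:
            if got.get(f, "").strip().lower() == want.get(f, "").strip().lower():
                correct[f] += 1
    return {f: correct[f] / total for f in fields}

# Made-up hand-checked records and the corresponding extractor output.
reference = [
    {"title": "An Example Paper", "creator": "A. Librarian", "date": "2005"},
    {"title": "Another Example", "creator": "B. Scholar", "date": "2004"},
]
extracted = [
    {"title": "An Example Paper", "creator": "A. Librarian", "date": "2005"},
    {"title": "another example", "creator": "Scholar", "date": "2004"},
]
scores = field_accuracy(extracted, reference)
print(scores)
```

What threshold counts as acceptable for each field remains a judgment call for the service provider.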

No doubt this problem falls under the 80/20 Rule, and the 20 is most likely in the low-hanging fruit OAI-PMH-wise, but wouldn’t it be nice to have more fruit?

Joint Institutional Repository Evaluation Project

The Johns Hopkins University Digital Knowledge Center, in conjunction with MIT and the University of Virginia, is working on a Mellon Foundation-funded "A Technology Analysis of Repositories and Services" project to: "conduct an architecture and technology evaluation of repository software and services such as e-learning, e-publishing, and digital preservation. The result will be a set of best practices and recommendations that will inform the development of repositories, services, and appropriate interfaces."

The grant proposal and a presentation given at the CNI Spring 2005 Task Force Meeting provide further details about the project.

Is the Access Spectrum a Red Herring or Are Green and Gold Too Black and White?

Stevan Harnad has commented extensively on my "The Spectrum of E-Journal Access Policies: Open to Restricted Access" DigitalKoans posting. Thanks for doing so, Stevan. Here are my thoughts on your comments.

First, let me concede that, if you look at this question from Stevan’s particular open-access-centric point of view, the spectrum of publisher access policies is, of course, a complete and utter waste of time. I don’t recall suggesting that this was a new open access model per se, even though it includes open access in it as a component and it makes some further distinctions between open access and free access journals. Rather, it is what it says it is: a model that presents a range of publisher access policies from the least restrictive to the most restrictive. The color codes merely enhance the model slightly; they are not central to it (and, of course, as Stevan says, he created this color coding Frankenstein to begin with). The model says nothing about e-prints.

That said, Stevan’s view that open access equals free access (period) is not, as he well knows, universal, and his green and gold models are based on this premise.

Here is how Peter Suber defines OA in "Open Access Overview: Focusing on Open Access to Peer-Reviewed Research Articles and Their Preprints" (boldface is mine):

  • OA should be immediate, rather than delayed, and OA should apply to the full-text, not just to abstracts or summaries.
  • OA removes price barriers (subscriptions, licensing fees, pay-per-view fees) and permission barriers (most copyright and licensing restrictions).
  • There is some flexibility about which permission barriers to remove. For example, some OA providers permit commercial re-use and some do not. Some permit derivative works and some do not. But all of the major public definitions of OA agree that merely removing price barriers, or limiting permissible uses to "fair use" ("fair dealing" in the UK), is not enough.
  • Here’s how the Budapest Open Access Initiative put it: "There are many degrees and kinds of wider and easier access to this literature. By ‘open access’ to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited."
  • Here’s how the Bethesda and Berlin statements put it: "For a work to be OA, the copyright holder must consent in advance to let users ‘copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship….’"
  • The Budapest (February 2002), Bethesda (June 2003), and Berlin (October 2003) definitions of "open access" are the most central and influential for the OA movement. Sometimes I refer to them collectively, or to their common ground, as the BBB definition.

So, by most OA definitions, a journal that "makes all of its articles immediately and permanently accessible to all would-be users webwide toll-free" is not OA unless it also uses a Creative Commons or similar license that permits use with minimal restrictions. It is FA (Free Access). As I have said in an earlier dialog, we can count on no journal to be "permanently accessible" unless some trusted archive other than the publisher makes it so, a point on which Stevan apparently disagrees, believing that publishers never go out of business.

I note that Stevan has deviated from his "chrononomic parsimony" principle by having both "Green" and "Pale-Green" in his model and then lumping them together in his discussions as "GREEN." (In his Summary Statistics So Far site he also introduces the color Grey, for "neither yet.") If preprints and postprints are of equal value, why not just code them Green? If they are not of equal value (i.e., postprints that accurately incorporate the changes that occur during the peer-review process are the only real substitute for the published article), then, in reality, those 15.5% of "Pale-Green" journals are of limited value in terms of self-archiving, and the real GREEN journal number is 76.2%, not 92%.

I must admit to some confusion on his latest stand that all types of self-archiving are equal. In "Ten Years After," he seems to be expressing a different sentiment regarding author home pages:

That said, there was a naive element to the Subversive Proposal, too, since Harnad’s plan would have led to researchers posting their papers on thousands of isolated FTP sites. This would have meant that anyone wanting to access the papers would have needed prior knowledge of the papers’ existence and the whereabouts of every relevant archive. They would then have had to search each archive separately. Today, Harnad concedes that "anonymous FTP sites and arbitrary Web sites are more like common graves, insofar as searching is concerned."

Perhaps I misunderstand what is meant by "arbitrary Web sites."

As the prior DigitalKoans dialog beginning with "How Green Is My Publisher?" shows, we clearly disagree on many points related to the importance of author copyright agreements (e.g., they have to permit deposit in disciplinary archives), the importance of deposit in OAI-PMH-compliant archives, and the mission and scope of institutional repositories.

A series of DigitalKoans postings that start with "The View from the IR Trenches, Part 1" provides numerous quotes from the literature that bolster my case.

Second, while I admire Stevan’s unflagging advocacy of open access (by which he really means free access), open access is not the only issue in the e-journal publishing world that is of concern to librarians, to whom this missive was mainly addressed. This is because librarians, while hopefully working to build a better future, have to deal with the messy existing realities of the e-publishing environment to do their jobs and to make decisions about how to allocate scarce resources. Consequently, librarians have to scan the e-publishing environment, analyze it, categorize it, and make evaluative judgements about it. They have to make models of e-publishing reality to better understand it. They don’t have the luxury of only dreaming about what that reality should be.

Thus, while Stevan is indifferent to many of those 894,302 free full-text articles from 857 HighWire-hosted journals (a number which likely dwarfs all articles available from OA/free journals), librarians are not. Paying attention to them is important. While many are not immediately free, they are free nonetheless after some embargo period. And EA (Embargoed Access) journals are better than RA (Restricted Access) journals in practical terms for users who have no other current access. And even limited access to more restrictive PA (Partial Access) journals is likely to be welcomed by users who today would have no access otherwise. As a user, I know that I welcome both kinds of access.

This is not to say that we shouldn’t strive for journals to move up the spectrum from red to green, but it is to say that: (1) some free access is better than no free access for journals that will never move further up the spectrum, and (2) it may be that some journals have to move step-by-step, not in one leap, for the change to take place, and, if they start higher, it may be easier to encourage them to move further and faster. (But we have to know which ones have this potential based on their current status.)

Stevan’s model has colors, but, in reality, each color is black and white: Gold and nothing, GREEN and grey. All or nothing. And, as long as you accept his premises, it works, and it allows him to focus on his free-access goal with single-minded determination, undistracted by the knotted complexities of the e-scholarly publishing environment. Long may he run.

For those who have a different view of OA or who have broader concerns, it’s too "black and white."

I give him the last word on this matter.

The Spectrum of E-Journal Access Policies: Open to Restricted Access

As journal publishing continues to evolve, the access policies of publishers become more differentiated. The open access movement has been an important catalyst for change in this regard, prodding publishers to reexamine their access policies and, in some cases, to move towards new access models.

To fully understand where things stand with journal access policies, we need to clarify and name the policies in use. While the below list may not be comprehensive, it attempts to provide a first-cut model for key journal access policies, adopting the now popular use of colors as a second form of shorthand for identifying the policy types.

  1. Open Access journals (OA journals, color code: green): These journals provide free access to all articles and utilize a form of licensing that puts minimal restrictions on the use of articles, such as the Creative Commons Attribution License. Example: Biomedical Digital Libraries.
  2. Free Access journals (FA journals, color code: cyan): These journals provide free access to all articles and utilize a variety of copyright statements (e.g., the journal copyright statement may grant liberal educational copying provisions), but they do not use a Creative Commons Attribution License or similar license. Example: The Public-Access Computer Systems Review.
  3. Embargoed Access journals (EA journals, color code: yellow): These journals provide free access to all articles after a specified embargo period and typically utilize conventional copyright statements. Example: Learned Publishing.
  4. Partial Access journals (PA journals, color code: orange): These journals provide free access to selected articles and typically utilize conventional copyright statements. Example: College & Research Libraries.
  5. Restricted Access journals (RA journals, color code: red): These journals provide no free access to articles and typically utilize conventional copyright statements. Example: Library Administration and Management. (Available in electronic form from Library Literature & Information Science Full Text and other databases.)
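The decision rules implied by the list above can be sketched in a few lines of Python. This is only an illustration of the taxonomy, not a tool used to produce it: the names AccessPolicy, Journal, and classify, and the free_articles values ("all", "after_embargo", "selected", "none"), are assumptions made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class AccessPolicy(Enum):
    """The five policy types from the list above, with their color codes."""
    OA = ("Open Access", "green")
    FA = ("Free Access", "cyan")
    EA = ("Embargoed Access", "yellow")
    PA = ("Partial Access", "orange")
    RA = ("Restricted Access", "red")

    def __init__(self, label, color):
        self.label = label
        self.color = color


@dataclass
class Journal:
    """Hypothetical record of a journal's access policy."""
    title: str
    # Which articles are free: "all", "after_embargo", "selected", or "none".
    free_articles: str
    # True if the journal uses a Creative Commons Attribution License
    # or a similar license with minimal use restrictions.
    open_license: bool = False


def classify(journal: Journal) -> AccessPolicy:
    """Map a journal onto the spectrum, least to most restrictive."""
    if journal.free_articles == "all":
        # Free to read is not enough for OA; the license decides OA vs. FA.
        return AccessPolicy.OA if journal.open_license else AccessPolicy.FA
    if journal.free_articles == "after_embargo":
        return AccessPolicy.EA
    if journal.free_articles == "selected":
        return AccessPolicy.PA
    if journal.free_articles == "none":
        return AccessPolicy.RA
    raise ValueError(f"unknown free_articles value: {journal.free_articles!r}")


# The example journals named in the list above.
examples = [
    Journal("Biomedical Digital Libraries", "all", open_license=True),
    Journal("The Public-Access Computer Systems Review", "all"),
    Journal("Learned Publishing", "after_embargo"),
    Journal("College & Research Libraries", "selected"),
    Journal("Library Administration and Management", "none"),
]

for journal in examples:
    policy = classify(journal)
    print(f"{journal.title}: {policy.label} ({policy.color})")
```

Note that the only thing separating green from cyan in this sketch is the open_license flag, which is exactly the OA versus FA distinction the model is meant to highlight.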

Using this taxonomy, an examination of the contents of the Directory of Open Access Journals quickly reveals that, in reality, it is the Directory of Open and Free Access Journals, because many listed journals do not use a Creative Commons Attribution License or similar license.

Some may argue that the distinction between OA and FA journals is meaningless; however, to do so suggests that the below sections of the "Budapest Open Access Initiative" in italics are meaningless and, consequently, that the Open Access movement is really just the Free Access movement.

By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

Not that there would be anything wrong with the Free Access movement, but some may feel that the broader scope of the Open Access movement is much more desirable.

In any case, the journal universe is not just green or red, and it’s a pity that we don’t know the breakdown of the spectrum (e.g., x number of green journals and y number of cyan journals), for that would give us a better handle on how the world has changed from the days when all journals were red journals.

Institutional Repository Overviews: A Brief Bibliography

You want a good introduction to institutional repositories. What should you read? Try one or more of the works below. For a quick overview, try Drake, Johnson, or Lynch. For more detail, try Crow or Ware. For an in-depth, library-oriented overview, Gibbons can’t be beat.

Crow, Raym. The Case for Institutional Repositories: A SPARC Position Paper. Washington, DC: The Scholarly Publishing and Academic Resources Coalition, 2002.

Drake, Miriam A. "Institutional Repositories: Hidden Treasures." Searcher 12, no. 5 (2004): 41-45.

Gibbons, Susan. "Establishing an Institutional Repository." Library Technology Reports 40, no. 4 (2004). (Available on Academic Search Premier.)

Johnson, Richard K. "Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication." D-Lib Magazine 8 (November 2002).

Lynch, Clifford A. "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age." ARL: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC, no. 226 (2003): 1-7.

Ware, Mark. Pathfinder Research on Web-based Repositories. London: Publisher and Library/Learning Solutions, 2004.

The View from the IR Trenches, Part 4

Today, we’ll look at an article that describes the results of a one-year study at the University of Rochester, River Campus Libraries to "understand the current work practices of faculty in different disciplines in order to see how an IR might naturally support existing ways of work."

Foster, Nancy Fried, and Susan Gibbons. "Understanding Faculty to Improve Content Recruitment for Institutional Repositories." D-Lib Magazine 11, no. 1 (2005).
http://www.dlib.org/dlib/january05/foster/01foster.html

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Needs

The people we interviewed want most to be able to. . .

  • Work with co-authors
  • Keep track of different versions of the same document
  • Work from different computers and locations, both Mac and PC
  • Make their own work available to others
  • Have easy access to other people’s work
  • Keep up in their fields
  • Organize their materials according to their own scheme
  • Control ownership, security, and access
  • Ensure that documents are persistently viewable or usable
  • Have someone else take responsibility for servers and digital tools
  • Be sure not to violate copyright issues
  • Keep everything related to computers easy and flawless
  • Reduce chaos or at least not add to it
  • Not be any busier

Using Standard IR Terminology Doesn’t Work

Accordingly, when we tried to recruit content using typical IR promotional language, faculty members and researchers did not respond enthusiastically. This is because they did not perceive the relevance of almost any of the IR features as stated in the terms used by librarians, archivists, computer programmers, and others who were setting up and running the IR for the institution. One reason faculty have not rushed to put their work into IRs, therefore, is that they do not recognize its benefits to them in their own terms.

Another reason that faculty have expressed little interest in IRs is related to the way the IR is named and organized. The term ‘institutional repository’ implies that the system is designed to support and achieve the needs and goals of the institution, not necessarily those of the individual. Moreover, it suggests that contributions of materials into the repository serve to highlight the achievements of the institution, rather than those of individual researchers and authors. . . .

Faculty Are Most Interested in Communicating with Colleagues Worldwide

When it comes to research, a faculty member’s strongest ties are usually with a small circle of colleagues from around the world who share an interest in the same field of research, such as plasma astrophysics or contemporary European critical thought. It is with these colleagues, many of them at other institutions, that researchers most want to communicate and share their work. But most organizations have mapped their IR communities to their academic departments rather than to the subtle, shifting communities of scholars engaged in interrelated research projects. . . . In the absence of a strong connection that would naturally bring these documents together into a collection that other scholars would look for, find, and use, there is no compelling reason for the authors to make the submission.

One-on-One Librarian-Faculty Sessions Are Best Way to Interest Faculty

Rather than approach faculty with a set, one-size-fits-all promotional spiel, these library liaisons operate under the guidance that a personalized, tailored approach works best. As we learned from the work-practice study, what faculty members care most about is their research. . . . Throughout the conversation, the library liaison is listening for opportunities to demonstrate how the benefits of the IR respond directly to the faculty member’s web-related research needs. . . .

IR Benefits Must Be Stated in Terms That Faculty Relate To

By contrast to the language previously used to describe the features and benefits of the IR, we are now describing the IR in language drawn from faculty interviews. Thus, we tell faculty that the IR will enable them to. . .

  • Make their own work easily accessible to others on the web through Google searches and searches within the IR itself
  • Preserve digital items far into the future, safe from loss or damage
  • Give out links to their work so that they do not have to spend time finding files and sending them out as email attachments
  • Maintain ownership of their own work and control who sees it
  • Not have to maintain a server
  • Not have to do anything complicated

Scholarly Communication Web Sites at ARL Libraries

The Association of Research Libraries (ARL) currently has 123 member libraries in the US and Canada. Below is a list of scholarly communication web sites at ARL libraries. This list was compiled by a quick examination of ARL libraries’ home pages, supplemented by some Google searching. It’s not comprehensive, and I would welcome additions.

More on OhioLINK’s Digital Resource Commons

David F. Kohl has self-archived a PowerPoint presentation about the DRC at E-LIS. It’s called "Cooperating Beyond the ‘Buying Club’: Digital Resource Commons (DRC): Making the Impossible Possible in Ohio."

To quote from the abstract:

Each institution can ‘brand’ itself in the system and may host a discrete and customized interface to all of its content. To the end user it will appear as an institutional resource as if it were hosted on your own servers. There will also be a collective OhioLINK level branding and ability for searches to retrieve across the institutional collections. . . . You will have complete control of your own content and how it is accessed. Multi-tiered security levels will allow your content to be shared only to the extent desired. . . .

Alternatively content can be restricted to an individual department, to an institution, or to the OhioLINK membership. Each institution can set its own policies governing the content in its repositories. Likewise custom workflows can be established to make the most of the personnel involved in each project and expedite the content creation and capture process. The service will include robust and flexible cataloging tools to aid in the creation of records that can be searched and browsed effectively by all types of users. Catalog records can be exported in international standard XML formats such as the Open Archives Initiative Protocol for Metadata Harvesting. Through OhioLINK’s unique collaboration with the Ohio Supercomputer Center your content is stored on enterprise class servers and storage networks. . . . A huge storage area network allows virtually unlimited storage space on our disks. . . . Programming or system administration skills and experience are not required. The system is flexible and adaptable and provides services superior to ‘DSpace’ and ‘ContentDM’ without the associated costs.

OhioLINK’s Digital Resource Commons

Peter Murray, Assistant Director of Multimedia Systems at OhioLINK, recently posted a job announcement on LITA-L (I’d link, but given the way ALA safeguards access to its lists, it’s simply impossible) that brought to my attention a bold OhioLINK project called the Digital Resource Commons, which is part of an even bolder project called the Ohio Digital Commons for Education. The quote from the job ad below describes the Digital Resource Commons. An earlier part of the ad indicates that Fedora will be used as the DRC’s platform.

OhioLINK’s Digital Resource Commons (DRC) is an Ohio Board of Regents-funded project to create a federated repository service that ingests, preserves, presents, and mediates administration of the educational and research materials of participating institutions. With the capability to store and deliver a virtually unlimited variety of digital file types and formats (including text, data sets, image, audio, video, streaming video, multimedia presentations, animations, etc.) the DRC is positioned to capture digital content from student and faculty researchers as it is produced and return it to users of the DRC upon request. The DRC offers wide and flexible control to member institutions and the communities within institution to define how content is added, preserved, and displayed to repository users. With federated community administration features, lead contacts at member institutions can create communities and delegate up to a complete subset of their privileges within the system to the editors/moderators of those new communities. The ability to scope and brand content to a particular community and institution is offered while retaining the ability to search for content across the entire repository. As both an Open Archives Initiative Data Provider and Service Provider, the DRC is positioned to become the premier point for the discovery of knowledge by and about Ohio’s scholars. In conjunction with the other parts of the Ohio Board of Regents grant funding, the DRC is one piece of a larger effort to build the Ohio Digital Commons for Education—a powerful vision for the future of learning and research in the state of Ohio.

The quote below from the DRC Web site describes the Ohio Digital Commons for Education.

The Digital Resource Commons is one of three projects funded by an Ohio Board of Regents Technology Initiatives grant collectively called the Ohio Digital Commons for Education (ODCE). The three components—this resource repository, the state-wide licensing and development of course management systems (WebCT and Blackboard), and a common access control mechanism (Shibboleth)—combine to offer a powerful vision for learning and research for the state of Ohio.

Impressive. As Daniel Hudson Burnham said: "Make no little plans; they have no magic to stir men’s blood and probably themselves will not be realized."

The View from the IR Trenches, Part 3

Today, we’ll look at an article that provides a UK academic library’s view of its institutional repository responsibilities:

Nixon, William J. "The Evolution of an Institutional E-Prints Archive at the University Of Glasgow." Ariadne, no. 32 (2002).
http://www.ariadne.ac.uk/issue32/eprint-archives/

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Library IR Roles

(The below quotes are from a summary list of library roles in the article.)

IR Advocate

Encouraging members of the University to deposit material into the ePrints archives. At Glasgow we have started an Advocacy campaign to demonstrating that this service has a broader context beyond Glasgow . . . A recent event to raise awareness about the issues of Scholarly Communication provided us with an opportunity to launch our e-prints service and to raise its profile

Copyright Advisory Service

Providing advice to members of the University about copyright and journal embargo policies for material which they would like to deposit in our archive, and as appropriate liaising directly with the Journal in question. This will become a pivotal role in the acceptance of our e-prints service since copyright is the number one question which members of the University ask about

Digitization Service

Converting material to a suitable format such as HTML or PDF for import into the archive. It may also be necessary to ensure that HTML which is submitted is properly formatted and cross-browser compatible

Deposit Service

Depositing material directly on behalf of members of the University who do not, or cannot self-archive their material. In instances in which we have deposited papers on behalf of individuals, we have created a new account for them and used that to submit their content. . . .

Metadata Review and Creation Service

Reviewing the metadata of content which has been self-archived to maintain the quality of the record and to add any additional subject headings and keywords as appropriate.

The View from the IR Trenches, Part 2

Today, we’ll look at an article about the challenges involved in populating an institutional repository:

Mackie, Morag. "Filling Institutional Repositories: Practical Strategies from the DAEDALUS Project." Ariadne, no. 39 (2004).
http://www.ariadne.ac.uk/issue39/mackie/

The DAEDALUS Project is at the University of Glasgow. This article is an especially interesting case study, and it details a number of useful, imaginative strategies for populating an IR.

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Do Not Want to Deposit Works Themselves

Despite a generally encouraging response, this did not translate into real content being deposited in the repository. . . . We found that it was difficult to get staff to give or send us electronic copies of their papers, even when they had promised to do so. This was our first indication that while staff may be sympathetic many of them do not have the time or the inclination to contribute. They were happy to give us permission to do the work on their behalf, but could not commit to doing the work themselves. Clearly the advantages of institutional repositories were not yet sufficiently convincing to academics to persuade them to play an active part in the process.

Determining Which Articles Can be Legally Deposited Is Difficult and Time Consuming

[T]he majority of academics we contacted were happy for us to establish which of their publications could be added to the repository.

While an extremely useful resource and one that is growing all the time, the [SHERPA] list does not cover all publishers. . . . it has been necessary to track down policies from publishers’ Web sites, or to contact publishers directly where these do not exist or where they do not address the issue of whether an author is permitted to make his or her paper available in a repository. No two publisher policies are exactly the same, and many do not explicitly state what rights authors have in relation to repositories. . . . Interpreting publisher copyright policies is also a difficult area, particularly as there is no real precedent and no case law.

Where copyright policies did not exist or where they were unclear, we contacted the publishers directly and asked for permission. . . . Although some publishers reply quickly, others may take some weeks and some do not reply at all. We found that publishers were more likely to give permission for specific papers to be added than to outline their general policy on the issue. Consequently permissions for most articles have to be established on a case-by-case basis.

It Is Challenging to Identify Possible Depositors Using Open Access Journals

It would be useful to be able to identify additional content in other open access journals, but so far we have not found an easy way of doing this. The Directory of Open Access Journals. . . is very useful, but it does not enable searching by institution or author affiliation.

For IRs to Be Filled, Deposit May Need to be Mandated

Although we have succeeded in adding a reasonable amount of content to the repository we have also been offered significant amounts of content that cannot be added because of restrictive publisher copyright agreements. . . . This is a clear demonstration that major changes need to take place at a high level in order for repositories to be successful. Although some academics have taken the decision to try and avoid publishing in the journals of publishers with restrictive policies, this is still relatively rare. We can inform staff about the issues, but we cannot and should not dictate in which journals they publish. Change is only likely to happen if staff are required, either by the funding councils or by their institution, to make their publications available either by publishing in open access journals or in journals that permit deposit in a repository.