The E-Print Deposit Conundrum

How can scholars be motivated to deposit e-prints in disciplinary archives, institutional repositories, and other digital archives?

In "A Key-Stroke Koan for Our Open-Access Times," Stevan Harnad says:

Researchers themselves have hinted at the resolution to this koan: Yes, they need and want OA. But there are many other demands on their time too, and they will only perform the requisite keystrokes if their employers and/or funders require them to do it, just as it is already their employers and funders who require them to do the keystrokes to publish (or perish) in the first place. It is employers and funders who set researchers’ priorities, because it is employers and funders who reward researchers’ performance. Today, about 15% of research is self-archived spontaneously but 95% of researchers sampled report that they would self-archive if required to do so by their employers and/or funders: 81% of them willingly, 14% reluctantly; only 5% would not comply with the requirement. And in the two objective tests to date of this self-reported prediction, both have fully confirmed it, with over 90% self-archiving in the two cases where it was made a requirement (Southampton-ECS and CERN).

This is a very cogent point, but, if the solution to the problem is to have scholars’ employers compel them to deposit e-prints, the next logical question is: how can university administrators and other key decision makers be convinced to mandate this activity?

In the UK, a debate is raging between OA advocates and publishers about the UK Research Funding Councils’ (RCUK) self-archiving proposal, which would "mandate the web self-archiving of authors’ final drafts of all journal articles resulting from RCUK-funded research." The fact that this national policy debate is occurring at all is an enormous advance for open access. If RCUK mandates e-print deposit, UK university administrators will need no convincing.

In the US, we are a long way from reaching that point, although the NIH’s voluntary e-print deposit policy provides some faint glimmer of hope that key government agencies can be moved to take some kind of action. However, the US does not have an equivalent to RCUK that can make dramatic e-print policy changes that affect research universities in one fell swoop. It does have government agencies, such as NSF, that control federal grant funds, private foundations that control their own grant funds, and thousands of universities and colleges that, in theory, could establish policies. This is a diffuse and varied audience for the OA message to reach and convince, and the message will need to be tailored to the audience to be effective.

While that plays out, we should not forget scholars themselves, however dim we judge the prospects of changing their behavior to be. University librarians and IT staff know their institutions’ scholars and can work with them one-on-one or in groups to gradually influence change. True, it’s a "journey of a thousand miles" approach, but the number of librarians and IT staff who will be effective on a national stage is small, while the number who may be incrementally effective on the local level is large. The efforts are complementary, not mutually exclusive.

I would urge you to read Nancy Fried Foster and Susan Gibbons’ excellent article "Understanding Faculty to Improve Content Recruitment for Institutional Repositories" for a good example of how an IR can be personalized so that faculty have a greater sense of connection to it and how IR staff can change the way they talk about the IR to better match scholars’ world view.

Here are a few brief final thoughts.

First, as is often said, scholars care about the impact of their work. If they could easily see detailed use statistics for their works (e.g., number of requests and domain breakdowns), they might be more inclined to deposit items if those statistics exceed their expectations. So, the challenge here is to incorporate this capability into commonly used archiving software programs if it is absent.

Second, scholars are unlikely to stumble when entering bibliographic data about their works (although it might not be quite as fully descriptive as purists might like), but entering subject keywords is another matter. Sure, they know what the work is about, but are they using terms that others would use and that group their work with similar works in retrieval results? Yes, a controlled vocabulary would help, although such vocabularies have their own challenges. But I wonder if user-generated "tags," such as those used in Technorati, might be another approach. The trick here is to make the tags and the frequency of their use visible to both authors and searchers. For authors, this helps them put their works where they will be found. For searchers, it helps them find the works.
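
To make the tag idea a bit more concrete, here is a minimal Python sketch of how an archive might count how often each tag is used so that the counts can be shown to authors and searchers. The deposit titles, the tags, and the tag_frequencies function are all hypothetical, and folding tags to lowercase is my assumption about how near-duplicate tags might be merged; real archiving software may handle this quite differently.

```python
from collections import Counter

# Hypothetical deposits: each work title maps to the tags its author chose.
deposits = {
    "Paper A": ["open access", "self-archiving", "e-prints"],
    "Paper B": ["open access", "institutional repositories"],
    "Paper C": ["Open Access", "e-prints"],
}


def tag_frequencies(deposits):
    """Count how many works carry each tag, ignoring case and stray spaces."""
    counts = Counter()
    for tags in deposits.values():
        # Using a set means a tag repeated on one work is counted only once.
        counts.update({tag.strip().lower() for tag in tags})
    return counts


freqs = tag_frequencies(deposits)

# Show the most heavily used tags first, as a tag list or "cloud" might.
for tag, count in freqs.most_common():
    print(f"{tag}: {count}")
```

An author choosing keywords could be shown these counts at deposit time, and a searcher could be shown the same counts as a browsing aid.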

Third, it might be helpful if an author could fill out a bibliographic template for a work once and, with a single keystroke, submit it to multiple designated digital archives and repositories. So, for example, a library author might choose to submit a work to his or her institutional repository, DLIST, and E-LIS all at once. Of course, this would require a minimal level of standardization of template information between systems and the development of appropriate import capabilities. Some will say: "why bother?" True, OAI-PMH harvesting should, in theory, make duplicate deposit unnecessary given OAIster-like systems. But "lots of copies keep stuff safe," and users still take a single-archive searching approach in spite of OAI-PMH systems.
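
As a rough illustration of the "fill out once, submit to many" idea, the Python sketch below maps one bibliographic template onto the field names that each target archive might expect. Everything here is an assumption made for illustration: the WorkTemplate fields, the FIELD_MAPS entries, and the prepare_submissions function are hypothetical, and the actual import formats of an institutional repository, DLIST, or E-LIS are not described in this posting. The sketch only prepares the records; it does not transmit anything.

```python
from dataclasses import dataclass, asdict, field


@dataclass
class WorkTemplate:
    """Bibliographic template that an author fills out once."""
    title: str
    authors: list
    year: int
    keywords: list = field(default_factory=list)


# Hypothetical field names for each archive; real systems define their own.
FIELD_MAPS = {
    "Institutional Repository": {
        "title": "title", "authors": "creator",
        "year": "date", "keywords": "subject",
    },
    "DLIST": {
        "title": "title", "authors": "creators",
        "year": "year", "keywords": "keywords",
    },
    "E-LIS": {
        "title": "title", "authors": "authors",
        "year": "date_issued", "keywords": "keywords",
    },
}


def prepare_submissions(work, archives):
    """Translate one template into a record for each designated archive."""
    record = asdict(work)
    submissions = {}
    for archive in archives:
        mapping = FIELD_MAPS[archive]
        submissions[archive] = {
            mapping[name]: value for name, value in record.items()
        }
    return submissions


work = WorkTemplate(
    title="An Example Paper",
    authors=["A. Author"],
    year=2005,
    keywords=["open access"],
)
submissions = prepare_submissions(
    work, ["Institutional Repository", "DLIST", "E-LIS"]
)
for archive, payload in submissions.items():
    print(archive, payload)
```

The mapping table is where the "minimal level of standardization" would live: the fewer differences between archives, the smaller that table becomes.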

The Role of Reference Librarians in Institutional Repositories

Reference Services Review 33, no. 3 (2005) is a special issue on "the role of the reference librarian in the development, management, dissemination, and sustainability of institutional repositories (IRs)." It includes the following articles (the links are to e-prints):

Will You Only Harvest Some?

The Digital Library for Information Science and Technology has announced DL-Harvest, an OAI-PMH service provider that harvests and makes searchable metadata about information science materials from the following archives and repositories:

  • ALIA e-prints
  • arXiv
  • Caltech Library System Papers and Publications
  • DLIST
  • Documentation Research and Training Centre
  • DSpace at UNC SILS
  • E-LIS
  • Metadata of LIS Journals
  • OCLC Research Publications
  • OpenMED@NIC
  • WWW Conferences Archive

DL-Harvest is a much needed, innovative discipline-based search service. Big kudos to all involved.

DLIST also just announced the formation of an advisory board.

The following musings, inspired by the DL-Harvest announcement, are not intended to detract from the fine work that DLIST is doing or from the very welcome addition of DL-Harvest to their service offerings.

Discipline-focused metadata can be relatively easily harvested from OAI-PMH-compliant systems that are organized along disciplinary lines (e.g., the entire archive/repository is discipline-based or an organized subset is discipline-based). No doubt these are very rich, primary veins of discipline-specific information, but how about the smaller veins and nuggets that are hard to identify and harvest because they are in systems or subsets that focus on another discipline?

Here’s an example. An economist, who is not part of a research center or other group that might have its own archive, writes extensively about the economics of the scholarly publishing business. This individual’s papers end up in the economics department section of his or her institutional repository and in EconWPA. They are highly relevant to librarians and information scientists, but will their metadata records be harvested for use in services like DL-Harvest using OAI-PMH since they are in the wrong conceptual bins (e.g., set in the case of the IR)?
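
To show where the "wrong bin" problem comes from, here is a small Python sketch of how an OAI-PMH selective harvesting request is formed. The ListRecords verb and the metadataPrefix and set arguments are part of the OAI-PMH protocol; the repository address, the "lis" set name, and the list_records_url function are hypothetical, and I am not claiming this is how DL-Harvest itself is configured.

```python
from urllib.parse import urlencode

# Hypothetical repository address, used only for illustration.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Selective harvesting: only records in this set are returned.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# A discipline-based harvester would likely ask only for a relevant set
# (the set name "lis" is hypothetical).
print(list_records_url(BASE_URL, set_spec="lis"))

# Harvesting everything is the alternative, but relevant records then
# have to be picked out some other way.
print(list_records_url(BASE_URL))
```

The economist's papers, filed in an economics set, would never be returned by the first request, and the second request returns them mixed in with everything else the repository holds.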

Coleman et al. point to one solution in their intriguing "Integration of Non-OAI Resources for Federated Searching in DLIST, an Eprints Repository" paper. But (lots of hand waving here), if using automatic metadata extraction was an easy and simple way to supplement conventional OAI-PMH harvesting, the bottom line question is: how good is good enough? In other words, what’s an acceptable level of accuracy for the automatic metadata extraction? (I won’t even bring up the dreaded "controlled vocabulary" notion.)

No doubt this problem falls under the 80/20 Rule, and the 20 is most likely in the low hanging fruit OAI-PMH-wise, but wouldn’t it be nice to have more fruit?

Joint Institutional Repository Evaluation Project

The Johns Hopkins University Digital Knowledge Center, in conjunction with MIT and the University of Virginia, is working on a Mellon Foundation-funded "A Technology Analysis of Repositories and Services" project to: "conduct an architecture and technology evaluation of repository software and services such as e-learning, e-publishing, and digital preservation. The result will be a set of best practices and recommendations that will inform the development of repositories, services, and appropriate interfaces."

The grant proposal and a presentation given at the CNI Spring 2005 Task Force Meeting provide further details about the project.

Institutional Repository Overviews: A Brief Bibliography

You want a good introduction to institutional repositories. What should you read? Try one or more of the works below. For a quick overview, try Drake, Johnson, or Lynch. For more detail, try Crow or Ware. For an in-depth, library-oriented overview, Gibbons can’t be beat.

Crow, Raym. The Case for Institutional Repositories: A SPARC Position Paper. Washington, DC: The Scholarly Publishing and Academic Resources Coalition, 2002.

Drake, Miriam A. "Institutional Repositories: Hidden Treasures." Searcher 12, no. 5 (2004): 41-45.

Gibbons, Susan. "Establishing an Institutional Repository." Library Technology Reports 40, no. 4 (2004). (Available on Academic Search Premier.)

Johnson, Richard K. "Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication." D-Lib Magazine 8 (November 2002).

Lynch, Clifford A. "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age." ARL: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC, no. 226 (2003): 1-7.

Ware, Mark. Pathfinder Research on Web-based Repositories. London: Publisher and Library/Learning Solutions, 2004.

The View from the IR Trenches, Part 4

Today, we’ll look at an article that describes the results of a one-year study at the University of Rochester, River Campus Libraries to "understand the current work practices of faculty in different disciplines in order to see how an IR might naturally support existing ways of work."

Foster, Nancy Fried, and Susan Gibbons. "Understanding Faculty to Improve Content Recruitment for Institutional Repositories." D-Lib Magazine 11, no. 1 (2005).
http://www.dlib.org/dlib/january05/foster/01foster.html

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Needs

The people we interviewed want most to be able to. . .

  • Work with co-authors
  • Keep track of different versions of the same document
  • Work from different computers and locations, both Mac and PC
  • Make their own work available to others
  • Have easy access to other people’s work
  • Keep up in their fields
  • Organize their materials according to their own scheme
  • Control ownership, security, and access
  • Ensure that documents are persistently viewable or usable
  • Have someone else take responsibility for servers and digital tools
  • Be sure not to violate copyright issues
  • Keep everything related to computers easy and flawless
  • Reduce chaos or at least not add to it
  • Not be any busier

Using Standard IR Terminology Doesn’t Work

Accordingly, when we tried to recruit content using typical IR promotional language, faculty members and researchers did not respond enthusiastically. This is because they did not perceive the relevance of almost any of the IR features as stated in the terms used by librarians, archivists, computer programmers, and others who were setting up and running the IR for the institution. One reason faculty have not rushed to put their work into IRs, therefore, is that they do not recognize its benefits to them in their own terms.

Another reason that faculty have expressed little interest in IRs is related to the way the IR is named and organized. The term ‘institutional repository’ implies that the system is designed to support and achieve the needs and goals of the institution, not necessarily those of the individual. Moreover, it suggests that contributions of materials into the repository serve to highlight the achievements of the institution, rather than those of individual researchers and authors. . . .

Faculty Are Most Interested in Communicating with Colleagues Worldwide

When it comes to research, a faculty member’s strongest ties are usually with a small circle of colleagues from around the world who share an interest in the same field of research, such as plasma astrophysics or contemporary European critical thought. It is with these colleagues, many of them at other institutions, that researchers most want to communicate and share their work. But most organizations have mapped their IR communities to their academic departments rather than to the subtle, shifting communities of scholars engaged in interrelated research projects. . . . In the absence of a strong connection that would naturally bring these documents together into a collection that other scholars would look for, find, and use, there is no compelling reason for the authors to make the submission.

One-on-One Librarian-Faculty Sessions Are Best Way to Interest Faculty

Rather than approach faculty with a set, one-size-fits-all promotional spiel, these library liaisons operate under the guidance that a personalized, tailored approach works best. As we learned from the work-practice study, what faculty members care most about is their research. . . . Throughout the conversation, the library liaison is listening for opportunities to demonstrate how the benefits of the IR respond directly to the faculty member’s web-related research needs. . . .

IR Benefits Must Be Stated in Terms That Faculty Relate To

By contrast to the language previously used to describe the features and benefits of the IR, we are now describing the IR in language drawn from faculty interviews. Thus, we tell faculty that the IR will enable them to. . .

  • Make their own work easily accessible to others on the web through Google searches and searches within the IR itself
  • Preserve digital items far into the future, safe from loss or damage
  • Give out links to their work so that they do not have to spend time finding files and sending them out as email attachments
  • Maintain ownership of their own work and control who sees it
  • Not have to maintain a server
  • Not have to do anything complicated

More on OhioLINK’s Digital Resource Commons

David F. Kohl has self-archived a PowerPoint presentation about the DRC at E-LIS. It’s called "Cooperating Beyond the ‘Buying Club’: Digital Resource Commons (DRC): Making the Impossible Possible in Ohio."

To quote from the abstract:

Each institution can ‘brand’ itself in the system and may host a discrete and customized interface to all of its content. To the end user it will appear as an institutional resource as if it were hosted on your own servers. There will also be a collective OhioLINK level branding and ability for searches to retrieve across the institutional collections. . . . You will have complete control of your own content and how it is accessed. Multi-tiered security levels will allow your content to be shared only to the extent desired. . . .

Alternatively content can be restricted to an individual department, to an institution, or to the OhioLINK membership. Each institution can set its own policies governing the content in its repositories. Likewise custom workflows can be established to make the most of the personnel involved in each project and expedite the content creation and capture process. The service will include robust and flexible cataloging tools to aid in the creation of records that can be searched and browsed effectively by all types of users. Catalog records can be exported in international standard XML formats such as the Open Archives Initiative Protocol for Metadata Harvesting. Through OhioLINK’s unique collaboration with the Ohio Supercomputer Center your content is stored on enterprise class servers and storage networks. . . . A huge storage area network allows virtually unlimited storage space on our disks. . . . Programming or system administration skills and experience are not required. The system is flexible and adaptable and provides services superior to ‘DSpace’ and ‘ContentDM’ without the associated costs.

OhioLINK’s Digital Resource Commons

Peter Murray, Assistant Director of Multimedia Systems at OhioLINK, recently posted a job announcement on LITA-L (I’d link, but given the way ALA safeguards access to its lists, it’s simply impossible) that brought to my attention a bold OhioLINK project called the Digital Resource Commons, which is part of an even bolder project called the Ohio Digital Commons for Education. The quote from the job ad below describes the Digital Resource Commons. An earlier part of the ad indicates that Fedora will be used as the DRC’s platform.

OhioLINK’s Digital Resource Commons (DRC) is an Ohio Board of Regents-funded project to create a federated repository service that ingests, preserves, presents, and mediates administration of the educational and research materials of participating institutions. With the capability to store and deliver a virtually unlimited variety of digital file types and formats (including text, data sets, image, audio, video, streaming video, multimedia presentations, animations, etc.) the DRC is positioned to capture digital content from student and faculty researchers as it is produced and return it to users of the DRC upon request. The DRC offers wide and flexible control to member institutions and the communities within institution to define how content is added, preserved, and displayed to repository users. With federated community administration features, lead contacts at member institutions can create communities and delegate up to a complete subset of their privileges within the system to the editors/moderators of those new communities. The ability to scope and brand content to a particular community and institution is offered while retaining the ability to search for content across the entire repository. As both an Open Archives Initiative Data Provider and Service Provider, the DRC is positioned to become the premier point for the discovery of knowledge by and about Ohio’s scholars. In conjunction with the other parts of the Ohio Board of Regents grant funding, the DRC is one piece of a larger effort to build the Ohio Digital Commons for Education—a powerful vision for the future of learning and research in the state of Ohio.

The quote below from the DRC Web site describes the Ohio Digital Commons for Education.

The Digital Resource Commons is one of three projects funded by an Ohio Board of Regents Technology Initiatives grant collectively called the Ohio Digital Commons for Education (ODCE). The three components—this resource repository, the state-wide licensing and development of course management systems (WebCT and Blackboard), and a common access control mechanism (Shibboleth)—combine to offer a powerful vision for learning and research for the state of Ohio.

Impressive. As Daniel Hudson Burnham said: "Make no little plans; they have no magic to stir men’s blood and probably themselves will not be realized."

The View from the IR Trenches, Part 3

Today, we’ll look at an article that provides a UK academic library’s view of its institutional repository responsibilities:

Nixon, William J. "The Evolution of an Institutional E-Prints Archive at the University Of Glasgow." Ariadne, no. 32 (2002).
http://www.ariadne.ac.uk/issue32/eprint-archives/

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Library IR Roles

(The below quotes are from a summary list of library roles in the article.)

IR Advocate

Encouraging members of the University to deposit material into the ePrints archives. At Glasgow we have started an Advocacy campaign to demonstrating that this service has a broader context beyond Glasgow . . . A recent event to raise awareness about the issues of Scholarly Communication provided us with an opportunity to launch our e-prints service and to raise its profile

Copyright Advisory Service

Providing advice to members of the University about copyright and journal embargo policies for material which they would like to deposit in our archive, and as appropriate liaising directly with the Journal in question. This will become a pivotal role in the acceptance of our e-prints service since copyright is the number one question which members of the University ask about

Digitization Service

Converting material to a suitable format such as HTML or PDF for import into the archive. It may also be necessary to ensure that HTML which is submitted is properly formatted and cross-browser compatible

Deposit Service

Depositing material directly on behalf of members of the University who do not, or cannot self-archive their material. In instances in which we have deposited papers on behalf of individuals, we have created a new account for them and used that to submit their content. . . .

Metadata Review and Creation Service

Reviewing the metadata of content which has been self-archived to maintain the quality of the record and to add any additional subject headings and keywords as appropriate.

The View from the IR Trenches, Part 2

Today, we’ll look at an article about the challenges involved in populating an institutional repository:

Mackie, Morag. "Filling Institutional Repositories: Practical Strategies from the DAEDALUS Project." Ariadne, no. 39 (2004).
http://www.ariadne.ac.uk/issue39/mackie/

The DAEDALUS Project is at the University of Glasgow. This article is an especially interesting case study, and it details a number of useful, imaginative strategies for populating an IR.

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Faculty Do Not Want to Deposit Works Themselves

Despite a generally encouraging response, this did not translate into real content being deposited in the repository. . . . We found that it was difficult to get staff to give or send us electronic copies of their papers, even when they had promised to do so. This was our first indication that while staff may be sympathetic many of them do not have the time or the inclination to contribute. They were happy to give us permission to do the work on their behalf, but could not commit to doing the work themselves. Clearly the advantages of institutional repositories were not yet sufficiently convincing to academics to persuade them to play an active part in the process.

Determining Which Articles Can be Legally Deposited Is Difficult and Time Consuming

[T]he majority of academics we contacted were happy for us to establish which of their publications could be added to the repository.

While an extremely useful resource and one that is growing all the time, the [SHERPA] list does not cover all publishers. . . . it has been necessary to track down policies from publishers’ Web sites, or to contact publishers directly where these do not exist or where they do not address the issue of whether an author is permitted to make his or her paper available in a repository. No two publisher policies are exactly the same, and many do not explicitly state what rights authors have in relation to repositories. . . . Interpreting publisher copyright policies is also a difficult area, particularly as there is no real precedent and no case law.

Where copyright policies did not exist or where they were unclear, we contacted the publishers directly and asked for permission. . . . Although some publishers reply quickly, others may take some weeks and some do not reply at all. We found that publishers were more likely to give permission for specific papers to be added than to outline their general policy on the issue. Consequently permissions for most articles have to be established on a case-by-case basis.

It Is Challenging to Identify Possible Depositors Using Open Access Journals

It would be useful to be able to identify additional content in other open access journals, but so far we have not found an easy way of doing this. The Directory of Open Access Journals. . . is very useful, but it does not enable searching by institution or author affiliation.

For IRs to Be Filled, Deposit May Need to be Mandated

Although we have succeeded in adding a reasonable amount of content to the repository we have also been offered significant amounts of content that cannot be added because of restrictive publisher copyright agreements. . . . This is a clear demonstration that major changes need to take place at a high level in order for repositories to be successful. Although some academics have taken the decision to try and avoid publishing in the journals of publishers with restrictive policies, this is still relatively rare. We can inform staff about the issues, but we cannot and should not dictate in which journals they publish. Change is only likely to happen if staff are required, either by the funding councils or by their institution, to make their publications available either by publishing in open access journals or in journals that permit deposit in a repository.

The View from the IR Trenches, Part 1

It may be helpful in understanding IRs to examine some of the articles mentioned in yesterday’s "Early Adopters of IRs: A Brief Bibliography" posting in more detail.

Today, we’ll look at:

Andrew, Theo. "Trends in Self-Posting of Research Material Online by Academic Staff." Ariadne, no. 37 (2003).
http://www.ariadne.ac.uk/issue37/andrew/

This paper presents findings from "a baseline survey of research material already held on departmental and personal Web pages in the ed.ac.uk domain" (this is the University of Edinburgh’s domain).

Selected quotes from the article are below; the headings are mine. Caveat emptor: selected quotes are just that. It’s always a good idea to read the full paper. I would hope that these brief quotes entice you to do so.

Self-Archiving Disciplinary Differences Matter

As expected, there is a clear difference between academic areas. The average percentage of self-archiving scholars in each College supports this view. Within the College of Science and Engineering (S&E) this figure is 14.81%, which drops to 3.18% within Humanities and Social Science (HSS) and 0.32% within Medicine and Veterinary Medicine (MVM).

However, the situation is more complex than a simple trend of self-archiving being better established in S&E. Looking at the averages between Schools shows that even within Colleges there is a wide distribution of values. In S&E this ranges from 32.67% in Informatics to 6.99% in Engineering and Electronics. . . and in HSS from 12.70% in Philosophy, Psychology and Language Sciences to 0% in Divinity and Law . . . .

Even within individual Schools there is a noticeable change in self-archiving attitudes. For example, self-archiving percentages within the School of GeoScience range from 29.41% in Meteorology down to 0% in Geography. . . .

Disciplinary Archives May Not Be Generally Trusted

Considering the wide-ranging self-archiving trends between academic Colleges and even within Schools, it seems there is a direct correlation between willingness to self-archive and the existence of subject-based repositories. . . . because the ArXiv has become so successful . . . academics trust it as their ‘natural’ repository for self-archived material. The same degree of trust may not yet obtain in the case of the subject repositories mentioned above, which leads to additional self-archiving in home institution repositories. . . . where there is a pre-existing culture of self-archiving eprints in subject repositories, scholars are more likely to post research material on their own Web pages, until such time as those subject repositories become trusted for their comprehensiveness and persistence.

Low Number of Preprints Found on Personal Web Pages

A surprising finding from the baseline survey is the relatively low volume of preprints found on personal Web pages. This could be related to the success of eprint repositories. . . . Preprints do not have anywhere near the same impact factor as those papers from accredited journal titles, so it is possible that researchers would favour only putting their most impressive work in their online CV.

Scholars Are Confused by Copyright Agreements

One aspect of the survey that is not shown in the results is the lack of consistency in dealing with copyright and IPR issues that scholars face when placing material online. Some academic units have responded by not self-archiving any material at all. . . . A small percentage of individual scholars have responded by using general disclaimers that may or may not be effective. Others, generally well-established professors, have posted material online that is arguably in breach of copyright agreements. . . . Most, however, take a middle line of only posting papers from sympathetic publishers who allow some form of self-archiving. It is apparent that if institutional repositories are going to work, then this general confusion over copyright and IPR issues needs to be addressed right at the source.

Early Adopters of IRs: A Brief Bibliography

In "Two Views of IRs," I discussed institutional repositories in the abstract. A useful exercise, but we don’t need to just conjecture about how IRs will be structured and supported. Nor do we need to simply speculate about the issues that they will face. IRs exist, and we can "ask" their managers these questions by examining the articles that have been written about them. (Yesterday’s "ARL Institutional Repositories" posting provides another way to investigate operational IRs: try them out.)

Below is a brief bibliography of interesting articles about IRs that are notable for providing insider views. You’ll note that many of them are about UK IRs. The UK has been in the forefront of the IR movement.

Andrew, Theo. "Trends in Self-Posting of Research Material Online by Academic Staff." Ariadne, no. 37 (2003).
http://www.ariadne.ac.uk/issue37/andrew/

Ashworth, Susan. "The DAEDALUS Project." Serials 16, no. 3 (2003): 249-253.
https://dspace.gla.ac.uk/handle/1905/149

Ashworth, Susan, Morag Mackie, and William J. Nixon. "The DAEDALUS Project, Developing Institutional Repositories at Glasgow University: The Story So Far." Library Review 53, no. 5 (2004): 259-264.
http://eprints.gla.ac.uk/archive/00000408/

Barton, Mary R., and Julie Harford Walker. "Building a Business Plan for DSpace, MIT Libraries’ Digital Institutional Repository." Journal of Digital Information 4, no. 2 (2003).
http://jodi.ecs.soton.ac.uk/Articles/v04/i02/Barton/

Baudoin, Patsy, and Margret Branschofsky. "Implementing an Institutional Repository: The DSpace Experience at MIT." Science & Technology Libraries 24, no. 1/2 (2003): 31-45.

Foster, Nancy Fried, and Susan Gibbons. "Understanding Faculty to Improve Content Recruitment for Institutional Repositories." D-Lib Magazine 11, no. 1 (2005).
http://www.dlib.org/dlib/january05/foster/01foster.html

Hey, Jessie. "Targeting Academic Research with Southampton’s Institutional Repository." Ariadne, no. 40 (2004).
http://www.ariadne.ac.uk/issue40/hey/

Mackie, Morag. "Filling Institutional Repositories: Practical Strategies from the DAEDALUS Project." Ariadne, no. 39 (2004).
http://www.ariadne.ac.uk/issue39/mackie/

Nixon, William J. "DAEDALUS: Freeing Scholarly Communication at the University of Glasgow." Ariadne, no. 34 (2003).
http://www.ariadne.ac.uk/issue34/nixon/

________. "The Evolution of an Institutional E-Prints Archive at the University Of Glasgow." Ariadne, no. 32 (2002).
http://www.ariadne.ac.uk/issue32/eprint-archives/

Soehner, Catherine. "The eScholarship Repository: A University of California Response to the Scholarly Communication Crisis." Science & Technology Libraries 22, no. 3/4 (2002): 29-37.

ARL Institutional Repositories

The Association of Research Libraries (ARL) currently has 123 member libraries in the US and Canada. Below is a list of operational institutional repositories at ARL libraries. This list was compiled by a quick examination of ARL libraries’ home pages, supplemented with a bit of Google searching. I certainly wouldn’t claim that it’s comprehensive, and I would welcome additions. (Quick note to ARL library Web site managers: put a highly visible link to your IR on your home page.)

While not perfect (what is?), this list does give us a rough snapshot of the level of IR activity in ARL libraries, and it provides some insight into how these large research libraries have chosen to structure and support their IRs (can you say bepress and DSpace?).

Two Views of IRs

Yesterday, Stevan Harnad offered extensive comments on my "Not Green Enough" posting. Here are my thoughts on those comments.

The crux of the matter is two very different views of institutional repositories (IRs), and, therefore, different perceptions about how quickly IRs will solve the self-archiving problem. My apologies in advance to Stevan if my capsule summary of his position is incorrect.

In Stevan’s view, the sole purpose of an IR is to provide free global access to e-prints. Once institutions adopt the Berlin 3 recommendations (which require faculty to self-archive in IRs and encourage them to publish in OA journals), establishing and running an IR is a cheap, simple technical problem. Therefore, it doesn’t matter whether publisher copyright agreements allow scholars to archive in disciplinary archives or in the Internet Archive’s universal repository. (I’m unclear about Stevan’s position on independent scholars who will never be able to self-archive in an IR because they are not affiliated with any institution, or on researchers who are affiliated with non-academic institutions that will never have IRs. Perhaps, in the last case, he believes that IRs will be universal for every non-academic institution.) IR managers who hold other views are obstructing progress because they are wasting time on nonessential issues, not correctly perceiving the urgency and simplicity of his self-archiving solution, and unnecessarily delaying the progress of OA.

My view of the basic function of an IR is best summed up by two quotes, the first by Clifford Lynch (Executive Director of the Coalition for Networked Information) and the second by me:

"In my view, a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution." [1]

"An institutional repository includes a variety of materials produced by scholars from many units, such as e-prints, technical reports, theses and dissertations, data sets, and teaching materials. Some institutional repositories are also being used as electronic presses, publishing e-books and e-journals." [2]

Given this vision of IRs, I see them as more technically complex than Stevan does. However, I see the primary challenges as being in the areas of achieving buy-in from university administrators and faculty, establishing a wide range of policies and procedures (e.g., acceptable types and formats of material, deposit control and facilitation strategies, copyright compliance procedures, and metadata utilization), recruiting content (including depositing items for faculty if required to help populate the IR), providing user support and training, and providing data migration services as file formats become obsolete. Of course, if IRs assume a formal publishing role, this adds new dimensions of complexity, but I’ll defer that point for now since it is only being done in a few IRs, such as the following two examples:

eScholarship Repository
http://repositories.cdlib.org/escholarship/

Internet-First University Press at Cornell University
http://dspace.library.cornell.edu/handle/1813/62

(To clarify one point of confusion, libraries are not generally expecting IRs to solve the e-journal preservation problem. They are turning to solutions such as LOCKSS to do that.)

I do not believe that getting faculty to voluntarily deposit e-prints will be easy. I’m not convinced that most university administrators are going to be quickly and effortlessly persuaded to endorse Berlin 3 unless it is, in effect, externally mandated (e.g., by the Research Councils UK proposal).

I think that at least a significant subset of universities will want some type of basic vetting of the copyright compliance status of submitted e-prints, and, given the current wide range of variations in publisher copyright agreements and a relatively low level of faculty awareness and interest in copyright matters, that this will be a thorny issue (and one that directly relates to my standard copyright agreement idea).

This is why Johanneke Sytsema of Oxford University said in her comment about "How Green Is My Publisher"
(http://www.escholarlypub.com/digitalkoans/2005/04/26/how-green-is-my-publisher/#comments):

"I do agree with Charles Bailey that ‘green’ doesn’t automatically mean ‘go’. Being a repository manager myself, I never just ‘go’ when I encounter ‘green’ on the (invaluable) SHERPA Romeo list. First, I need to check whether the publisher allows archiving into an institutional repository, rather than just on a personal or departmental website. Secondly, I need to check the permitted format: some publisher[s] object to using the publisher PDF, other publishers require the use of the publisher PDF. Thirdly, I need to check on publisher policies every time I deposit, since publishers may change their policy from day to day. So, could the light get greener than it is now? I believe, it should."

Given my view of IRs, I agree with University of Rochester IR manager Susan Gibbons when she says that "the costs and efforts involved in maintaining an IR are substantial."
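
The three checks in Johanneke Sytsema’s comment above give a sense of where some of that effort goes. As a rough illustration only, they can be sketched as a small decision procedure. Everything in this sketch is an assumption made for the example: the PublisherPolicy record, its field names and values, and the can_deposit function are hypothetical, and none of them come from SHERPA RoMEO or from any repository software.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PublisherPolicy:
    """Hypothetical record of one publisher's self-archiving policy."""
    allowed_locations: set  # e.g. {"personal", "departmental", "institutional"}
    publisher_pdf: str      # assumed values: "required", "forbidden", "optional"
    checked_on: date        # date the policy was last verified


def can_deposit(policy, file_is_publisher_pdf, today):
    """Return (ok, reason) for depositing one paper in an IR."""
    # Third check in the quote: policies may change, so a record that
    # was not verified today cannot be relied on.
    if policy.checked_on != today:
        return False, "policy must be rechecked"
    # First check: deposit in an institutional repository must be allowed,
    # not just posting on a personal or departmental website.
    if "institutional" not in policy.allowed_locations:
        return False, "institutional repository not permitted"
    # Second check: the file format must match the publisher's PDF rule.
    if policy.publisher_pdf == "required" and not file_is_publisher_pdf:
        return False, "publisher PDF required"
    if policy.publisher_pdf == "forbidden" and file_is_publisher_pdf:
        return False, "publisher PDF not permitted"
    return True, "ok"
```

Even in this toy form, a "green" publisher can fail any one of three tests, which is the point of the comment: the manager has to look past the color before depositing.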

Which of these two views of institutional repositories will prevail? Time will tell.

If my view prevails, IRs will take longer to become widespread than if Stevan’s view prevails. Academic authors who have papers accepted by publishers with restrictive author copyright agreements (i.e., those that bar deposit in disciplinary archives or in the universal repository) will have to wait to deposit papers in an OAI-PMH compliant archive. Lacking a way to self-archive with relative ease, they may simply choose not to do so. Non-academic authors may never be able to deposit their papers in an OAI-PMH compliant archive.

If Stevan’s view prevails, IRs will pop up like mushrooms and the above won’t matter, as long as authors enthusiastically deposit their old papers once their IRs are in place.

If the only barrier is a small investment of time and money (as Stevan describes below), it’s unclear to me why we don’t have universal IRs today:

"The 94% of authors at archiveless universities are one $2000 linux server plus a few days’ one-time sysad set-up time and a few annual sysaddays’ maintenance time away from having an institutional repository."

But, I say, Godspeed, Stevan. Prove me wrong, for that will mean that OA happens sooner, and scholars without access to IRs will be deprived of the benefits of depositing in an OAI-compliant repository (or depositing at all) for a shorter period of time.

And, I cheerfully give Stevan the last word on the matter (for now anyway).

1. Clifford A. Lynch, "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age," ARL: A Bimonthly Report on Research Library Issues and Actions from ARL, CNI, and SPARC, no. 226 (2003),
http://www.arl.org/newsltr/226/ir.html

2. Charles W. Bailey, Jr., Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals (Washington, DC: Association of Research Libraries, 2005), xviii,
http://info.lib.uh.edu/cwb/oab.pdf