“Repository Software Survey, March 2009”

The Repositories Support Project has released the "Repository Software Survey, March 2009," which analyzes the CONTENTdm, Digital Commons, DigiTool, DSpace, EPrints, EQUELLA, Fedora, intraLibrary, Research-Output Repository Platform, Open Repository, and VITAL digital repository systems.

Repositories Support Project Podcasts Launched

The Repositories Support Project has launched a podcast series.

Here are titles of the initial podcasts:

  • Digital Preservation: Are Repositories Doing Enough for Preservation?
  • DRIVER: Promoting Digital Repositories across Europe
  • EPrints: Repository Software of the Future or of the Past?
  • Fedora: Optimum Repository Software or Overkill?

E-Print Preservation: SHERPA DP: Final Report of the SHERPA DP Project

JISC has released SHERPA DP: Final Report of the SHERPA DP Project.

Here's an excerpt from the "Executive Summary":

The SHERPA DP project (2005–2007) investigated the preservation of digital resources stored by institutional repositories participating in the SHERPA project. An emphasis was placed on the preservation of e-prints—research papers stored in an electronic format, with some support for other types of content, such as electronic theses and dissertations.

The project began with an investigation of the method that institutional repositories, as Content Providers, may interact with Service Providers. The resulting model, framed around the OAIS, established a Co-operating archive relationship, in which data and metadata is transferred into a preservation repository subsequent to it being made available. . . .

The Arts & Humanities Data Service produced a demonstrator of a Preservation Service, to investigate the operation of the preservation service and accepted responsibility for the preservation of the digital objects for a three-year period (two years of project funding, plus one year).

The most notable development of the Preservation Service demonstrator was the creation of a reusable service framework that allows the integration of a disparate collection of software tools and standards. The project adopted Fedora as the basis for the preservation repository and built a technical infrastructure necessary to harvest metadata, transfer data, and perform relevant preservation activities. Appropriate software tools and standards were selected, including JHOVE and DROID as software tools to validate data objects; METS as a packaging standard; and PREMIS as a basis on which to create preservation metadata. . . .

A number of requirements were identified that were essential for establishing a disaggregated service for preservation, most notably some method of interoperating with partner institutions and the establishment of appropriate preservation policies. . . . In its role as a Preservation Service, the AHDS developed a repository-independent framework to support the EPrints and DSpace-based repositories, using OAI-PMH as common method of connecting to partner institutions and extracting digital objects.
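
For readers unfamiliar with OAI-PMH, the kind of harvesting request mentioned in the excerpt can be sketched as follows. This is only an illustration of the protocol's standard ListRecords verb, not the SHERPA DP project's actual code; the endpoint URL and the function name are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: each real repository publishes its own OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    "verb", "metadataPrefix", "from", and "set" are standard OAI-PMH
    request arguments; "oai_dc" is the Dublin Core format that every
    OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set, if the repository defines sets
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, from_date="2007-01-01")
print(url)
```

A harvester would fetch this URL and parse the XML response. Note that OAI-PMH itself carries metadata, so retrieving the digital objects themselves takes an additional step that is not shown here.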

Institutional Repositories, Tout de Suite

Institutional Repositories, Tout de Suite, the latest Digital Scholarship publication, is designed to give the reader a very quick introduction to key aspects of institutional repositories and to foster further exploration of this topic through liberal use of relevant references to online documents and links to pertinent websites. It is under a Creative Commons Attribution-Noncommercial 3.0 United States License, and it can be freely used for any noncommercial purpose in accordance with the license.

Open-Source IRStats Released: Use Statistics for EPrints and DSpace

Eprints.org has released IRStats, an open source use statistics analysis package that analyzes both EPrints (versions 2 and 3) and DSpace (beta functionality) logs. The software is under a BSD license, and it requires Perl, awstats, MySQL, Maxmind Organisation Database, ChartDirector, and a CGI-capable Web server.

A description of IRStats features is available as well as examples of its use. For additional information on the project, see "Introduction to IRS."

Open Access to Books: The Case of the Open Access Bibliography Updated

Last July, I reported on use of the Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals, which is both a printed book and a freely available e-book. Both versions are under a Creative Commons Attribution-NonCommercial 2.0 License. You can get a detailed history at the prior posting; the major changes since then have been the conversion of the HTML version to XHTML and the addition of a Google Custom Search Engine.

So, what does cumulative use of the e-book OAB version look like slightly over one year down the road from the last posting? Here's a summary:

  • UH PDF: 29,255 (March through May 2005)
  • All Web files on both Digital Scholarship hosts: 192,849 (33,814 uses of the PDF file; June 2005 through July 2007)
  • dLIST PDF: 655 (March 2005 to present)
  • E-LIS PDF: 556 (November 2005 to present)
  • ARL PDF: Not Available

Combined, OAB Web files have been accessed 223,315 times since March 2005.
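
As a quick check, the combined figure can be reproduced from the counts listed above. This is a minimal sketch; the dictionary labels are shorthand for the list items, and the ARL PDF is left out because no figure is available for it.

```python
# Use counts copied from the list above; the ARL PDF count was not available.
counts = {
    "UH PDF": 29_255,
    "Digital Scholarship hosts (all Web files)": 192_849,
    "dLIST PDF": 655,
    "E-LIS PDF": 556,
}

total = sum(counts.values())
print(f"Combined uses: {total:,}")
```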

ACRLog Urgent Call for Action about NIH Policy Vote

An urgent call for action has been issued on ACRLog about upcoming House and Senate votes on Labor, Health and Human Services appropriations bills that will determine whether NIH-funded researchers are required to make their final manuscripts publicly accessible within twelve months of publication.

Here's an excerpt from the posting:

We need your help to keep the momentum going. The full House of Representatives and the full Senate will vote on their respective measures this summer. The House is expected to convene on Tuesday, July 17. We’re asking that you contact your US Representative and your US Senators by phone or fax as soon as possible and no later than Monday afternoon. Urge them to maintain the Appropriations Committee language. (Find talking points and contact info for your legislators in the ALA Legislative Action Center.) It is entirely possible that an amendment will be made on the floor of the House to delete the language in the NIH policy.

Want to know more? Listen to an interview with Heather Joseph of SPARC on the ALA Washington Office District Dispatch blog. Find background on the issue along with tips on communicating effectively with your legislators in the last two issues of ACRL’s Legislative Update and at the Alliance for Taxpayer Access website.

Peter Suber has issued a similar call on Open Access News. Here it is in full:

Tell Congress to support an OA mandate at the NIH

Let me take the unusual step of repeating a call to action from yesterday in case it got buried in the avalanche of news. 

The House Appropriations Committee approved language establishing an OA mandate at the NIH. The full House is scheduled to vote on the appropriations bill containing that language on Tuesday, July 17.

Publishers are lobbying hard to delete this language.  If you are a US citizen and support public access for publicly-funded research, please ask your representative to support this bill, and to oppose any attempt to amend or strike the language.  Contact your representative now, before you forget.

Time is short.  Offices are closed on the weekend, but emails and faxes will go through.  Send an email or fax right now or telephone before Monday afternoon.

Because the Senate Appropriations Committee approved the same language in June, you should contact your Senators with the same message.  But the vote by the full House is in three days, while the vote by the full Senate has not yet been scheduled.

For help in composing your message, see

Then spread the word!

SWORD (Simple Web-service Offering Repository Deposit) Project

Led by UKOLN, the JISC SWORD (Simple Web-service Offering Repository Deposit) Project is developing "a prototype ‘smart deposit’ tool" to "facilitate easier and more effective population of repositories."

Here’s an excerpt from the project plan:

The effective and efficient population of repositories is a key concern for the repositories community. Deposit is a crucial step in the repository workflow; without it a repository has no content and can fulfill no further function. Currently most repositories exist in a fairly linear context, accepting deposits from a single interface and putting them into a single repository. Further deployment of repositories, encouraged by JISC and other funders, means that this situation is changing and we are beginning to see an increasingly complex and dynamic ecology of interactions between repositories and other services and systems. By and large developers are not creating repository systems and software from scratch, rather they are considering how repositories interface with other applications within institutions and the wider information landscape. A single repository, or multiple repositories, might interact with other components, such as VLEs, authoring tools, packaging tools, name authority services, classification services and research systems. In terms of content, resources may be deposited in a repository by both human and software agents, e.g. packaging tools that push content into repositories or a drag-and-drop desktop tool. The type of resource being deposited will also influence the choice of deposit mechanism. If the resources are complex packaged objects then a web service will need to support the ingest of multiple packaging standards.

There is currently no standard mechanism for accepting content into repositories, yet there already exists a stable and widely implemented service for harvesting metadata from repositories (OAI-PMH—Open Archives Initiative Protocol for Metadata Harvesting). This project will implement a similarly open protocol or specification for deposit. By taking a similar approach, the project and the resulting protocol and implementations will gain easier acceptance by a community already familiar with the OAI-PMH.

This project aims to develop a Simple Web-service Offering Repository Deposit (SWORD)—a lightweight deposit protocol that will be implemented as a simple web service within EPrints, DSpace, Fedora and IntraLibrary and tested against a prototype ‘smart deposit’ tool. The project plans to take forward the lightweight protocol originally formulated by a small group working within the Digital Repositories Programme (the ‘Deposit API’ work). The project is aligned with the Object Reuse and Exchange (ORE) Mellon-funded two-year project by the Open Archives Initiative, which commenced in October 2006. Members of the SWORD project team are represented on its Technical and Liaison Committees. . . . The SWORD project is not attempting to duplicate work being done by ORE, but seeks to build on existing work to support UK-specific requirements whilst feeding into the ongoing ORE project.

Open Access Repository Software Use By Country

Based on data from the OpenDOAR Charts service, here is a snapshot of the open access repository software that is in use in the top five countries that offer such repositories.

The countries are abbreviated in the table header row as follows: US = United States, DE = Germany, UK = United Kingdom, AU = Australia, and NL = Netherlands. The number in parentheses is the reported number of repositories in that country.

Read the country percentages downward in each column (they do not total to 100% across the rows).

Excluding "unknown" or "other" systems, the highest in-country percentage is shown in boldface.

Software/Country    US (248)   DE (109)   UK (93)   AU (50)   NL (44)
Bepress             17%        0%         2%        6%        0%
Cocoon              0%         0%         1%        0%        0%
CONTENTdm           3%         0%         2%        0%        0%
CWIS                1%         0%         0%        0%        0%
DARE                0%         0%         0%        0%        2%
Digitool            0%         0%         1%        0%        0%
DSpace              18%        4%         22%       14%       14%
eDoc                0%         2%         0%        0%        0%
ETD-db              4%         0%         0%        0%        0%
Fedora              0%         0%         0%        2%        0%
Fez                 0%         0%         0%        2%        0%
GNU EPrints         19%        8%         46%       22%       0%
HTML                2%         4%         4%        4%        0%
iTor                0%         0%         0%        0%        5%
Milees              0%         2%         0%        0%        0%
MyCoRe              0%         2%         0%        0%        0%
OAICat              0%         0%         0%        2%        0%
Open Repository     0%         0%         3%        0%        2%
OPUS                0%         43%        2%        0%        0%
Other               6%         7%         2%        2%        0%
PORT                0%         0%         0%        0%        2%
Unknown             31%        28%        18%       46%       23%
Wildfire            0%         0%         0%        0%        52%
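
Since highlighting may not survive every rendering of the table, the column-wise reading described above can be sketched in a few lines of Python. The figures are transcribed from the table; the function and variable names are illustrative only.

```python
# Percentages transcribed from the table above, one row per software package.
# Column order: US, DE, UK, AU, NL.
TABLE = """\
Bepress 17 0 2 6 0
Cocoon 0 0 1 0 0
CONTENTdm 3 0 2 0 0
CWIS 1 0 0 0 0
DARE 0 0 0 0 2
Digitool 0 0 1 0 0
DSpace 18 4 22 14 14
eDoc 0 2 0 0 0
ETD-db 4 0 0 0 0
Fedora 0 0 0 2 0
Fez 0 0 0 2 0
GNU EPrints 19 8 46 22 0
HTML 2 4 4 4 0
iTor 0 0 0 0 5
Milees 0 2 0 0 0
MyCoRe 0 2 0 0 0
OAICat 0 0 0 2 0
Open Repository 0 0 3 0 2
OPUS 0 43 2 0 0
Other 6 7 2 2 0
PORT 0 0 0 0 2
Unknown 31 28 18 46 23
Wildfire 0 0 0 0 52
"""

COUNTRIES = ["US", "DE", "UK", "AU", "NL"]
EXCLUDED = {"Other", "Unknown"}  # not counted when picking the leader


def parse(table):
    """Map each software name to its list of per-country percentages."""
    rows = {}
    for line in table.strip().splitlines():
        # Split from the right so names containing spaces stay intact.
        name, *values = line.rsplit(None, len(COUNTRIES))
        rows[name] = [int(v) for v in values]
    return rows


def leaders(rows):
    """Return the highest-percentage named system for each country."""
    result = {}
    for i, country in enumerate(COUNTRIES):
        candidates = {n: v[i] for n, v in rows.items() if n not in EXCLUDED}
        best = max(candidates, key=candidates.get)
        result[country] = (best, candidates[best])
    return result


rows = parse(TABLE)
top = leaders(rows)
for country, (name, pct) in top.items():
    print(f"{country}: {name} ({pct}%)")
```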

Snapshot Data from OpenDOAR Charts

OpenDOAR has introduced OpenDOAR Charts, a nifty new service that allows users to create and view charts that summarize data from its database of open access repositories.

Here’s what a selection of the default charts show today. Only double-digit percentage results are discussed.

  • Repositories by continent: Europe is the leader with 49% of repositories. North America places second with 33%.
  • Repositories by country: In light of the above, it is interesting that the US leads the pack with 29% of repositories. Germany (13%) and the UK (11%) follow.
  • Repository software: After the 28% of unknown software, EPrints takes the number two slot (21%), followed by DSpace (19%).
  • Repository types: By far, institutional repositories are the leader at 79%. Disciplinary repositories follow (13%).
  • Content types: ETDs lead (53%), followed by unpublished reports/working papers (48%), preprints/postprints (37%), conference/workshop papers (35%), books/chapters/sections (31%), multimedia/av (20%), postprints only (17%), bibliographic references (16%), special items (15%), and learning objects (13%).

This is a great service; however, I’d suggest that the University of Nottingham consider licensing it under a Creative Commons license so that snapshot charts could be freely used (at least for noncommercial purposes).

MIRACLE Project’s Institutional Repository Survey

The MIRACLE (Making Institutional Repositories A Collaborative Learning Environment) project at the University of Michigan’s School of Information presented a paper at JCDL 2006 titled "Nationwide Census of Institutional Repositories: Preliminary Findings."

MIRACLE’s sample population was 2,147 library directors at four-year US colleges and universities. The paper presents preliminary findings from 273 respondents.

Respondents characterized their IR activities as: "(1) implementation of an IR (IMP), (2) planning & pilot testing an IR software package (PPT), (3) planning only (PO), or (4) no planning to date (NP)."

Of the 273 respondents, "28 (10%) have characterized their IR involvement as IMP, 42 (15%) as PPT, 65 (24%) as PO, and 138 (51%) as NP."
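
The quoted percentages follow directly from the counts. A minimal sketch of the arithmetic, using the paper's own stage abbreviations as keys:

```python
TOTAL_RESPONDENTS = 273

# Respondent counts by IR involvement stage, as quoted above.
stages = {"IMP": 28, "PPT": 42, "PO": 65, "NP": 138}

# Share of respondents in each stage, rounded to a whole percent.
shares = {stage: round(100 * n / TOTAL_RESPONDENTS) for stage, n in stages.items()}

for stage, pct in shares.items():
    print(f"{stage}: {stages[stage]} respondents ({pct}%)")
```

The four counts sum to 273, so every respondent falls into exactly one category.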

The top-ranked benefits of having an IR were: "capturing the intellectual capital of your institution," "better service to contributors," and "longtime preservation of your institution’s digital output." The bottom-ranked benefits were "reducing user dependence on your library’s print collection," "providing maximal access to the results of publicly funded research," and "an increase in citation counts to your institution’s intellectual output."

On the question of IR staffing, the survey found:

Generally, PPT and PO decision-makers envision the library sharing operational responsibility for an IR. Decision-makers from institutions with full-fledged operational IRs choose responses that show library staff bearing the burden of responsibility for the IR.

Of those with operational IRs who identified their IR software, the survey found that they were using: "(1) 9 for Dspace, (2) 5 for bePress, (3) 4 for ProQuest’s Digital Commons, (4) 2 for local solutions, and (5) 1 each for Ex Libris’ DigiTools and Virginia Tech’s ETD." Of those who were pilot testing software: "(1) 17 for DSpace, (2) 9 for OCLC’s ContentDM, (3) 5 for Fedora, (4) 3 each for bePress, DigiTool, ePrints, and Greenstone, (5) 2 each for Innovative Interfaces, Luna, and ETD, and (6) 1 each for Digital Commons, Encompass, a local solution, and Opus."

In terms of the number of documents in the IRs, by far the largest percentages were for fewer than 501 documents (IMP, 41%; and PPT, 67%).

The preliminary results also cover other topics, such as content recruitment, investigative decision-making activities, IR costs, and IR system features.

It is interesting to see how these preliminary results compare to those of the ARL Institutional Repositories SPEC Kit. For example, when asked "What are the top three benefits you feel your IR provides?," the ARL survey respondents said:

  1. Enhance visibility and increase dissemination of institution’s scholarship: 68%
  2. Free, open, timely access to scholarship: 46%
  3. Preservation of and long-term access to institution’s scholarship: 36%
  4. Preservation and stewardship of digital content: 36%
  5. Collecting, organizing assets in a central location: 24%
  6. Educate faculty about copyright, open access, scholarly communication: 8%