Institutional Repository Bibliography, Version 1

To celebrate Open Access Week, Digital Scholarship is releasing version one of the Institutional Repository Bibliography. This bibliography presents over 620 selected English-language articles, books, and other scholarly textual sources that are useful in understanding institutional repositories. Although institutional repositories intersect with a number of open access and scholarly communication topics, this bibliography only includes works that are primarily about institutional repositories.

Most sources have been published between 2000 and the present; however, a limited number of key sources published prior to 2000 are also included. Where possible, links are provided to e-prints in disciplinary archives and institutional repositories.

Table of Contents

1 General
2 Country and Regional Institutional Repository Surveys
3 Multiple-Institution Repositories
4 Specific Institutional Repositories
5 Institutional Repository Digital Preservation Issues
6 Institutional Repository Library Issues
7 Institutional Repository Metadata Issues
8 Institutional Repository Open Access Policies
9 Institutional Repository R&D Projects
10 Institutional Repository Research Studies
11 Institutional Repository Software
Appendix A. About the Author

ETD Self-Archiving Tools: ICE-TheOREM Final Report

JISC has released the ICE-TheOREM Final Report.

Here's an excerpt:

ICE-TheOREM was a project which made several important contributions to the repository domain, promoting deposit by integrating the repository with authoring workflows and enhancing open access by prototyping new infrastructure to allow fine-grained embargo management within an institution without impacting on existing open access repository infrastructure.

In the area of scholarly communications workflows, the project produced a complete end-to-end demonstration of eScholarship for word processor users, with tools for authoring, managing and disseminating semantically-rich ETD (Electronic Theses and Dissertations) documents fully integrated with supporting data. This work is focused on theses, as it is well understood that early career researchers are the most likely to lead the charge in new innovations in scholarly publishing and dissemination models.

The authoring tools are built on the ICE content management system, which allows authors to work within a word processing system (as most authors do) with easy-to-use toolbars to structure and format their documents. The ICE system manages both small data files and links to larger data sets. The result is research publications which are available not just as paper-ready PDF files but as fully interactive semantically aware web documents which can be disseminated via repository software such as ePrints, DSpace and Fedora as complete supported web-native and PDF publications.

Stevan Harnad on "Integrating Universities' Thesis and Research Deposit Mandates"

Stevan Harnad has self-archived the text of his "Integrating Universities' Thesis and Research Deposit Mandates" presentation in the ECS EPrints Repository.

Here's an excerpt:

A growing number of universities are beginning to require the digital deposit of their thesis and dissertation output in their institutional repositories. At the same time, a growing number of universities as well as research funders are beginning to mandate that all refereed research must be deposited too. This makes for a timely synergy between the practices of the younger and older generation of researchers as the Open Access era unfolds. It also maximizes the uptake, usage and impact of university research input at all stages, as well as providing rich and powerful new metrics to monitor and reward research productivity and impact. It is important to integrate universities' ETD and research output repositories, mandates and metrics as well as to provide the mechanism for those deposits that may need to be made Closed Access rather than Open Access: Repositories need to implement the "email eprint request" Button for all Closed Access Deposits. Any would-be user webwide, having reached the metadata of a Closed Access Deposit can, with one click, request an eprint for research purposes; the author instantly receives an automatic email and can then, again with one click, authorize the automatic emailing of one copy to the user by the repository software. This feature is important for fulfilling immediate research usage needs during any journal-article embargo period, and it also gives the authors of dissertations they hope to publish as books a way to control who has access to the dissertation. Digital dissertations will also benefit from the reference-linking and book-citation metrics that will be provided by harvesters of the distributed institutional repository metadata (which will also include the metadata and reference lists of all university book output). Dissertation downloads as well as eprint-requests will also provide useful new research impact metrics.
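The two-click "email eprint request" flow described in the excerpt can be sketched roughly as follows. This is a minimal illustration, not how EPrints or any other repository software implements the Button: the class, the function names, and the `outbox` list standing in for a mail system are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClosedAccessDeposit:
    """Hypothetical record for a deposit whose full text is not public."""
    title: str
    author_email: str
    pending: list = field(default_factory=list)  # requesters awaiting approval
    sent_to: list = field(default_factory=list)  # requesters emailed one copy


def request_eprint(deposit, requester_email, outbox):
    """Click 1: a reader on the metadata page asks for a copy."""
    deposit.pending.append(requester_email)
    # The author is notified automatically (outbox stands in for real email).
    outbox.append((deposit.author_email,
                   f"Eprint request from {requester_email} for '{deposit.title}'"))


def approve_request(deposit, requester_email, outbox):
    """Click 2: the author authorizes emailing one copy to the requester."""
    if requester_email not in deposit.pending:
        raise ValueError("no pending request from this address")
    deposit.pending.remove(requester_email)
    deposit.sent_to.append(requester_email)
    outbox.append((requester_email, f"Copy of '{deposit.title}' attached"))
```

The count of entries in `sent_to` is the kind of figure that could feed the eprint-request metrics mentioned at the end of the excerpt.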

"Getting Started with Fedora"

Fedora Commons has released "Getting Started with Fedora."

Here's an excerpt from the announcement:

The "Getting Started with Fedora" Guide is designed to offer new users, or potential users, a basic understanding of the Fedora architecture and the core repository management software, along with some general ideas about how to use it. Whether you want to adopt one of the existing Fedora-based solutions or develop your own, this general introduction should be useful to you.

Embedding Repositories in Research Management Systems: Final Report

JISC has released Embedding Repositories in Research Management Systems: Final Report. (See JISC's "Research Information Management" page for more information on research management systems.)

Here's an excerpt:

The main finding from our research is the variety of disparate "systems" (i.e. both technology and process) in use. Management information is derived from many different systems and processes: HR; student administration (e.g. SITS); finance; research support; and information services/library.

A number of universities use business intelligence software such as Cognos or Oracle Discoverer for monitoring activity and performance, and for resource allocation and forecasting. However, some rely on tools as simple as a series of complex interlinked spreadsheets.

The role of institutional repositories is still small. IRs are near-universal, but mainly lack critical mass of content. Publications and research expertise databases are widespread and are the main research assessment management tool for many HEIs: few of these connect to the IR, although in a small number of cases the two are linked.

We could find no implementations of a research management system with embedded repository; further, where both RMS and IRs do exist within the same HEI they are not well-integrated.

University of Illinois' IDEALS Repository Tops One Million Downloads

The University of Illinois' IDEALS institutional repository has topped one million downloads.

Here's an excerpt from the announcement:

The Illinois Digital Environment for Access to Learning and Scholarship (IDEALS), a digital repository for research and scholarship developed at the University of Illinois at Urbana-Champaign, has surpassed its one-millionth download.

The service, offered through the University Library and Campus Information Technologies and Educational Services (CITES), is sponsored by the Office of the Provost at Illinois and was launched in 2006. The campus institutional repository includes articles, working papers, preprints, technical reports, conference papers, and data sets in various digital formats provided by University faculty, staff, and graduate students. Although central to the University of Illinois, anyone can access and benefit from IDEALS collections and services. "Today, over 12,000 items have been uploaded into IDEALS," said Sarah Shreeves, associate professor and IDEALS coordinator. "The success of this service has surpassed what anyone envisioned two and a half years ago, and we hope that others in the Illinois community will take advantage of its services."

The mission of IDEALS is to preserve and provide persistent and reliable access to digital research and scholarship in order to give these works the greatest possible recognition and distribution. IDEALS endeavors to ensure that its materials appear in search engines such as Google, Google Scholar, and Bing and that the majority of the research is openly available for anyone to access. As a result of its efforts to disseminate research produced at the University of Illinois, IDEALS was recently ranked in the top 10 of institutional repositories worldwide. "I am delighted with the exposure that IDEALS has provided us with. Whenever we place a thesis or a report, the downloads start and never stop. We get many comments back from readers and researchers who have seen our work only on IDEALS," said Amr Elnashai, head, Civil and Environmental Engineering Department at the University of Illinois at Urbana-Champaign.

IDEALS contains a wealth of diverse information, from a Mid-America Earthquake Center report on the Kashmir Earthquake of 2005 to the Ethnography of the University Initiative’s publications and presentations, including campus folklore and cultural perceptions. "I appreciate that my thesis is archived in a stable location for reliable long-term access. The document is now freely available to anyone in the world, yet I retain the copyright," said David P. Hruska, an Illinois graduate. "Furthermore, my thesis is now displayed in search results returned by Google Scholar, improving the dissemination of my research."

SWORD PHP Library Version 0.9

The SWORD PHP library version 0.9 has been released. SWORD is "a lightweight protocol for depositing content from one location to another. It stands for Simple Web-service Offering Repository Deposit and is a profile of the Atom Publishing Protocol (known as APP or ATOMPUB)."

Here's an excerpt from the announcement:

  • Changed swordappservicedocument to build the servicedocument from the xml response rather than having the swordappclient do the work. This allows the service document to be parsed at a later time.
  • Changed the swordappclient deposit method to stream the file being deposited straight from disk rather than via memory to avoid using excessive memory and potentially exceeding the PHP memory limit. I’ve successfully tested this against DSpace with deposits of 600MB CD images.
  • Added some validation to the SWAP/METS packager to allow it to cope with filenames and metadata containing ampersands
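Two of the changes listed above, streaming the deposit from disk and coping with ampersands in metadata, can be illustrated with a short Python sketch. It does not reproduce the PHP library's code: the function names and the example URL are hypothetical, and the request carries only generic HTTP headers rather than the full set a SWORD server may expect.

```python
import os
import urllib.request
from xml.sax.saxutils import escape


def build_deposit_request(collection_url, path, content_type="application/zip"):
    """Build a POST whose body is an open file handle, not an in-memory copy.

    Passing a file object as `data` lets the HTTP client read the package
    from disk in blocks, so memory use stays flat for very large deposits.
    The caller is responsible for closing `request.data` after sending.
    """
    body = open(path, "rb")
    headers = {
        "Content-Type": content_type,
        "Content-Length": str(os.path.getsize(path)),
        "Content-Disposition": "filename=" + os.path.basename(path),
    }
    return urllib.request.Request(collection_url, data=body,
                                  headers=headers, method="POST")


def escape_metadata(value):
    """Escape &, < and > so the value can sit safely inside an XML element."""
    return escape(value)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, typically with authentication added; that step is left out because it depends on the target repository.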

SWORD2 Project Final Report

JISC has released SWORD2 Project Final Report.

Here's an excerpt:

The SWORD vision is about 'lowering the barriers to deposit', primarily for depositing content into repositories, and additionally, for depositing into any system which may wish to receive content from remote sources. The SWORD protocol defines a standard mechanism for depositing into repositories and other systems. The project and protocol were developed because there was previously no standardised way of doing this. A standard deposit interface allows repository services to be built that can offer functionality such as deposit from multiple locations, e.g. disparate repositories, desktop drag'n'drop tools, or from within standard office applications. SWORD can also facilitate deposit to multiple repositories, increasingly important for depositors who wish to deposit to funder, institutional or subject repositories. There are many other possibilities, including migration of content between repositories and transfer to preservation services. In addition to refining the existing SWORD application profile, the SWORD2 project has developed a number of tools and services to demonstrate these possibilities. It has also been pro-active in promoting SWORD and encouraging uptake within other repositories, services and tools, notably with its adoption into the Microsoft Article Authoring Add-in for Word 2007 and with the new Microsoft Zentity repository system.

The core aims of the project were to update the SWORD Protocol, the SWORD repository code libraries in the DSpace, Fedora, EPrints and Intrallect repositories, and the existing reference demonstrators. A Facebook application and validator have also been developed. Advocacy efforts include an e-learning case study, a briefing paper, a new SWORD website, and a range of additional dissemination activities, including conference papers, presentations, demonstrations and workshops at a number of national and international conferences and meetings.

University of Wollongong Repository Hits One Million Download Mark

The University of Wollongong's institutional repository, Research Online, has now had a million full-text downloads.

Here's an excerpt from the press release:

The millionth paper to be accessed was a 2006 conference paper by Faculty of Informatics academics Katina Michael, A. McNamee and MG Michael entitled "The Emerging Ethics of Humancentric GPS Tracking and Monitoring."

Mr Organ said the Michaels are some of the strongest supporters of Research Online, with more than 160 items on the site.

"I am absolutely delighted," Katina Michael said. "Research Online has been instrumental in getting our research out to the wider community—fellow academics, industry, government and citizens. It is such a powerful tool."

"An academic has the ability to control the release of their papers at any point throughout the publication process… but I think the real contribution of Research Online has been in forming cross-institutional and transnational networks."

Research Online also gives academics the ability to see which of their papers are the most popular, and Katina Michael says this has been useful for her research.

"My fellow collaborators and I have been able to gauge which papers are being downloaded most and when. We can then make some basic assumptions about the significance of various research endeavours and direct our efforts accordingly."

Serials Review Special Issue on Asia-Pacific Repositories

Serials Review has published a special issue on Asia-Pacific repositories.

Here's a selection of article titles:

  • "Exploring Research Data Hosting at the HKUST Institutional Repository"
  • "An Integrative View of the Institutional Repositories in Hong Kong: Strategies and Challenges"
  • "Open Access in Hong Kong—Where Are We Now?"
  • "Promoting the Visibility of Educational Research through an Institutional Repository"
  • "Research Online: Digital Commons as a Publishing Platform at the University of Wollongong, Australia"
  • "Towards Scholarly HTML"

University of Western Ontario Launches Scholarship@Western

The Western Libraries at the University of Western Ontario have launched Scholarship@Western.

Here's an excerpt from the announcement by Adrian K. Ho, Scholarly Communications Librarian:

Scholarship@Western showcases publications and presentations from the university community by department. As this is a new initiative, not all academic departments are listed at present. A segment of Scholarship@Western, named Researcher Gallery, offers virtual space for Western's faculty, graduate students, librarians, and archivists to create their homepages and provide access to their publications, presentations, and other academic materials. In addition, Scholarship@Western can function as an online publishing platform for journals, conference proceedings, research reports, and working papers. An online journal, the Western Undergraduate Research Journal: Health and Natural Sciences, will be published to celebrate Western's academic excellence.

Scholarship@Western will feature a niche for the University's master's theses and PhD dissertations. Western Libraries already has some of the past theses and dissertations digitized and will upload them to Scholarship@Western for free public access. Meanwhile, we have been working with the School of Graduate and Postdoctoral Studies to develop a university-wide program that will publish, archive, and preserve future theses and dissertations on Scholarship@Western for widest possible access.

Read more about it at "Online Archive Opens Access to Research."

Related post: "Adrian K. Ho Named Scholarly Communication Librarian at Western Libraries of the University of Western Ontario."

Harvard Launches DASH (Digital Access to Scholarship at Harvard) Repository

Harvard has launched its DASH (Digital Access to Scholarship at Harvard) repository. (Thanks to Open Access News.)

Here's an excerpt from the press release:

Harvard's leadership in open access to scholarship took a significant step forward this week with the public launch of DASH—or Digital Access to Scholarship at Harvard—a University-wide, open-access repository. More than 350 members of the Harvard research community, including over a third of the Faculty of Arts and Sciences, have jointly deposited hundreds of scholarly works in DASH.

"DASH is meant to promote openness in general," stated Robert Darnton, Carl H. Pforzheimer University Professor and Director of the University Library. "It will make the current scholarship of Harvard's faculty freely available everywhere in the world, just as the digitization of the books in Harvard's library will make learning accumulated since 1638 accessible worldwide. Taken together, these and other projects represent a commitment by Harvard to share its intellectual wealth." . . .

DASH has its roots in the February 2008 open-access vote in the Faculty of Arts and Sciences. In a unanimous decision, FAS adopted a policy stating that

Each Faculty member grants to the President and Fellows of Harvard College permission to make available his or her scholarly articles and to exercise the copyright in those articles. In legal terms, the permission granted by each Faculty member is a nonexclusive, irrevocable, paid-up, worldwide license to exercise any and all rights under copyright relating to each of his or her scholarly articles, in any medium, and to authorize others to do the same, provided that the articles are not sold for a profit.

In addition, faculty members committed to providing copies of their manuscripts for distribution, which the DASH repository now enables. Authored by Stuart M. Shieber, James O. Welch, Jr. and Virginia B. Welch Professor of Computer Science and director of the Office for Scholarly Communication, the policy marked a groundbreaking shift from simply encouraging scholars to consider open access to creating a pro-open-access policy with an "opt out" clause.

"It's the best university policy anywhere," said Peter Suber of the Scholarly Publishing and Academic Resources Coalition in Washington, DC, and a fellow of Harvard Law School's Berkman Center and the University's Office for Scholarly Communication (OSC). "It shifts the default so Harvard faculty must make their work openly available unless they opt out. The default at most universities is the other way around: you have to choose open access and arrange for all the provisions."

To date, Harvard Law School, the John F. Kennedy School of Government, and the Harvard Graduate School of Education have joined FAS in supporting a comprehensive policy of open access. DASH fulfills the promise made in these four open-access votes.

Still a beta, DASH is a joint project of the OSC and the Office for Information Systems (OIS), both of which are strategic programs of the Harvard University Library. DASH is based on the open-source DSpace repository platform. Software customizations will continue throughout the coming academic year.

DASH is also intended to serve as a local digital home for a wide and growing array of other scholarly content produced at the University. Non-faculty researchers and students are already afforded deposit privileges, and DASH will eventually have collection spaces for each of the 10 schools at Harvard.

Among the many features the DASH development team has added to its DSpace implementation is the ability to link directly from a faculty author's name in DASH search results to his or her entry in Profiles, a research social networking site developed by Harvard Catalyst. Profiles, which provides a comprehensive view of a researcher's publications and connections within the University research community, currently indexes faculty from the medical and public health schools; its developers hope to expand it to include the Faculty of Arts and Sciences and School of Engineering and Applied Sciences in the near future. . . .

DASH currently supports automated embargo lift dates, so that a work can be deposited "dark" and then automatically switch to open access once a publisher's self-archiving embargo has expired. Another noteworthy feature is DASH's PDF header page: when a user downloads a full-text item, DASH generates a header page for the document, giving its provenance and relevant terms of use.
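The automated embargo lift amounts to a date comparison at access time. The sketch below is an illustration only, not DASH's or DSpace's code; the function name is hypothetical, and it assumes an item opens on the lift date itself and that an item with no lift date was never embargoed.

```python
from datetime import date
from typing import Optional


def is_open_access(embargo_lift: Optional[date], today: date) -> bool:
    """Return True once a 'dark' deposit should be served openly.

    Assumption: the item becomes open on the lift date itself, and an
    item with no lift date (None) was never embargoed.
    """
    return embargo_lift is None or today >= embargo_lift
```

Because the check runs whenever an item is requested, no scheduled job is needed to flip its status; a real system might still run one to update search indexes.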

"The terms of use were drafted after a series of conversations with publishers about Harvard's open-access initiatives," said Shieber. "We wanted to give publishers the opportunity to articulate their concerns about Harvard's intended use of content in the repository, and we designed our repository and our practices as responsively as possible. We continue to welcome publisher input and engagement along these lines.

"Our long-term growth strategy for DASH is to integrate it so fully into other faculty tools that self-archiving just becomes second nature. When a Harvard author is updating their profile or the CV on their personal web site, upload-to-DASH will be there, and vice versa. All these loci for sharing information about publications will eventually synchronize with one another. This includes tools that store bibliographic information only, as well as those that provide open access to full text, such as the established subject repositories already used by many of our faculty to disseminate their work. Ultimately, DASH aims to provide as comprehensive and open a view of Harvard research as possible."

Repository Staff and Skills Set Revised

SHERPA has released a revised Repository Staff and Skills Set.

Here's an excerpt from the announcement:

The original document was developed in response to requests the SHERPA core team received for examples of repository job descriptions. The content of the 2009 version is largely unchanged with the main additions being advice on how the document can be used in planning hosted repositories and in the addition of a link to the JISC Recruitment Toolkit released earlier this year.

The original Staff and Skill Set document was not designed to describe the skills set required of a particular repository post but rather is a list of the entire set of skills, knowledge and abilities required for the development and management of a successful institutional repository. Due to requests from the community we provide here a generic job description of a technical repository post. This description was developed from actual job advertisements and using advice and templates from the JISC Recruitment Toolkit.

The concise nature of the document continues to be popular and feedback from the community shows that it has been used to develop job descriptions, plan repository development and staffing, seek funding from institutions, renegotiate salaries/job profiles and regrading and in identifying skill gaps and areas for staff training.

Although primarily aimed at the UK repository community, it has also proved useful to the repository community and projects in Australia, Spain and Ireland.

EdSpace: An Educationally Focussed Repository for the University of Southampton. Final Report.

JISC has released EdSpace: An Educationally Focussed Repository for the University of Southampton. Final Report.

Here's an excerpt:

For some years, digital content has been stored in our VLE, but the VLE does not encourage sharing or re-use. EdShare is intended to act as the storage for the VLE, storing our everyday teaching materials such as presentations, hand-outs, reading lists, assignments etc., so that they can easily be viewed by others and re-used in whole or part as appropriate.

Important design principles of this share were:

  • Ease of use: The share should be open to anyone to access, whether logged-in or not; any logged in member of the university can upload resources and comment on others. The user interface should be simple to use and fully accessible.
  • Minimal metadata: We acknowledge that requiring metadata is a barrier to use, and that search engines do a large part of the job based on free text. Web 2.0 style recommendations complement the search engines.
  • Permanent URLs: Every share entered in EdShare, and the description of the resource, are allocated unique and permanent URLs, which can be used to refer to them from external programs – for example VLEs such as Blackboard.
  • Open Access to the descriptions, but user controlled access to the content. Anyone in the world can browse or search to discover what items are in EdShare (i.e. they can see the description), but the depositing user can control the visibility of the actual resource. The default is to allow visibility within the university, but it is possible to make the visibility wider (the whole world) or narrower (only my school, or even only the depositor and named collaborators). . . .

EdShare has been implemented as open source software on top of E-Prints, and the team are committed to working with others in supporting other institutions, or cross-institutional disciplinary consortia, to make both the technical and educational changes that we have benefited from.
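The access model in the excerpt (descriptions open to all, content visibility chosen by the depositor) can be sketched as a small permission check. This is an illustration under stated assumptions, not EdShare's implementation: the class and function names are hypothetical, a `User` object is taken to mean a logged-in member of the university, and the depositor is assumed always to see their own resource.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Visibility(Enum):
    WORLD = "world"                  # wider than the default
    UNIVERSITY = "university"        # the default described in the report
    SCHOOL = "school"                # narrower: the depositor's school only
    COLLABORATORS = "collaborators"  # narrowest: depositor and named people


@dataclass
class User:
    name: str
    school: str


@dataclass
class Resource:
    description: str
    depositor: str
    school: str
    visibility: Visibility = Visibility.UNIVERSITY
    collaborators: set = field(default_factory=set)


def can_view_description(resource: Resource, user: Optional[User]) -> bool:
    """Descriptions are open access: anyone, logged in or not, may see them."""
    return True


def can_view_content(resource: Resource, user: Optional[User]) -> bool:
    """Apply the depositor's chosen visibility to the resource itself."""
    if resource.visibility is Visibility.WORLD:
        return True
    if user is None:  # anonymous visitors see only WORLD content
        return False
    if user.name == resource.depositor:
        return True
    if resource.visibility is Visibility.UNIVERSITY:
        return True
    if resource.visibility is Visibility.SCHOOL:
        return user.school == resource.school
    return user.name in resource.collaborators
```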

"Towards Scholarly HTML"

Peter Sefton has posted an e-print of his forthcoming Serials Review article "Towards Scholarly HTML" (dx.doi.org/10.1016/j.serrev.2009.05.001) on ptsefton.

In an attempt to comply with Elsevier's author agreement, he states:

At the moment that link seems to resolve to an open version of the article, whether or not you have a subscription to the journal but I guess that will change; when it is "published" you will only see the article if you are clicking from inside a network that's on their list of subscribers. If not, you will need money to see it. But I can post the article here with the copyright statement you see below and remind you that you need to use the DOI to cite the paper should you wish to. No naughty linking back here (unless it is to reference these comments I'm adding). And no linking to the version I'm about to put in ePrints. OK? Even though you know that if you do link to the DOI some people may not be able to see the article in the future, don't do it, use the DOI link. There, I think I told you.

Note: the Elsevier version is no longer freely available.

OAI-PMH: MOAI 1.0.6 Released

MOAI 1.0.6 has been released.

Here's an excerpt from the MOAI Web page:

MOAI has some interesting features not found in most OAI servers. Besides serving OAI, it can also harvest OAI. This makes it possible for MOAI to work as a pipe, where the OAI data can be reconfigured, cached, and enriched while it passes through the MOAI processing.

More specifically MOAI has the ability to:

  • Harvest data from different kinds of sources
  • Serve many OAI feeds from one MOAI server, each with their own configuration
  • Turn metadata values into OAI sets on the fly, creating new collections
  • Use OAI sets to filter records shown in a feed, configurable for each feed
  • Work easily with relational data (e.g. if an author changes, the publication should also change)
  • Simple and robust authentication through integration with the Apache webserver
  • Serve assets via Apache while still using configurable authentication rules
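Two of the listed abilities, turning metadata values into OAI sets and filtering a feed by set, can be sketched in a few lines of Python. This does not use MOAI's API: the record layout, the function names, and the way set names are formed are assumptions made for illustration. Only the `verb`, `metadataPrefix`, and `set` request arguments come from the OAI-PMH protocol itself.

```python
from collections import defaultdict
from urllib.parse import urlencode


def sets_from_metadata(records, field_name):
    """Group record ids into sets keyed by '<field>:<value>' (assumed naming)."""
    sets = defaultdict(list)
    for rec in records:
        for value in rec.get(field_name, []):
            spec = f"{field_name}:{value.lower().replace(' ', '_')}"
            sets[spec].append(rec["id"])
    return dict(sets)


def filter_feed(records, sets, allowed_specs):
    """Keep only the records that belong to at least one allowed set."""
    allowed_ids = set()
    for spec in allowed_specs:
        allowed_ids.update(sets.get(spec, []))
    return [rec for rec in records if rec["id"] in allowed_ids]


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request restricted to one set."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix,
                       "set": set_spec})
    return base_url + "?" + query
```

A harvester pointed at such a feed would request, for example, the set `subject:physics` and receive only the matching records.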

Web Services and Repositories: Report from an EThOSnet Project Workshop

Electronic Theses Online Service (EThOS) has released Web Services and Repositories: Report from an EThOSnet project workshop, British Library, 2nd June 2009.

Here's an excerpt:

One of the areas highlighted for potential investigation was the use of Web Services in supporting the delivery of EThOS. Due to staff changes following the start of the project it was not possible to carry out this investigation on the technical level that had been originally hoped. Nevertheless, an initial investigation was carried out to assess options. In considering the role of Web Services in supporting EThOS, it was concluded that it was not possible for the most part to consider the needs of EThOS alone, as using Web Services is primarily about communication between systems. EThOS has been developed on a model of ongoing interaction with institutional repositories, and as such the role of Web Services in supporting these local repository instances is key to the success of EThOS making use of them. Furthermore, given the development of local repositories as systems that need to interact with other systems, either within an institution or outside it, it seemed timely to address this issue to provide guidance to the community as a whole.

A workshop to investigate the potential value and use of Web Services to digital repositories was thus organised to both disseminate and capture information on the possibilities. This report summarises much of the information and conclusions from the workshop, and accompanies the full resources from the day available at http://www.ethos.ac.uk/0031_Web_Services_Day.html.

Confederation of Open Access Repositories to Launch During Open Access Week 2009

In a press release posted to the American-Scientist-Open-Access-Forum, D. Peters announced that the Confederation of Open Access Repositories (COAR) will launch during Open Access Week 2009. Supported by DRIVER, COAR "aims to promote greater visibility and application of research outputs through global networks of Open Access digital repositories."

"TCO and ROI: Assessing and Evaluating an Institutional Repository"

Pamela Bluh has self-archived her presentation "TCO and ROI: Assessing and Evaluating an Institutional Repository," which was given at the American Association of Law Libraries 2009, in DigitalCommons@UM Law ("TCO" means Total Cost of Ownership and "ROI" means Return on Investment).

Here's an excerpt:

On the surface, a TCO analysis would seem to be a fairly straightforward process. After all, isn't it just a matter of getting prices for hardware and software and determining the cost of staffing? While TCO can be used to determine the financial implications associated with the implementation of an IR and, at a minimum, should examine the direct cost of hardware and software and of personnel, it should also take into consideration the indirect or "hidden" costs for ongoing operations such as training, system upgrades, licenses, technical support, and loss of accessibility due to system downtime. While not specifically part of TCO, a thorough analysis should also take into account intangibles such as the complexity of the implementation, the timely delivery of the product, and the availability of an effective exit strategy or a clearly delineated migration path for software and hardware upgrades.
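The split between direct and indirect costs can be made concrete with a small calculation. The sketch below is one plausible way to total the categories the excerpt names, not a method taken from the presentation: the function name is hypothetical, and treating hardware and software as one-time purchases while staffing and the "hidden" costs recur each year is an assumption.

```python
def total_cost_of_ownership(one_time, annual_direct, annual_indirect, years):
    """Sum up-front purchases plus recurring direct and indirect costs.

    Each of the first three arguments is a dict mapping a cost label to an
    amount; which costs are one-time and which recur is an assumption.
    """
    recurring = sum(annual_direct.values()) + sum(annual_indirect.values())
    return sum(one_time.values()) + years * recurring
```

With made-up figures of 15,000 in one-time hardware and software, 40,000 a year in staffing, and 7,000 a year in training, upgrades, support, and downtime, the three-year total is 15,000 + 3 × 47,000 = 156,000, of which 21,000 would be missed by a direct-cost-only estimate.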