SWORD2 Project Final Report

JISC has released the SWORD2 Project Final Report.

Here's an excerpt:

The SWORD vision is about 'lowering the barriers to deposit', primarily for depositing content into repositories, and additionally, for depositing into any system which may wish to receive content from remote sources. The SWORD protocol defines a standard mechanism for depositing into repositories and other systems. The project and protocol were developed because there was previously no standardised way of doing this. A standard deposit interface allows repository services to be built that can offer functionality such as deposit from multiple locations, e.g. disparate repositories, desktop drag'n'drop tools, or from within standard office applications. SWORD can also facilitate deposit to multiple repositories, increasingly important for depositors who wish to deposit to funder, institutional or subject repositories. There are many other possibilities, including migration of content between repositories and transfer to preservation services. In addition to refining the existing SWORD application profile, the SWORD2 project has developed a number of tools and services to demonstrate these possibilities. It has also been pro-active in promoting SWORD and encouraging uptake within other repositories, services and tools, notably with its adoption into the Microsoft Article Authoring Add-in for Word 2007 and with the new Microsoft Zentity repository system.

The core aims of the project were to update the SWORD Protocol, the SWORD repository code libraries in the DSpace, Fedora, EPrints and Intrallect repositories, and the existing reference demonstrators. A Facebook application and validator have also been developed. Advocacy efforts include an e-learning case study, a briefing paper, a new SWORD website, and a range of additional dissemination activities, including conference papers, presentations, demonstrations and workshops at a number of national and international conferences and meetings.
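To make the "standard deposit interface" concrete, here is a minimal Python sketch of what a SWORD deposit request might look like. It is an illustration only and does not come from the report. It assumes SWORD 1.3 (an AtomPub-based profile), a zipped package, and the METSDSpaceSIP packaging identifier. The header names follow a reading of the 1.3 profile and should be checked against the specification. The function name and the example URL are hypothetical.

```python
import urllib.request

# Packaging identifier often used for DSpace METS packages in SWORD 1.x.
# Assumption: the target repository accepts it (its service document says which it does).
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"


def build_deposit_request(collection_url, package_bytes, filename,
                          packaging=METS_DSPACE_SIP,
                          on_behalf_of=None, dry_run=False):
    """Build (but do not send) an HTTP POST that deposits a zip package.

    Hypothetical helper: a SWORD 1.3 deposit is an AtomPub-style POST of the
    package to a collection URL, with extra headers describing the package.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,
    }
    if on_behalf_of:
        # Mediated deposit: the authenticated user deposits for someone else.
        headers["X-On-Behalf-Of"] = on_behalf_of
    if dry_run:
        # Ask the server to process the request without storing anything.
        headers["X-No-Op"] = "true"
    return urllib.request.Request(collection_url, data=package_bytes,
                                  headers=headers, method="POST")


# Example with a placeholder URL and payload; nothing is sent over the network.
request = build_deposit_request(
    "http://repository.example.org/sword/deposit/123456789/1",
    b"PK", "paper.zip", dry_run=True)
print(request.get_method(), request.full_url)
```

Sending the request would additionally need authentication (typically HTTP Basic) and a call such as `urllib.request.urlopen(request)`. The server's response is an Atom entry describing the deposited item.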

JISC Final Report—CTREP, Cambridge TETRA Repositories Enhancement Project

JISC has released JISC Final Report—CTREP, Cambridge TETRA Repositories Enhancement Project.

Here's an excerpt:

CTREP created a connector between an Institutional VRE and an Institutional Repository. It is designed to be reusable in a number of different institutions where policy on deposit varies by means of a flexible deposit configuration system. In the process of executing the project:

  • the various stakeholders came to understand institutional cultural differences and address them in such a way that recent projects with a strong Repository and research dissemination/visualisation aspect have been more joined up than would previously have been possible
  • we developed an approach to policy expression designed both to avoid creating unnecessary tension within the institution during its development, and also to be authorable by a wide range of individuals
  • we have sought to record and capture lessons learnt (based, in part on case studies) for future institutionalisation projects
  • we developed a number of techniques which allowed apparent barriers to integration to be overcome by technical-architectural tools
  • we open-sourced the integration
  • we modified our approach to metadata/data binding in light of community feedback and developed a spreadsheet-based automated approach with which contributors felt comfortable, but which required a number of technical obstacles to be overcome through the use of creative programming techniques.

Word + SWORD + Ingester = Word to DSpace Deposit

In "Direct from MS Word to DSpace via SWORD," Stuart Lewis describes how to get documents into DSpace from Word via SWORD and a custom DSpace ingester.

Here's an excerpt:

This complete end to end process allows you to create Word templates, and to mark them up with required and optional fields. It also allows you to embed details of the SWORD deposit repository URL (so the users do not need to know what it is) within the template for easy deposit. This could be used for example for a journal editor to provide a template and a deposit location for new paper submissions all-in-one.

Texas Conference on Digital Libraries 2009 Presentations

Presentations from the Texas Conference on Digital Libraries 2009 are now available.

Here are those by Texas Digital Library staff:

DISC-UK DataShare Project: Final Report

JISC has released the DISC-UK DataShare Project: Final Report.

Here's an excerpt:

The DISC-UK DataShare Project was funded from March 2007-March 2009 as part of JISC's Repositories and Preservation programme, Repositories Enhancement strand. It was led by EDINA and Edinburgh University Data Library in partnership with the University of Oxford and the University of Southampton. The project built on the existing informal collaboration of UK data librarians and data managers who formed DISC-UK (Data Information Specialists Committee–UK).

This project has brought together the distinct communities of data support staff in universities and institutional repository managers in order to bridge gaps and exploit the expertise of both to advance the current provision of repository services for accommodating datasets, and thus to explore new pathways to assist academics at our institutions who wish to share their data over the Internet. The project's overall aim was to contribute to new models, workflows and tools for academic data sharing within a complex and dynamic information environment which includes increased emphasis on stewardship of institutional knowledge assets of all types; new technologies to enhance e-Research; new research council policies and mandates; and the growth of the Open Access / Open Data movement.

With three institutions taking part plus the London School of Economics as an associate partner, a range of exemplars have emerged from the establishment of institutional data repositories and related services. Part of the variety in the exemplars is a result of the different repository platforms used by the three project partners: DSpace (Edinburgh DataShare), ePrints (e-Prints Soton) and Fedora (Oxford University Research Archive, ORA)–all open source software. LSE took another route and is using the distributed Dataverse repository network for data, linking to publications in LSE Research Online. Also, different approaches were taken in setting up the repositories. All three institutions had an existing, well-used institutional repository, but two chose to incorporate datasets within the same system as the publications, and one (Edinburgh DataShare) was a paired repository exclusively for datasets, designed to interoperate with the publications repository (Edinburgh Research Archive). The approach took a major turn midway through the project when an apparent solution to the problem of lack of voluntary deposits arose, in the form of the advent of the Data Audit Framework. Edinburgh participated as a partner in the DAF Development project which created the methodology for the framework, and also won a bid to carry out its own DAF Implementation project. Later, the other two partners conducted their own versions of the data audit framework under the auspices of the DataShare project.

A number of scoping activities were carried about by the partners with the goal of informing repository enhancement as well as broader dissemination. These included a State-of-the-Art-Review to determine what had been learned by previous repository projects in the UK that had forayed into the data arena. This resulted in a list of benefits and barriers to deposit of datasets by researchers to inform our outreach activities. A Data Sharing Continuum diagram was developed to illustrate where the projects were aiming to fit into the curation landscape, and the range of curation steps that could be taken, from simple backup to online visualization. Later on, a specialized metadata schema was explored (Data Documentation Initiative or DDI) in terms of how it might be incorporated into repository systems, though repository development in this area was not taken up. Instead, a dataset application profile was developed based on qualified Dublin Core (dcterms). This was implemented in the Edinburgh DataShare repository and adapted by Southampton for their next release. The project wished to explore wider issues with open data and web publishing, and therefore produced two briefing papers to do with data mashups–on numeric data and geospatial data. Finally, the project staff and consultant distilled what it had learned in terms of policy development for data repositories in a training guide. A number of peer reviewed posters, papers, and articles were written by DISC-UK members about various aspects of the project during the period.

Key conclusions were that 1) Data management motivation is a better bottom-up driver for researchers than data sharing but is not sufficient to create culture change, 2) Data librarians, data managers and data scientists can help bridge communication between repository managers & researchers, and 3) IRs can improve impact of sharing data over the internet.

Welsh Repository Network Final Report

JISC has released the Welsh Repository Network Final Report.

Here's an excerpt:

The aim of the Welsh Repository Network (WRN) was to put in place an essential building block for the development of an integrated network of institutional digital repositories in Wales. The project entailed a centrally managed hardware procurement programme designed to provide every HEI in Wales with dedicated and configured repository hardware. In close collaboration with the technical, organisational and operational support specifically provided for Welsh Higher Education Institutions (HEIs) within the JISC funded Repositories Support Project (RSP), also delivered from Aberystwyth University, this initiative provided a cost-effective, collaborative and decisive boost to the repository agenda in Wales and helped JISC achieve the critical mass of populated repositories and digital content that is a stated objective of the Repositories and Preservation Programme.

The project employed a three-stage approach: requirements gathering, procurement and installation, and monitoring and evaluation. Extensive site visits and regular communication with project partners were a fundamental aspect of project activity and a variety of models were used for procuring hardware including collaborative approaches, outsourcing to commercial software and establishing hosting agreements.

At its most practical level the principal deliverable of the WRN project has been the provision of repository hardware capacity in each and every HEI in Wales which, in combination with the hands-on technical support provided by the RSP, enabled all 12 HEIs to have functional institutional repositories by March 2009. More generally, the project has contributed a series of case studies and test sites that provide the wider JISC community with practical insights into the process of matching alternative organisational models, repository types and hardware configurations to different geographical and institutional settings. The main conclusion to be drawn from the WRN is that while providing funds for procuring hardware helps to push repository development up the institutional agenda, the support that goes with the funding, especially the technical support, is a far more crucial factor in generating a successful and lasting outcome.

Read more about it at the project Web site.

“Evaluation of Digital Repository Software at the National Library of Medicine”

Jennifer L. Marill and Edward C. Luczak have published "Evaluation of Digital Repository Software at the National Library of Medicine" in the latest issue of D-Lib Magazine.

Here's an excerpt:

The National Institutes of Health (NIH) National Library of Medicine® (NLM) undertook an 18-month project to evaluate, test and recommend digital repository software and systems to support NLM's collection and preservation of a wide variety of digital objects. This article outlines the methodology NLM used to analyze the landscape of repository software and select three systems for in-depth testing. Finally, the article discusses the evaluation results and next steps for NLM. This project followed an earlier NLM working group, which created functional requirements and identified key policy issues for an NLM digital repository to aid in building NLM's collection in the digital environment.

DSpace and Fedora Commons Merge to Form DuraSpace

DSpace and Fedora Commons have merged to form a new organization, DuraSpace.

Here's an excerpt from the press release:

The joined organization, named "DuraSpace," will sustain and grow its flagship repository platforms – Fedora and DSpace. DuraSpace will also expand its portfolio by offering new technologies and services that respond to the dynamic environment of the Web and to new requirements from existing and future users. DuraSpace will focus on supporting existing communities and will also engage a larger and more diverse group of stakeholders in support of its not-for-profit mission. The organization will be led by an executive team consisting of Sandy Payette (Chief Executive Officer), Michele Kimpton (Chief Business Officer), and Brad McLean (Chief Technology Officer) and will operate out of offices in Ithaca, NY and Cambridge, MA.

"This is a great development," said Clifford Lynch, Executive Director of the Coalition for Networked Information (CNI). "It will focus resources and talent in a way that should really accelerate progress in areas critical to the research, education, and cultural memory communities. The new emphasis on distributed reliable storage infrastructure services and their integration with repositories is particularly timely."

Together Fedora and DSpace make up the largest market share of open repositories worldwide, serving over 700 institutions. These include organizations committed to the use of open source software solutions for the dissemination and preservation of academic, scientific, and cultural digital content.

"The joining of DSpace and Fedora Commons is a watershed event for libraries, specifically, and higher education, more generally," said James Hilton, CIO of the University of Virginia. "Separately, these two organizations operated with similar missions and a shared commitment to developing and supporting open technologies. By bringing together the technical, financial, and community-based resources of the two organizations, their communities gain a robust organization focused on solving the many challenges involved in storing, curating, and preserving digital data and scholarship," he said.

New Products

DuraSpace will continue to support its existing software platforms, DSpace and Fedora, as well as expand its offerings to support the needs of global information communities. The first new technology to emerge will be a Web-based service named "DuraCloud." DuraCloud is a hosted service that takes advantage of the cost efficiencies of cloud storage and cloud computing, while adding value to help ensure longevity and re-use of digital content. The DuraSpace organization is developing partnerships with commercial cloud providers who offer both storage and computing capabilities.

The DuraCloud service will be run by the DuraSpace organization. Its target audiences are organizations responsible for digital preservation and groups creating shared spaces for access and re-use of digital content. DuraCloud will be accessible directly as a Web service and also via plug-ins to digital repositories including Fedora and DSpace. The software developed to support the DuraCloud service will be made available as open source. An early release of DuraCloud will be available for selected pilot partners in Fall 2009.

Key Benefits of the DuraSpace Organization

DuraSpace will support both DSpace and Fedora by working closely with both communities and when possible, develop synergistic technologies, services, and programs that increase interoperability of the two platforms. DuraSpace will also support other open source software projects including the Mulgara semantic store, a scalable RDF database.

DuraSpace is mission-focused. The organization will be associated with its broader mission of working towards developing services and solutions on behalf of diverse communities rather than focusing on single-solution product development. This change in orientation can be characterized as moving beyond the software and toward the mission.

DuraSpace will bring strength and leadership to a larger community and amplify the value brought by each organization individually. With both organizations working in unison, there can be significant economies of scale, synergies in developing open technologies and services, and a strong position for long-term sustainability.

Summary of DSpace Community Network Survey Results

A brief summary of the recent DSpace Community Network Survey results is now available.

Here's an excerpt:

  • Type of institution: 83% of respondents represent academic institutions, 19% research centers, 10% archive/public library, 10% government*
  • Number of items: More than half of the respondents have 2,500 or less items in their repositories, only 15% have 10,000 or more items . . . .
  • Modifications to software: 54% minor cosmetic, 29% new features, 29% significant UI customizations, 23% no changes, 8% core code changes*

*Many of the questions allowed for multiple answers, therefore some of the numbers and percentages represent multiple answers from the same respondent.

DSpace Sites: What Do You Want in Version 1.6?

The DSpace Committers Group is conducting a short survey about desired features in DSpace version 1.6.

Here's an excerpt from "DSpace 1.6: You Decide!":

As you'll have seen from recent emails, the DSpace community has now released version 1.5.2 of the DSpace software. It has many new features, some enhancements to current features, and some bug fixes. Many of you will also know that a small team of developers have been working on DSpace version 2.0 which will bring with it many essential architectural enhancements to ensure that DSpace continues to fulfil the needs of the user community over the coming years. DSpace 2.0 is likely to be released early in 2010.

In the mean time, the DSpace committers have decided to start working on DSpace version 1.6. By moving to 1.6 (rather than 1.5.3) we can add new features that require changes to underlying DSpace database. We can’t tell you just yet what new features will be in version 1.6 because we haven’t decided! And that is where you come in . . .

John Robertson Overviews Digital Repository Software Developments

John Robertson of JISC CETIS has overviewed developments in major digital repository software in his "Repository Software Update" post.

Here's an excerpt:

Over the past couple of months I've had a chance to hear updates from a number of repository software developers (at a Fedora training day, at DEV8D and on a number of blogs). Albeit slightly delayed by holidays, here's a bit of a snapshot of where ePrints, DSpace, Fedora, Microsoft's repository are at. There's a lot more information about Fedora than the others as I've heard a couple of updates from them. The usual caveat that I may have misunderstood what some of these are or how developed they are should apply. Much of this development is building up to releases at Open Repositories 2009.

DSpace 1.5.2 Stable Released

DSpace 1.5.2 Stable has been released.

Here's an excerpt from the announcement:

This release is primarily a bug fix release incorporating numerous bug fixes and enhancements.

We want to highlight the following additions:

  • SWORD module/version 1.3.1 supporting the sword standard version 1.3
  • cocoon upgraded to 2.2
  • fix for the UTF-8 issues with the XMLUI
  • new authentication methods: Hierarchical LDAP and Shibboleth
  • full update translations: German, Italian for both XMLUI and JSPUI and Ukrainian for JSPUI
  • new translations for 1.5.x: Greek and Thai
  • graceful resolver for urn in the item page for the JSPUI

“Repository Software Survey, March 2009”

The Repositories Support Project has released the "Repository Software Survey, March 2009," which analyzes the CONTENTdm, Digital Commons, DigiTool, DSpace, EPrints, EQUELLA, Fedora, intraLibrary, Research-Output Repository Platform, Open Repository, and VITAL digital repository systems.

DSpace Statistics Add-on Version 2.1 Released

The RepositóriUM team at Minho University has released version 2.1 of the DSpace Statistics Add-on.

Here's an excerpt from the announcement:

The Statistics System is an add-on to the DSpace platform that allows gathering, processing and presenting usage, content and administrative statistics. Despite the fact that its development was done to meet the specific needs of RepositóriUM, the system is completely adjustable to other environments as its components can easily be configured, changed or extended, to respond to different information needs.

With the release of the current version 2.1 of the Stats System the main focus was solving some architectural issues of version 2.0, primarily:

  • Adapting the add-on to the new build and deploy mechanism of DSpace 1.5.1;
  • New mechanism for gathering the events on DSpace, avoiding DSpace logging mechanism and log4j JDBC Appender (replaced by Mark H. Wood UsageEvent plug-in);
  • New mechanism for aggregating the stats. Ported from pl/pgsql to Java;
  • Eliminate the pl/java language as a requirement;
  • Improvements on spider detection mechanism.

UT Dallas Launches Institutional Repository Using Texas Digital Library

The McDermott Library at the University of Texas at Dallas has launched its institutional repository, Treasures @ UT Dallas, using the Texas Digital Library's DSpace system.

Read more about it at "Tools Helping Library Share its Wealth of Material."

Free: All About Repositories Webinar Series

The DSpace Foundation, the Fedora Commons, Sun Microsystems, and SPARC are offering a free All About Repositories Webinar Series.

Here's an excerpt from the press release (see it for a list of the first webinars):

Got a repository? Would you like to understand more about what repositories are and how they operate? This spring DSpace Foundation, Fedora Commons, Sun Microsystems and SPARC (The Scholarly Publishing & Academic Resources Coalition) will offer a free About Repositories Webinar Series to provide professional learning opportunities for repository managers, developers, curators and decision makers. The seminar series will kick off on Feb. 18 at 9:00 a.m. PT with DSpace and Fedora: A Collaboration Update presented by Michele Kimpton, Executive Director, DSpace Foundation, and Sandy Payette, Executive Director Fedora Commons.

Each month a new topic or issue of interest to repository communities will be presented in a one-hour online format. All About Repositories Webinar Series will be web cast for synchronous event access, and will also be made available through DSpace, Fedora, Sun and SPARC web sites as an open educational reference for repository users and developers.

Future web seminars will focus on topics such as web services, and will take an in-depth look at some of the top implementations from the Innovation Fair held at the November 2008 SPARC repositories meeting. . . .

Pre-registration is required for all seminars at http://www.education-webevents.com/.

Load Testing DSpace

In "DSpace at a Third of a Million Items," Stuart Lewis reports on a load test of DSpace.

Here's an excerpt:

  • On average deposits into an empty repository took about one and a half seconds.
  • On average deposits into a repository with three hundred thousand items took about seven seconds.
  • If this linear looking relationship between number of deposits and speed of deposit were to continue at the same rate, an average deposit into a repository containing one million items would take about 19 to 20 seconds.
  • Extrapolate this to work out throughput per day, and that is about 10MB deposited every 20 seconds, 30MB per minute, or 43GB of data per day.
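The arithmetic behind these figures can be reproduced with a short Python sketch. It assumes, as the excerpt does, a linear relationship between repository size and deposit time, a deposit size of about 10MB, and decimal units (1GB = 1,000MB). The function and variable names are illustrative and do not come from the original post.

```python
def linear_extrapolate(x0, y0, x1, y1, x):
    """Extend the straight line through (x0, y0) and (x1, y1) to the point x."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Measured points from the excerpt: about 1.5 s into an empty repository,
# about 7 s into a repository holding 300,000 items.
seconds_at_one_million = linear_extrapolate(0, 1.5, 300_000, 7.0, 1_000_000)

# Throughput, rounding the deposit time up to 20 s as the excerpt does.
deposit_size_mb = 10
seconds_per_deposit = 20
mb_per_minute = deposit_size_mb * 60 / seconds_per_deposit
gb_per_day = mb_per_minute * 60 * 24 / 1000

print(f"{seconds_at_one_million:.1f} s per deposit at one million items")
print(f"{mb_per_minute:.0f} MB per minute, {gb_per_day:.1f} GB per day")
```

The extrapolated deposit time comes out just under 20 seconds, which matches the "19 to 20 seconds" estimate. The daily figure is 43.2GB, which the excerpt rounds to 43GB.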

"Institutional Repository on a Shoestring"

George Wrenn, Carolyn J. Mueller, and Jeremy Shellhase have published "Institutional Repository on a Shoestring" in the new D-Lib Magazine issue.

Here's an excerpt:

Humboldt State University (HSU), with 7,800 students (fall 2008), is one of the smaller campuses in the 23-member California State University (CSU) system. Our institutional repository, Humboldt Digital Scholar (HDS), originated as a pilot project during the Library's August 2004 planning meeting and became a permanent Library service in April 2006. The repository functions "on a shoestring," unfunded and reliant on contributions of time from librarians and library staff for its ongoing maintenance and development.

In this article, the authors, three members of the HDS Steering Committee, describe the process of setting up and managing a digital repository: hardware and software selection; customizations; gaining campus support; developing collections; accepting submissions; and planning for the future, including participation in a system-wide effort to create a shared repository for the CSU.

Interview Podcasts from the Coalition for Networked Information's Fall 2008 Task Force Meeting

Gerry Bayne has made available podcast interviews with selected participants at the Coalition for Networked Information's Fall 2008 Task Force Meeting.

Here are three podcasts of special interest:

Open Journal Systems SWORD Plugin

The Australian Partnership for Sustainable Repositories has released a SWORD plugin for Open Journal Systems, which was developed by Scott Yeadon and Leo Monus. The plugin requires "a significant amount of patching to DSpace," and it is recommended that testing be done with Fedora. A new version will be released next year that may eliminate the need for DSpace patching.

Grant Awarded: DSpace Foundation and Fedora Commons for DuraSpace Planning

The DSpace Foundation and Fedora Commons have received a grant from the Andrew W. Mellon Foundation to support planning for DuraSpace.

Here's an excerpt from the press release:

Over the next six months funding from the planning grant will allow the organizations to jointly specify and design "DuraSpace," a new web-based service that will allow institutions to easily distribute content to multiple storage providers, both "cloud-based" and institution-based. The idea behind DuraSpace is to provide a trusted, value-added service layer to augment the capabilities of generic storage providers by making stored digital content more durable, manageable, accessible and sharable.

Michele Kimpton, Executive Director of the DSpace Foundation, said, "Together we can leverage our expertise and open source value proposition to continue to provide integrated open solutions that support the scholarly mission of universities."

Sandy Payette, Executive Director of Fedora Commons, observes, "There is an important role for high-tech non-profit organizations in adding value to emerging cloud solutions. DuraSpace is designed with an eye towards enabling universities, libraries, and other types of organizations to take advantage of cloud storage while also addressing special requirements unique to areas such as digital archiving and scholarly communication."

The grant from the Mellon Foundation will support a needs analysis, focus groups, technical design sessions, and meetings with potential commercial partners. A working web-based demonstration will be completed during the six-month grant period to help validate the technical and business assumptions behind DuraSpace.

CARL DSpace Users Reluctant to Upgrade to 1.5

The Relog Experiment reports that, at a Canadian Association of Research Libraries meeting on institutional repositories at Access 2008, most attending libraries that used DSpace were reluctant to upgrade to 1.5 and were not using Manakin.

Here's an excerpt that explains their reservations:

  • customizations made to DSpace 1.4 will take a lot of programming time to move over to 1.5
  • certain plug-ins and enhancements that are in heavy use in 1.4 have not yet been made available for 1.5
  • administrators are evaluating other platforms and are not willing to invest the time in upgrading to 1.5 if they end up switching platforms
  • programmers are hard to find, train and retain