Towards Repository Preservation Services. Final Report from the JISC Preserv 2 Project
Steve Hitchcock, David Tarrant, and Les Carr have self-archived "Towards Repository Preservation Services. Final Report from the JISC Preserv 2 Project" in the ECS EPrints Repository.
Here's the abstract:
Preserv 2 investigated the preservation of data in digital institutional repositories, focussing in particular on managing storage, data and file formats. Preserv 2 developed the first repository storage controller, which will be a feature of EPrints version 3.2 software (due 2009). Plugin applications that use the controller have been written for Amazon S3 and Sun cloud services among others, as well as for local disk storage. In a breakthrough application Preserv 2 used OAI-ORE to show how data can be moved between two repository softwares with quite distinct data models, from an EPrints repository to a Fedora repository. The largest area of work in Preserv 2 was on file format management and an 'active' preservation approach. This involves identifying file formats, assessing the risks posed by those formats and taking action to obviate the risks where that could be justified. These processes were implemented with reference to a technical registry, PRONOM from The National Archives (TNA), and DROID (digital record object identification service), also produced by TNA. Preserv 2 showed we can invoke a current registry to classify the digital objects and present a hierarchy of risk scores for a repository. Classification was performed using the Preserv2 EPrints preservation toolkit. This 'wraps' DROID in an EPrints repository environment. This toolkit will be another feature available for EPrints v3.2 software. The result of file format identification can indicate a file is at risk of becoming inaccessible or corrupted. Preserv 2 developed a repository interface to present formats by risk category. Providing risk scores through the live PRONOM service was shown to be feasible. Spin-off work is ongoing to develop format risk scores by compiling data from multiple sources in a new linked data registry.