The Digital Preservation Coalition and the National Library of Australia’s PADI program have published the 16th issue of What’s New in Digital Preservation.
Here’s an excerpt from the padi-forum announcement:
Issue 16 features news from a range of organisations and initiatives, including the Digital Preservation Coalition (DPC), Digital Curation Centre (DCC), JISC (UK), The National Archives (UK), DigitalPreservationEurope, nestor, the Koninklijke Bibliotheek (National Library of the Netherlands), the US National Digital Information Infrastructure and Preservation Program (NDIIPP), and the PLANETS and CASPAR projects.
The Koninklijke Bibliotheek and the Nationaal Archief have released Dioscuri 0.2.0, an open-source, Java-based emulator for Intel 8086-based computers. Dioscuri can run 16-bit operating systems, such as MS-DOS, and applications, such as WordPerfect 5.1.
The National Archives of Australia has released Xena 4.0, its open-source digital preservation software.
Here's a brief description of its capabilities from the project homepage:
Xena software aids digital preservation by performing two important tasks:
- Detecting the file formats of digital objects
- Converting digital objects into open formats for preservation
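Format detection of the kind Xena performs is commonly done by checking a file's leading "magic" bytes rather than trusting its extension. The sketch below illustrates that general technique only; it is not Xena's actual implementation, and the signature table is a small illustrative sample:

```python
# Minimal magic-byte format detection. This illustrates the general
# technique only; it is not Xena's actual implementation.
MAGIC_SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
    b"PK\x03\x04": "application/zip",  # also ODF/OOXML containers
}

def detect_format(data: bytes) -> str:
    """Return a MIME type guessed from the file's leading bytes."""
    for magic, mime in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return mime
    return "application/octet-stream"  # unknown format
```

A detector like this is the usual first step before the second task the list mentions, converting the identified format into an open preservation format.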
Fran Berman, director of the San Diego Supercomputer Center, and Brian Lavoie, a research scientist at OCLC, have been named co-chairs of a Blue Ribbon Task Force on Sustainable Digital Preservation and Access, which is being funded by the National Science Foundation and the Andrew W. Mellon Foundation. The Library of Congress, the National Archives and Records Administration, the Council on Library and Information Resources, and JISC will also be involved in the task force.
Here's an excerpt from the press release:
Berman and co-chair Brian Lavoie . . . will convene an international group of prominent leaders to develop actionable recommendations on economic sustainability of digital information for the science and engineering, cultural heritage, academic, public, and private sectors. The Task Force is expected to meet over the next two years and gather testimony from a broad set of thought leaders in preparation for the Task Force’s Final Report. . . .
The Task Force will bring together a group of national and international leaders who will focus attention on this critical grand challenge of the Information Age. Task Force members will represent a cross-section of fields and disciplines including information and computer sciences, economics, entertainment, library and archival sciences, government, and business. Over the next two years, the Task Force will convene a broad set of international experts from the academic, public and private sectors who will participate in quarterly panels and discussions. . . .
In its final report, the Task Force is charged with developing a comprehensive analysis of current issues, and actionable recommendations for the future to catalyze the development of sustainable resource strategies for the reliable preservation of digital information. During its tenure, the Task Force also will produce a series of articles about the challenges and opportunities of digital information preservation, for both the scholarly community and the public.
CLIR seeks comments on Preservation in the Age of Large-Scale Digitization by Oya Rieger. The deadline is 10/5/07.
The Institute for Public Policy Research has released MP3 files of the presentations at its "Preservation, Access and Inclusion: Balancing Opportunities in a Digital Age" seminar.
In his "Decommissioning Repositories" posting, EPrints guru Leslie Carr grapples with the issue of what to do with repositories that have served their purpose and that no one wants to maintain.
Here's an excerpt:
But now the party's over, there is no more funding, and none of the partner institutions has offered to keep the repository going in perpetuity. Not even the hosting institution or the ex-manager wants to keep their repositories going. We know that even if we don't turn them off their hosting hardware will fail in a few years. That sounds like very bad news because a repository is supposed to be forever! Was it irresponsible to create these repositories in the first place? Should it be forbidden to create a public repository whose life is guaranteed to be less than a decade? Or perhaps that should be factored into the original policy-making—"this repository and all its contents are guaranteed up to 31st December 2017 but not after." If that were machine readable then the community could have decided whether they want to mirror the collection, or selected bits of it.
Source: Carr, Leslie. "Decommissioning Repositories." RepositoryMan, 10 September 2007.
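A machine-readable guarantee of the kind Carr imagines could be as simple as a published end-of-support date that harvesters check before deciding whether to mirror. The sketch below is purely hypothetical; the policy fields are invented for illustration and are not part of any real standard:

```python
from datetime import date

# Hypothetical machine-readable repository policy. The field names
# are invented for illustration, not part of any real standard.
policy = {
    "repository": "example-project-repo",
    "guaranteed_until": "2017-12-31",
}

def needs_mirroring(policy: dict, horizon_days: int = 365) -> bool:
    """True if the repository's guarantee expires within the horizon."""
    expiry = date.fromisoformat(policy["guaranteed_until"])
    return (expiry - date.today()).days < horizon_days
```

With such a field published, a community harvester could automatically flag collections whose guarantee is about to lapse.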
LIFE (Life Cycle Information for E-Literature) is a joint JISC-funded project of University College London Library Services and the British Library that is investigating life cycle issues involved in collecting and preserving digital materials.
Here's an excerpt from the home page:
The LIFE Project has developed a methodology to model the digital lifecycle and calculate the costs of preserving digital information for the next 5, 10 or 100 years. For the first time, organisations can apply this process and plan effectively for the preservation of their digital collections.
Currently the LIFE Project is in its second phase ("LIFE2"), an 18-month project running from March 2007 to August 2008.
Documentation from the first and second phases of the project is available.
The project has just established a weblog.
Laura Edwards has made available a spreadsheet that summarizes the perpetual digital access policies of publishers. A wiki version should be up shortly.
The APSR AONS II project has released a beta version of the Automatic Obsolescence Notification System (AONS).
Here's an excerpt from the announcement on apsr_announcements:
Users can register with the service by providing a URL to a repository's format scan summary. The AONS service will display the summary and allow a repository manager to compare the formats of items in their repository with information from format registries such as PRONOM and Library of Congress. These registries flag any formats that are likely to become obsolete. Repository managers can then make curation decisions about any items at risk, such as upgrading their formats.
By downloading and installing AONS locally, an institution can also take advantage of a pilot risk metrics implementation. . . .
The AONS software is the result of the AONS II project funded under APSR and developed by David Pearson, David Levy and Matthew Walker from the National Library of Australia (NLA) with an administrative user interface developed by David Berriman at ANU.
The software can be downloaded from SourceForge at http://sourceforge.net/projects/aons and a mailing list is also available for support and feedback. As this is a beta release we welcome feedback to the SourceForge mailing list to inform our testing which will continue until mid-September.
Please try out the pilot service by sending an email to firstname.lastname@example.org to register with the service, and tell us which institution you are from. . . .
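The comparison AONS performs can be pictured as matching a repository's format inventory against registry risk flags. The toy sketch below conveys the idea only; the registry entries are invented for illustration and are not real PRONOM or Library of Congress data:

```python
# Toy obsolescence check in the spirit of AONS: compare a repository's
# format inventory against registry risk flags. The registry entries
# below are invented for illustration, not real PRONOM data.
RISK_REGISTRY = {
    "WordPerfect 5.1": "at-risk",
    "PDF/A-1": "low-risk",
    "TIFF 6.0": "low-risk",
}

def items_at_risk(format_counts: dict) -> dict:
    """Return {format: item count} for formats flagged as at-risk."""
    return {
        fmt: count
        for fmt, count in format_counts.items()
        if RISK_REGISTRY.get(fmt, "unknown") == "at-risk"
    }
```

For example, a repository holding 12 WordPerfect 5.1 items and 300 TIFF 6.0 items would see only the WordPerfect items flagged, prompting a curation decision such as format migration.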
Portico is launching an e-book preservation study, which will last the rest of the year.
Here's an excerpt from the press release:
In response to several requests from publishers and libraries, Portico is conducting a study in order to assess how to extend its archival infrastructure and service to respond to the emerging need to preserve e-books. During the study we will analyze the structure and preservation needs of e-books and determine what adjustments to Portico's existing, operational and technological infrastructure and the economic model developed to support e-journal preservation might be required in order to respond to this new genre. Portico's e-journal archiving service was developed through a pilot project that drew heavily upon engagement with publisher and library pilot participants. We anticipate that a similar process will be essential in understanding how best to respond to the challenges of e-book preservation. . . .
The current participants in the E-Book Preservation study include:
- American Mathematical Society
- Morgan & Claypool
- Taylor and Francis
- Case Western Reserve University
- Cornell University Library
- McGill University
- University of Texas Libraries
- University College London
- Yale University Library
The German National Library and SUB Göttingen have announced the official release of the kopal Library for Retrieval and Ingest on diglib.
Here's an excerpt from the message:
The kopal project (Co-operative Development of a Long-term Digital Information Archive) was dedicated to finding a solution that provides not only bitstream preservation but long-term accessibility as well, in the form of a cooperatively developed and operated long-term archive for digital data. The German National Library, the Goettingen State and University Library, the Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen, and IBM Germany have been working in close cooperation on a technological solution. The now-released software tools mark the successful development of such an archiving solution.
The open-source software koLibRI is a framework for integrating a long-term preservation system, such as the IBM Digital Information Archiving System (DIAS), into the infrastructure of any institution. In particular, koLibRI organizes the creation and import of Archival Information Packages into DIAS, and offers functions to retrieve and manage them. Preservation methods such as data customization and migration are part of the tasks of long-term preservation, and koLibRI Version 1.0 provides modules that manage future migration procedures. Version 1.0 is completely functional and stable. Nevertheless, as new partners are connected to the existing long-term preservation system, the software will be continually adjusted to the needs of different partners.
Documentation published with this release describes the installation and configuration of a functional koLibRI system, as well as its basic internal layout, to make individual development possible. The release is offered for free download. . . .
The Storage Networking Industry Association has released the 100 Year Archive Requirements Survey. Access requires registration.
Here's an excerpt from the "Survey Highlights":
- 80% of respondents declared they have information they must keep over 50 years and 68% of respondents said they must keep it over 100 years. . . .
- Long-term generally means greater than 10 to 15 years—the period beyond which multiple migrations take place and information is at risk. . .
- Database information (structured data) was considered to be most at risk of loss. . .
- Over 40% of respondents are keeping e-Mail records over 10 years. . . .
- Physical migration is a big problem. Only 30% declared they were doing it correctly at 3-5 year intervals. . . .
- 60% of respondents say they are ‘highly dissatisfied’ that they will be able to read their retained information in 50 years. . .
- Help is needed—current practices are too manual, too prone to error, too costly and lack adequate coordination across the organization. . . .
The Netherlands National Commission for UNESCO and the European Commission on Preservation and Access have published Preserving the Digital Heritage: Principles and Policies.
Here's an excerpt from the "Preface":
In November 2005, the Netherlands National Commission for UNESCO, in collaboration with the Koninklijke Bibliotheek (National Library of the Netherlands) and UNESCO’s Information Society Division, organized a conference entitled Preserving the Digital Heritage (The Hague, The Netherlands, 4-5 November 2005). It focused on two important issues: the selection of material to be preserved, and the division of tasks and responsibilities between institutions. This publication contains the four speeches given by the keynote speakers, preceded by a synthesis report of the conference.
The National Library of New Zealand has released version 3.2 of its open-source Metadata Extraction Tool.
Written in Java and XML, the Metadata Extraction Tool has a Windows interface, and it runs under UNIX in command line mode. Batch processing is supported.
Here’s an excerpt from the project home page:
The Tool builds on the Library’s work on digital preservation, and its logical preservation metadata schema. It is designed to:
- automatically extract preservation-related metadata from digital files
- output that metadata in a standard format (XML) for use in preservation activities. . . .
The Metadata Extraction Tool includes a number of ‘adapters’ that extract metadata from specific file types. Extractors are currently provided for:
- Images: BMP, GIF, JPEG and TIFF.
- Office documents: MS Word (version 2, 6), Word Perfect, Open Office (version 1), MS Works, MS Excel, MS PowerPoint, and PDF.
- Audio and Video: WAV and MP3.
- Markup languages: HTML and XML.
If a file type is unknown the tool applies a generic adapter, which extracts data that the host system ‘knows’ about any given file (such as size, filename, and date created).
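The adapter pattern described here, type-specific extractors with a generic fallback, can be sketched as follows. This is a simplified illustration of the dispatch idea, not the tool's actual Java code, and the stub adapters are invented:

```python
import os
import time

def extract_generic(path: str) -> dict:
    """Fallback adapter: metadata the host system knows about any file."""
    st = os.stat(path)
    return {
        "filename": os.path.basename(path),
        "size": st.st_size,
        "created": time.ctime(st.st_ctime),
    }

# Type-specific adapters would parse format internals; these are
# illustrative stubs keyed by file extension.
ADAPTERS = {
    ".xml": lambda p: {**extract_generic(p), "type": "markup"},
    ".wav": lambda p: {**extract_generic(p), "type": "audio"},
}

def extract(path: str) -> dict:
    """Dispatch to a type-specific adapter, else the generic one."""
    ext = os.path.splitext(path)[1].lower()
    return ADAPTERS.get(ext, extract_generic)(path)
```

The key design point mirrors the tool's behaviour: every file yields at least the generic system-level metadata, and known types layer format-specific fields on top.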
The Collections Council of Australia Ltd. has released Australian Framework and Action Plan for Digital Heritage Collections, Version 0.C3 for comment.
Here's an excerpt from the document:
This is the Collections Council of Australia's plan to prepare an Australian framework for digital heritage collections. It brings together information shared by people working in archives, galleries, libraries and museums at a Summit on Digital Collections held in 2006. It proposes an Action Plan to address issues shared by the Australian collections sector in relation to current and future management of digital heritage collections.
A podcast of Chris Rusbridge’s "Curation of Scientific Data: Challenges for Institutions and their Repositories" presentation at The Adaptable Repository conference is now available. Rusbridge is Director of the Digital Curation Centre in the UK.
The PowerPoint for the presentation is also available.
The Australian Partnership for Sustainable Repositories (APSR) has released Report of the Sustainability Guidelines for Australian Repositories Project (SUGAR).
Here’s an excerpt from the report:
The Sustainability Guidelines for Australian Repositories service (SUGAR) was intended to support people working in tertiary education institutions whose activities do not focus on digital preservation. The target community creates and digitises content for a range of purposes to support learning, teaching and research. While some have access to technical and administrative support, many others may not be aware of what they need to know. The typical SUGAR user may have little interest in discussions surrounding metadata, interoperability or digital preservation, and may simply want to know the essential steps involved in achieving the task at hand.
A key challenge for SUGAR was to provide a suitable level and amount of information to meet the immediate focus of the user and their level of expertise while introducing and encouraging consideration of issues of digital sustainability. SUGAR was also intended to stand alone as an online service unsupported by a helpdesk.
The UNESCO Memory of the World Programme, with the support of the Australian Partnership for Sustainable Repositories, has published Towards an Open Source Repository and Preservation System: Recommendations on the Implementation of an Open Source Digital Archival and Preservation System and on Related Software Development.
Here’s an excerpt from the Executive Summary and Recommendations:
This report defines the requirements for a digital archival and preservation system using standard hardware and describes a set of open source software which could be used to implement it. There are two aspects of this report that distinguish it from other approaches. One is the complete or holistic approach to digital preservation. The report recognises that a functioning preservation system must consider all aspects of a digital repository: Ingest, Access, Administration, Data Management, Preservation Planning and Archival Storage, including storage media and management software. Secondly, the report argues that, for simple digital objects, the solution to digital preservation is relatively well understood, and that what is needed are affordable tools, technology and training in using those systems.
An assumption of the report is that there is no ultimate, permanent storage media, nor will there be in the foreseeable future. It is instead necessary to design systems to manage the inevitable change from system to system. The aim and emphasis in digital preservation is to build sustainable systems rather than permanent carriers. . . .
The way open source communities, providers and distributors achieve their aims provides a model of how a sustainable archival system might work, be sustained, be upgraded and be developed as required. Similarly, many cultural institutions, archives and higher education institutions are participating in open source software communities to influence the direction of development of that software to their advantage, and ultimately to the advantage of the whole sector.
A fundamental finding of this report is that a simple, sustainable system that provides strategies to manage all the identified functions for digital preservation is necessary. It also finds that for simple discrete digital objects this is nearly possible. This report recommends that UNESCO supports the aggregation and development of an open source archival system, building on, and drawing together existing open source programs.
This report also recommends that UNESCO participates through its various committees, in open source software development on behalf of the countries, communities, and cultural institutions, who would benefit from a simple, yet sustainable, digital archival and preservation system. . . .
Library Journal Academic Newswire reports that the University of Maine, the Toronto Public Library, and the Cincinnati Public Library will follow Emory University’s lead and digitize public domain works utilizing Kirtas scanners with print-on-demand copies being made available via BookSurge. (Also see the press release: "BookSurge, an Amazon Group, and Kirtas Collaborate to Preserve and Distribute Historic Archival Books.")
Source: "University of Maine, plus Toronto and Cincinnati Public Libraries Join Emory in Scan Alternative." Library Journal Academic Newswire, 21 June 2007.
Emory University’s Woodruff Library will use a Kirtas robotic book scanner to digitize rare books and to create PDF files that will be made available on the Internet and sold as print-on-demand books on Amazon.
Here’s an excerpt from the press release:
"We believe that mass digitization and print-on-demand publishing is an important new model for digital scholarship that is going to revolutionize the management of academic materials," said Martin Halbert, director for digital programs and systems at Emory’s Woodruff Library. "Information will no longer be lost in the mists of time when books go out of print. This is a way of opening up the past to the future."
Emory’s Woodruff Library is one of the premier research libraries in the United States, with extensive holdings in the humanities, including many rare and special collections. To increase accessibility to these aging materials, and ensure their preservation, the university purchased a Kirtas robotic book scanner, which can digitize as many as 50 books per day, transforming the pages from each volume into an Adobe Portable Document Format (PDF). The PDF files will be uploaded to a Web site where scholars can access them. If a scholar wishes to order a bound, printed copy of a digitized book, they can go to Amazon.com and order the book online.
Emory will receive compensation from the sale of digitized copies, although Halbert stressed that the print-on-demand feature is not intended to generate a profit, but simply help the library recoup some of its costs in making out-of-print materials available.
The Library of Congress’ Network Development and MARC Standards Office has released Implementing the PREMIS Data Dictionary: A Survey of Approaches.
Here is an excerpt from the report’s preface:
The Preservation Metadata: Implementation Strategies (PREMIS) Working Group developed the Data Dictionary for Preservation Metadata, which is a specification containing a set of "core" preservation metadata elements that has broad applicability within the digital preservation community. The PREMIS Data Dictionary (PDD) was released in May 2005 along with a set of XML schemas to support its implementation. Since that time, institutions have begun to implement preservation metadata by providing content for semantic units expressed in the data dictionary or comparing it with planned or existing systems for long-term preservation. . . .
The Library of Congress, as part of the PREMIS maintenance activity, commissioned Deborah Woodyard-Robinson to provide this study to explore how institutions have implemented the PREMIS semantic units. . . . In this study sixteen repositories have been surveyed about their interpretation and application of the PDD, with an analysis then made on how the PREMIS core fits with the functions of a preservation repository and which PDD semantic units will be most relevant to certain types of repositories.
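PREMIS semantic units are grouped under entities such as Object and Event, and "providing content for semantic units" means recording values like an object's identifier, format, and fixity. As a rough illustration, the sketch below assembles a minimal PREMIS-flavoured record as XML; the element names are simplified for illustration and do not follow the exact PREMIS schema:

```python
import xml.etree.ElementTree as ET

# Sketch of a PREMIS-flavoured object record. Element names are
# simplified for illustration and do not follow the exact PREMIS
# XML schema.
def make_object_record(identifier: str, fmt: str, sha1: str) -> str:
    """Build a minimal object record with identifier, format, fixity."""
    obj = ET.Element("object")
    ET.SubElement(obj, "objectIdentifierValue").text = identifier
    ET.SubElement(obj, "formatName").text = fmt
    fixity = ET.SubElement(obj, "fixity")
    ET.SubElement(fixity, "messageDigestAlgorithm").text = "SHA-1"
    ET.SubElement(fixity, "messageDigest").text = sha1
    return ET.tostring(obj, encoding="unicode")
```

A repository implementing the PDD would populate records like this at ingest, which is exactly the kind of practice the survey compares across the sixteen repositories.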
The Preservation and Reformatting Section (PARS) of the Association for Library Collections & Technical Services (ALCTS) has started the Defining Digital Preservation Weblog to get feedback on the efforts of a working group that has the following charge: "to draft a definition for digital preservation that would be suitable for the needs of PARS and available to support the work of ALCTS and ALA, for use on the web, in policy statements, and other documents."
The Cairo Project has released Cairo Tools Survey: A Survey of Tools Applicable to the Preparation of Digital Archives for Ingest into a Preservation Repository. It has also released a related report, Cairo Use Cases: A Survey of User Scenarios Applicable to the Cairo Ingest Tool.
Here’s a description of the Cairo Project from its home page:
Cairo will develop a tool for ingesting complex collections of born-digital materials, with basic descriptive, preservation and relationship metadata, into a preservation repository. The project is based on needs identified by the JISC-funded Paradigm project and the Wellcome Library’s Digital Curation in Action project. It is a key building block in the partner institutions’ strategy to develop digital repository architectures which can support the development of digital collections over the long-term.
The REMAP Project at the University of Hull has been funded by JISC to investigate how records management and digital preservation functions can best be supported in digital repositories. It utilizes the Fedora system.
Here’s an excerpt from the Project Aims page (I have added the links in this excerpt):
The REMAP project has the following aims:
- To develop Records Management and Digital Preservation (RMDP) workflow(s) in order to understand how a digital repository can support these activities
- To embed digital repository interaction within working practices for RMDP purposes
- To further develop the use of a WSBPEL orchestration tool to work with external Web services, including the PRONOM Web services, to provide appropriate metadata and file information for RMDP
- To develop and test a notification layer that can interact with the orchestration tool and allow RSS syndication to individuals alerting them to RMDP tasks
- To develop and test an intermediate persistence layer to underpin the notification layer and interact with the WSBPEL orchestration tool to allow orchestrated workflows to take place over time
- To test and validate the use of the enhanced WSBPEL tool with institutional staff involved in RMDP activities
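An RSS notification of the kind REMAP proposes, alerting staff to pending RMDP tasks, might look like the following sketch. The feed title, task fields, and structure are invented for illustration and are not REMAP's actual design:

```python
import xml.etree.ElementTree as ET

# Sketch of an RSS 2.0 feed alerting staff to pending records-management
# and digital preservation tasks. Titles and fields are invented for
# illustration, not REMAP's actual design.
def tasks_to_rss(tasks: list) -> str:
    """Render a list of {'title', 'description'} dicts as an RSS feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "RMDP task alerts"
    for task in tasks:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = task["title"]
        ET.SubElement(item, "description").text = task["description"]
    return ET.tostring(rss, encoding="unicode")
```

Staff subscribing to such a feed in an ordinary RSS reader would be alerted when the orchestration layer queues a task, which is the interaction the notification-layer aim describes.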