PDF Beats Microforms for Long-Term Document Storage

An AIIM report, Content Creation and Delivery—The On-Ramps and Off-Ramps of ECM, indicates that PDF has surpassed microforms (microfilm and fiche) for long-term document storage.

Here's an excerpt from the press release:

Recent AIIM research found that 90% of organizations are using the PDF file format for long-term storage of scanned documents, and 89% are converting Office files to PDF for distribution and archive. Not surprisingly, paper is currently used by 100% of organizations, but when asked to predict the situation in 5 years time, use of paper for long-term storage dropped to 77%, whereas PDF rose to 93%. . . .

Time-honored storage on microfilm or fiche is still used by 43% of organizations, but this is expected to drop to 28% over the next five years. At the other end of the media spectrum, 34% of organizations are archiving digital video, rising to a projected 47% in 5 years. Digital audio archiving will rise from 30% to 37%.

Larry Carver Named Digital Preservation Pioneer

The National Digital Information Infrastructure and Preservation Program at the Library of Congress has named Larry Carver, retired Director of Library Technologies and Digital Initiatives at University of California at Santa Barbara, as a digital preservation pioneer.

Here's an excerpt from the UCSB press release:

"We at the UCSB Library are thrilled that Larry Carver has received this important and well-deserved recognition," said Brenda Johnson, university librarian. "His tireless and innovative work in the development of the Map and Imagery Lab and the Alexandria Digital Library has brought international attention to our library and has benefited thousands of scholars, students, and members of the public from around the world. We offer him our heartiest congratulations on being named a Library of Congress 'Pioneer of Digital Preservation.'" . . .

Carver began his career at the library where he helped build an impressive collection of maps, aerial photography, and satellite imagery that led to the development of the Map and Imagery Laboratory (MIL) in 1979. As the MIL collections grew, Carver felt that geospatial data presented a unique challenge to the library. He believed that coordinate-based collections should be managed differently than book-based collections. But not everyone agreed with him.

"It became apparent that handling traditional geospatial content in a typical library context was just not satisfactory and another means to control that data was important," he said. "It wasn't as easy as it sounds. I was in a very conservative environment, and they were not easily convinced that this was something a library should do."

Carver and others spent years developing an exhaustive set of requirements for building a geospatial information management system. The system had a number of innovative ideas. "We included traditional methods of handling metadata but also wanted to search by location on the Earth's surface," Carver said. "The idea was that if you point to a place on the Earth you could ask the question, 'What information do you have about that space?,' as opposed to a traditional way of having to know ahead of time who wrote about it."

An opportunity to develop that system arrived in 1994 when UCSB received funding from the National Science Foundation for Carver and his team to build the Alexandria Digital Library. "We produced the first operational digital library that was based on our research," Carver said. "Our concentration was to be able to develop a system that could search millions of records with latitude and longitude coordinates and present those results via the Internet."
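The coordinate-based search Carver describes can be illustrated with a minimal sketch. It assumes each record carries a single latitude/longitude point and that a query is a rectangular bounding box. The Record class, the search_bbox function, and the sample records are hypothetical and are not taken from the Alexandria Digital Library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical catalog record located by one point on the Earth's surface."""
    title: str
    lat: float  # degrees north, -90 to 90
    lon: float  # degrees east, -180 to 180


def search_bbox(records, south, west, north, east):
    """Return the records whose point lies inside the box (edges inclusive).

    Assumes the box does not cross the antimeridian (west <= east).
    """
    return [
        r for r in records
        if south <= r.lat <= north and west <= r.lon <= east
    ]


records = [
    Record("Aerial photo, Santa Barbara coast", 34.42, -119.70),
    Record("Satellite image, Los Angeles basin", 34.05, -118.24),
    Record("Topographic map, Lake Tahoe", 39.10, -120.03),
]

# "What information do you have about that space?"
hits = search_bbox(records, south=34.0, west=-120.0, north=35.0, east=-119.0)
print([r.title for r in hits])
```

A linear scan like this would be too slow for millions of records; a production system would presumably rely on a spatial index, which this sketch leaves out.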

The basic concepts behind the Alexandria Digital Library have been widely adopted by Google Earth, Wikipedia, and others. Carver couldn't be more delighted.

"I think it's wonderful," Carver said. "We weren't trying to be the only game in town. We were just trying to raise consciousness way back in the early 1980s that this was a viable way of handling geospatial material. This approach lets people interact with data in a realistic way without having a great deal of knowledge about an individual object. It was a new way of dealing with massive amounts of information in an environment that made finding and accessing information much easier."

Read more about it at "Digital Preservation Pioneer: Larry Carver."

Springer Digital Publications to be Archived in CLOCKSS

Springer Science+Business Media has announced that its digital publications will be archived in the dark CLOCKSS archive.

Here's an excerpt from the press release:

The CLOCKSS archive allows research libraries and scholarly publishers, who launched CLOCKSS as a pilot program, to preserve and store its electronic content. Once ingested, the econtent is kept safe and secure in a dark archive until it is triggered and the CLOCKSS Board determines that the content should be copied from the archive and made freely available to all, regardless of prior subscription. Due to the success of the pilot program, the founding members unanimously agreed to incorporate and invite others to participate in CLOCKSS.

Participating CLOCKSS libraries and publishers govern the archive themselves via three tiers of governance—an executive board, a board of directors, and an advisory council. Research libraries working alongside publishers like Springer are able to help shape policy and practice in their communities.

"In a great show of confidence, Springer has joined the CLOCKSS initiatives, putting its complete trust in an archive they helped build," says Gordon Tibbitts, Co-Chair of CLOCKSS. "Springer is helping to shoulder the responsibility, alongside its publishing peers and research library customers, of keeping their scholarly assets safe and protected for future generations of scholars." . . .

In addition to storing Springer’s journal content with CLOCKSS, the publisher has submitted a proposal to the CLOCKSS Board outlining a pilot project to test the feasibility and legal issues surrounding preservation of eBook content. Because eBook contracts differ from journal contracts, Springer can only deposit eBook files when its authors' rights are protected.

CLOCKSS is a joint venture between the world’s leading scholarly publishers and research libraries. Its mission is to build a sustainable, geographically distributed dark archive with which to ensure the long-term survival of Web-based scholarly publications for the benefit of the greater global research community. Governing Libraries include the Australian National University, EDINA at the University of Edinburgh, Indiana University, New York Public Library, OCLC Online Computer Library Center, Rice University, Stanford University, the University of Alberta, the University of Hong Kong and the University of Virginia. Governing Publishers include the American Medical Association, the American Physiological Society, bepress, Elsevier, IOP Publishing, Nature Publishing Group, Oxford University Press, SAGE Publications, Springer, Taylor & Francis and Wiley-Blackwell.

JISC-PoWR Releases Preservation of Web Resources Handbook

JISC-PoWR has released the Preservation of Web Resources Handbook.

Here's an excerpt:

The Handbook is structured in two parts. The first part deals with web resources and makes practical suggestions for their management, capture, selection, appraisal and preservation. It includes observations on web content management systems, and a list of available tools for performing web capture. It concludes with a discussion of Web 2.0 issues, and a range of related case studies. The second part is more focussed on web resources within an Institution. It offers advice about institutional drivers and policies for web archiving, along with suggestions for effecting a change within an organisation; one such approach is the adoption of Information Lifecycle Management. There are separate Appendices covering Legal guidance (written by Jordan Hatcher) and records management.

The Handbook also contains a bibliography and a glossary of terms. The Handbook is aimed at an audience of information managers, asset managers, webmasters, IT specialists, system administrators, records managers, and archivists.

First Digital Curation Centre SCARP Case Study Released on Brain Image Preservation

The first Digital Curation Centre SCARP (Sharing Curation and Re-use Preservation) case study has been released: Curating Brain Images in a Psychiatric Research Group: Infrastructure and Preservation Issues.

Here's the description:

Curating neuroimaging research data for sharing and re-use involves practical challenges for those concerned in its use and preservation. These are exemplified in a case study of the Neuroimaging Group in the University of Edinburgh’s Division of Psychiatry. The study is one of the SCARP series encompassing two aims; firstly to discover more about disciplinary approaches and attitudes to digital curation through 'immersion' in selected cases, in this case drawing on ethnographic field study. Secondly SCARP aims to apply known good practice, and where possible to identify new lessons from practice in the selected discipline areas; in this case using action research to assess risks to the long term reusability of datasets, and identify challenges and opportunities for change.

Database Preservation: The International Challenge and the Swiss Solution

DigitalPreservationEurope has released Database Preservation: The International Challenge and the Swiss Solution.

Here's the abstract:

Most administrative records are stored in databases. Today’s challenge is preserving the information and making it accessible for years to come, ensuring knowledge-transfer as well as administrative sustainability. Lack of standardization has hitherto rendered the task of archiving database content highly complex. The Swiss Federal Archives have developed a new XML based format which permits long-term preservation of the relational databases content. The Software-Independent Archiving of Relational Databases (short: SIARD) offers a unique solution for preserving data content, metadata as well as the relations in an ISO conform format.
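The general idea of software-independent archiving, writing table content and column metadata to plain XML, can be sketched as follows. This is only an illustration under stated assumptions: the element names (table, columns, column, rows, row, cell) and the table_to_xml function are hypothetical and do not follow the actual SIARD schema, and the sketch does not capture relations between tables.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Dump one table to an XML element tree (hypothetical layout, not SIARD)."""
    # The table name is interpolated directly, so it must come from a trusted source.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]

    root = ET.Element("table", name=table)

    # Metadata: record the column names alongside the data.
    cols_el = ET.SubElement(root, "columns")
    for c in columns:
        ET.SubElement(cols_el, "column", name=c)

    # Content: one <row> per record, one <cell> per column value.
    rows_el = ET.SubElement(root, "rows")
    for row in cur:
        row_el = ET.SubElement(rows_el, "row")
        for c, v in zip(columns, row):
            cell = ET.SubElement(row_el, "cell", column=c)
            cell.text = "" if v is None else str(v)
    return root


# Small in-memory database used only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE office (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO office VALUES (?, ?)", [(1, "Registry"), (2, "Archives")])

root = table_to_xml(conn, "office")
print(ET.tostring(root, encoding="unicode"))
```

The resulting text file can be read without the original database software, which is the property the abstract emphasizes.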

Grant Awarded: DSpace Foundation and Fedora Commons for DuraSpace Planning

The DSpace Foundation and Fedora Commons have received a grant from the Andrew W. Mellon Foundation to support planning for DuraSpace.

Here's an excerpt from the press release:

Over the next six months funding from the planning grant will allow the organizations to jointly specify and design "DuraSpace," a new web-based service that will allow institutions to easily distribute content to multiple storage providers, both "cloud-based" and institution-based. The idea behind DuraSpace is to provide a trusted, value-added service layer to augment the capabilities of generic storage providers by making stored digital content more durable, manageable, accessible and sharable.

Michele Kimpton, Executive Director of the DSpace Foundation, said, "Together we can leverage our expertise and open source value proposition to continue to provide integrated open solutions that support the scholarly mission of universities."

Sandy Payette, Executive Director of Fedora Commons, observes, "There is an important role for high-tech non-profit organizations in adding value to emerging cloud solutions. DuraSpace is designed with an eye towards enabling universities, libraries, and other types of organizations to take advantage of cloud storage while also addressing special requirements unique to areas such as digital archiving and scholarly communication."

The grant from the Mellon Foundation will support a needs analysis, focus groups, technical design sessions, and meetings with potential commercial partners. A working web-based demonstration will be completed during the six-month grant period to help validate the technical and business assumptions behind DuraSpace.

Digital Preservation: Two-Year JHOVE2 Project Funded

The National Digital Information Infrastructure and Preservation Program has funded the two-year JHOVE2 project, which will "develop a next-generation JHOVE2 architecture for format-aware characterization." Project participants are the California Digital Library, Portico, and Stanford University.

Here's an excerpt from the Digipres announcement:

Among the enhancements planned for JHOVE2 are:

  • Support for four specific aspects of characterization: signature-based identification, feature extraction, validation, and rules-based assessment
  • A more sophisticated data model supporting complex multi-file objects and arbitrarily-nested container objects
  • Streamlined APIs to facilitate the integration of JHOVE2 technology in systems, services, and workflows
  • Increased performance
  • Standardized error handling
  • A generic plug-in mechanism supporting stateful multi-module processing
  • Availability under the BSD open source license
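Of the characterization aspects listed above, signature-based identification is the simplest to illustrate: a file's format is guessed from the well-known "magic" bytes at its start. The sketch below is an assumption-laden illustration in Python, not JHOVE2's API or design; the SIGNATURES table and the identify function are hypothetical names, and only a handful of formats are covered.

```python
# Leading byte signatures for a few common formats (illustrative subset only).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def identify(data: bytes) -> str:
    """Return a MIME type based on the first bytes of the content."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    # No signature matched: report a generic binary type.
    return "application/octet-stream"


print(identify(b"%PDF-1.4\n%..."))
print(identify(b"GIF89a\x01\x00\x01\x00"))
print(identify(b"plain text with no signature"))
```

Identification only suggests what a file claims to be; checking that it actually conforms to the format is the separate validation step named in the same list.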

To help focus project activities we have recruited a distinguished advisory board to represent the interests of the larger stakeholder community. The board includes participants from the following international memory institutions, projects, and vendors:

  • Deutsche Nationalbibliothek (DNB)
  • Ex Libris
  • Fedora Commons
  • Florida Center for Library Automation (FCLA)
  • Harvard University / GDFR
  • Koninklijke Bibliotheek (KB)
  • MIT/DSpace
  • National Archives (TNA)
  • National Archives and Records Administration (NARA)
  • National Library of Australia (NLA)
  • National Library of New Zealand (NLNZ)
  • Planets project

The project partners are currently engaged in a public needs assessment and requirements gathering phase. A provisional set of use cases and functional requirements has already been reviewed by the JHOVE2 advisory board. . . .

The functional requirements, along with other project information, is available on the JHOVE2 project wiki. Feedback on project goals and deliverables can be submitted through the JHOVE2 public mailing lists.

Ex Libris Digital Preservation System Live at the National Library of New Zealand

After completing a successful beta test, the National Library of New Zealand has started using the Ex Libris Digital Preservation System in production mode. (Thanks to Library Technology Guides.)

Here's an excerpt from the press release:

Based on the Open Archival Information System (OAIS) model and conforming to trusted digital repository (TDR) requirements, the Ex Libris Digital Preservation System provides institutions with the infrastructure and technology needed to preserve and facilitate access to the collections under their guardianship.

The understanding that preservation and access belong together—that they are not mutually exclusive entities—dictated a design in which preservation support is built directly into the platform rather than serving as an add-on feature. This end-to-end solution offers full security, auditing, replication, and integrity checks that maintain the safety of collections over time, while persistent identifier tools and standard APIs (Application Programming Interface) enable institutions to make their collections easily accessible to users.

The National Library of New Zealand is using the highly configurable and scalable Digital Preservation System to collect a range of digital material types from a wide variety of sources (such as publishers, government agencies, and Web sites in the New Zealand domain); to review, validate, and organize such materials; and to make them available to end users in accordance with user access rights. Risk analysis and conversion tools enable the system to provide meaningful access to the digital objects over time. The integration of the system with other National Library of New Zealand applications is facilitated by a built-in software development kit and the suite of APIs.

December 2008 will see the general release of the Digital Preservation System by Ex Libris Group.

JISC Digital Preservation Policies Study

JISC has released a two-part study of digital preservation policies: Digital Preservation Policies Study and Digital Preservation Policies Study, Part 2: Appendices—Mappings of Core University Strategies and Analysis of Their Links to Digital Preservation.

Here's an excerpt:

This JISC funded study aims to provide an outline model for digital preservation policies and to analyse the role that digital preservation can play in supporting and delivering key strategies for Higher and Further Education Institutions. Although focussing on the UK Higher and Further Education sectors, the study draws widely on policy and implementations from other sectors and countries and will be of interest to those wishing to develop policy and justify investment in digital preservation within a wide range of institutions. We have created two tools in this study: 1) a model/framework for digital preservation policy and implementation clauses based on examination of existing digital preservation policies; 2) a series of mappings of digital preservation links to other key institutional strategies in UK universities and colleges. Our aim has been to help institutions and their staff develop appropriate digital preservation policies and clauses set in the context of broader institutional strategies.

Presentations from Reinventing Science Librarianship: Models for the Future

Presentations (usually digital audio and PowerPoint slides) about data curation, e-science, virtual organizations and other topics from the ARL/CNI Fall Forum on Reinventing Science Librarianship: Models for the Future are now available.

Speakers included Sayeed Choudhury, Ron Larsen, Liz Lyon, Richard Luce, and others.

Long-Term Preservation: Results from a Survey Investigating Preservation Strategies amongst ALPSP Publisher Members

The Association of Learned and Professional Society Publishers has released Long-Term Preservation: Results from a Survey Investigating Preservation Strategies amongst ALPSP Publisher Members.

Here's an excerpt from the press release:

  • The majority of ALPSP publishers who responded to the survey believe long-term preservation to be a critical issue: 91% either agreed or strongly agreed with the statement "Long-term preservation is an issue which urgently needs to be addressed within the industry." 9% were neutral; no-one disagreed.
  • ALPSP publishers are strongly motivated to engage with preservation because of its critical importance to their customers, with over 90% of respondents citing this as a major motivating factor: a heartening response for those in the library community.
  • Although 68% of publishers reported understanding of preservation issues within their organisation to be either 'good' or 'reasonable', the survey also revealed a wide range of concerns suggesting an overall lack of confidence, at least for the present. The survey revealed a strong desire amongst almost all publishers for the development of 'best practice' and industry standards.
  • There is some confusion surrounding the nature and extent of publisher participation in long-term preservation schemes, with high numbers of respondents declaring their organisation to be participating in one or more initiatives and yet the schemes themselves reporting substantially lower numbers presently taking part.
  • Publisher views on who should take responsibility for long-term preservation also reveal some interesting contradictions: despite presently supporting a range of preservation schemes, a significant majority of publishers indicated they would in fact prefer other groups and institutions to take this responsibility on. National libraries in particular were a popular choice.
  • Finally, the survey revealed most publishers are clear about the distinction between ensuring long-term access and ensuring long-term preservation, with the majority believing they have clear responsibility for long-term access. A worryingly high number however admit to either not trusting their present strategy or not currently having any strategy to deliver here.

Open Access to and Reuse of Research Data—The State of the Art in Finland

The Finnish Social Science Data Archive has published Open Access to and Reuse of Research Data—The State of the Art in Finland.

Here's an excerpt:

In 2006, the Ministry of Education in Finland allocated resources to the Finnish Social Science Data Archive (FSD) to chart national and international practices related to open access to research data. Consequently, the FSD carried out an online survey targeting professors of human sciences, social sciences and behavioural sciences in Finnish universities. Some respondents were senior staff at research institutes. The respondents were asked about the state and use of data collected in their department/institute. Almost half of the respondents considered the preservation and use of digital research data to be relevant to their department. The number of respondents (150) is large enough to warrant statistical analysis even though response rate was low at 28%.

Funded: Towards Interoperable Preservation Repositories (TIPR): A Demonstration Project

The Florida Center for Library Automation has received a $392,649 grant (matching amount: $392,764) from the Institute of Museum and Library Services for a two-year project titled "Towards Interoperable Preservation Repositories (TIPR): A Demonstration Project." The Cornell University Library and the New York University Libraries are FCLA's grant partners.

Here's an excerpt from the announcement:

Practical repository-to-repository transfer requires agreed-upon transfer protocols, enhancements to repository software applications, and a common standards-based transfer format capable of transporting rich preservation metadata and associated digital objects. Building on prior work, this project will define a transfer format, modify three different open source repository applications to import and export information packages in this format, and test a carefully developed set of use cases to verify the usability and flexibility of the format.

DCC Methodology for Designing and Evaluating Curation and Preservation Experiments V1.0

The Digital Curation Centre has released DCC Methodology for Designing and Evaluating Curation and Preservation Experiments V1.0.

Here's an excerpt:

The purpose of this document is to describe a Digital Curation Centre (DCC) testbed methodology which will serve as a workflow framework for designing experiments to validate the effectiveness of curation and preservation strategies. The methodology is grounded in the following general principles: the methodology must

  • conform to the fundamental standards of a scientific methodology,
  • be easy to follow and implement, i.e. accommodate experimenters of all levels of technical expertise,
  • be general enough to accommodate future changes and the evolution of ideas in curation and preservation theory and practice,
  • be specific enough to provide concrete guidance in the immediate short term,
  • be sufficiently flexible and extensible to allow for technological advances and the evolving complexity of available resources

Repositories Support Project Launches RSP Blog Directory

The Repositories Support Project has launched the RSP Blog Directory.

Here's an excerpt from the announcement:

It provides a list of recommended and informative blogs regarding the repository scene from around the globe. Listed blogs include personal creations from those with first hand experience of repository management and/or technical development of repository software; blogs for specific repositories, projects and software developers; as well as blogs for groups and societies with an interest in the open access movement and digital curation.

Presentations/Reports from the JISC/CNI Meeting on Transforming the User Experience

Presentations are available from the JISC/CNI meeting on Transforming the User Experience.

Here's a selection:

Helen Aguera, Senior Program Officer at the National Endowment for the Humanities, has also reported on the conference in a series of Weblog postings: