Digital Preservation: SiteStory Released

Herbert van de Sompel has announced the release of SiteStory.

Here's an excerpt:

I am very pleased to announce the open source release of our SiteStory transactional web archiving solution. The solution is compatible with the Memento "Time Travel for the Web" framework and its current implementation can be used to archive Apache web servers.

Read more about it at Memento: Adding Time to the Web.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Minimum Digitization Capture Recommendations (Draft)

The Association for Library Collections and Technical Services Preservation and Reformatting Section has released a draft of its Minimum Digitization Capture Recommendations. The comment period ends on 12/31/2012.

Here's an excerpt:

This document was created as a guideline for libraries digitizing content with the objective of producing a product that will not be re-digitized at a later point. Institutions can feel secure that if an item has been digitized at, or above, these specifications, they can depend on it to continue to be viable in the future. These guidelines only speak to the technical specifications of the digitized content itself and not to the larger issue of digitally preserving said content. In some cases, institutions may want to request a digital copy to preserve themselves further safeguarding materials by preserving them in multiple locations.

| Digital Curation Resource Guide | Digital Scholarship |

Aligning National Approaches to Digital Preservation

The Educopia Institute has released Aligning National Approaches to Digital Preservation.

Here's an excerpt:

On May 23-25, 2011, more than 125 delegates from more than 20 countries gathered in Tallinn, Estonia, for the "Aligning National Approaches to Digital Preservation" conference. . . .

This publication contains a collection of peer-reviewed essays that were developed by conference panels and attendees in the months following ANADP. Rather than simply chronicling the event, the volume intends to broaden and deepen its impact by reflecting on the ANADP presentations and conversations and establishing a set of starting points for building a greater alignment across digital preservation initiatives. Above all, it highlights the need for strategic international collaborations to support the preservation of our collective cultural memory.

"De-Mystifying the Data Management Requirements of Research Funders"

Dianne Dietrich, Trisha Adamus, Alison Miner, and Gail Steinhart have published "De-Mystifying the Data Management Requirements of Research Funders" in the latest issue of Issues in Science and Technology Librarianship.

Here's an excerpt:

Research libraries have sought to apply their information management expertise to the management of digital research data. This focus has been spurred in part by the policies of two major funding agencies in the United States, which require grant recipients make research outputs, including publications and research data, openly available. As many academic libraries are beginning to offer or are already offering assistance in writing and implementing data management plans, it is important to consider how best to support researchers. Our research examined the current data management requirements of major US funding agencies to better understand data management requirements facing researchers and the implications for libraries offering data management services for researchers.

| Research Data Curation Bibliography | Digital Scholarship |

Testing Software Tools of Potential Interest for Digital Preservation Activities at the National Library of Australia

The National Library of Australia has released Testing Software Tools of Potential Interest for Digital Preservation Activities at the National Library of Australia.

Here's an excerpt:

Four file format identification tools were tested: File Investigator Engine, Outside-In File ID, FIDO and file/libmagic. This represents a mix of commercial and open source tools. The results were analysed from the point of view of comparing the tools to determine the extent of coverage and the level of agreement between them.

Five metadata extraction tools were tested: File Investigator Engine, Exiftool, MediaInfo, pdfinfo and Apache Tika. The results were analysed in terms of the number and range of metadata items extracted for specific file subsets.
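To illustrate what comparing identification tools for "coverage" and "agreement" can look like, here is a minimal Python sketch. It is not the National Library of Australia's test harness and does not call any of the tools named in the excerpt. The signature table, the function names, and the `"unknown"` label are all assumptions made for this example.

```python
# Hypothetical, deliberately tiny signature table (an assumption for this
# sketch; real tools such as file/libmagic use far larger rule sets).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> str:
    """Label a byte string by its leading bytes, or return 'unknown'."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def coverage(labels: list[str]) -> float:
    """Fraction of files for which a tool returned something other than 'unknown'."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label != "unknown") / len(labels)


def agreement_rate(results: dict[str, list[str]]) -> float:
    """Fraction of files on which every tool reported the same label.

    `results` maps a tool name to its per-file labels, in the same file order.
    """
    per_file = list(zip(*results.values()))
    if not per_file:
        return 0.0
    agreed = sum(1 for labels in per_file if len(set(labels)) == 1)
    return agreed / len(per_file)
```

In this sketch, two tools that both return "unknown" for a file count as agreeing; a real comparison might want to treat that case separately.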

Digital Curation Resource Guide

Digital Scholarship has released the Digital Curation Resource Guide.

This resource guide presents over 200 selected English-language websites and documents that are useful in understanding and conducting digital curation. It covers academic programs, discussion lists and groups, glossaries, file formats and guidelines, metadata standards and vocabularies, models, organizations, policies, research data management, serials and blogs, services and vendor software, software and tools, and training. It is available under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

The Digital Curation Resource Guide complements the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, which was released in June.

It is also available as an EPUB file (see How to Read EPUB Files).

Managing Research Data in Big Science

Norman Gray, Tobia Carozzi, and Graham Woan have self-archived Managing Research Data in Big Science in arXiv.org.

Here's an excerpt:

The project which led to this report was funded by JISC in 2010-2011 as part of its 'Managing Research Data' programme, to examine the way in which Big Science data is managed, and produce any recommendations which may be appropriate. . . .

This project has explored these differences using as a case-study Gravitational Wave data generated by the LSC [LIGO Scientific Collaboration], and has produced recommendations intended to be useful variously to JISC, the funding council (STFC) and the LSC community.

In Sect. 1 we define what we mean by 'big science', describe the overall data culture there, laying stress on how it necessarily or contingently differs from other disciplines.

In Sect. 2 we discuss the benefits of a formal data-preservation strategy, and the cases for open data and for well-preserved data that follow from that. . . .

In Sect. 3 we briefly discuss the LIGO data management plan, and pull together whatever information is available on the estimation of digital preservation costs.

"A Study of Faculty Data Curation Behaviors and Attitudes at a Teaching-Centered University"

Jeanine Marie Scaramozzino, Marisa L. Ramírez, and Karen J. McGaughey have published "A Study of Faculty Data Curation Behaviors and Attitudes at a Teaching-Centered University" in the latest issue of College & Research Libraries.

Here's an excerpt:

This paper describes information gathered from a survey distributed to the College of Science and Mathematics faculty at California Polytechnic State University, San Luis Obispo (Cal Poly), a master's-granting, teaching-centered institution. There was a more than 60 percent response rate to the survey. The survey results provided insight into the science researchers' data curation awareness, behaviors, and attitudes, as well as what needs they exhibited for services and education regarding maintenance and management of data. It is important that professional librarians understand what researchers both inside and outside their own institutions know so that they can collaborate with their university colleagues to examine data curation needs.

Research Data Management: "Improving University Research Value: A Case Study"

Kelley O'Reilly, Jeffrey Johnson, and Georgiann Sanborn have published "Improving University Research Value: A Case Study" in SAGE Open.

Here's an excerpt:

This article investigates the current data management practices of university researchers at an Intermountain West land-grant research university in the United States. Key findings suggest that researchers are primarily focused on the collection and housing of research data. However, additional research value exists within the other life cycle stages for research data—specifically in the stages of delivery and maintenance. These stages are where most new demands and requirements exist for data management plans and policies that are conditional for external grant funding; therefore, these findings expose a "gap" in current research practice. These findings should be of interest to academics and practitioners alike as findings highlight key management gaps in the life cycle of research data. This study also suggests a course of action for academic institutions to coalesce campus-wide assets to assist researchers in improving research value.

Unified Digital Format Registry (UDFR) Final Report

The UC Curation Center of the California Digital Library has released the Unified Digital Format Registry (UDFR) Final Report.

Here's an excerpt:

A deep understanding of digital formats is necessary to support the long-term preservation of digital assets, as it facilitates the preservation of the information content of those assets, rather than just their bit stream representations. A format is the set of syntactic and semantic rules that govern the mapping between information and the bits that represent that information. The Unified Digital Format Registry (UDFR), http://udfr.org/, is a new open source, semantically-enabled platform for the collection, long-term management, and dissemination of the significant properties of formats of interest to the preservation community[4]. The UDFR builds upon and "unifies" the function and holdings of two existing registry solutions: PRONOM, http://www.nationalarchives.gov.uk/PRONOM, from the UK National Archives since 2002; and GDFR (Global Digital Format Registry), http://gdfr.info/, from Harvard University since 2006. While these services rely on older relational and XML database technology, the UDFR uses a semantic database in which all information is represented in RDF form and exposed as Linked Data. Use of the UDFR is open to the public, although contribution or editing of information requires prior self-service account

EPUB Version of Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works

Digital Scholarship has released an open access EPUB version of the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works.

EPUB is the International Digital Publishing Forum's format standard for digital books. EPUB files can be read using free e-book reader software, such as Adobe Digital Editions and the Apple iBooks app (download e-book with Safari) as well as e-book readers, such as the Barnes & Noble Nook readers. See the EPUB Wikipedia page for more details and reader options.

Here's an excerpt from the original announcement of the book:

In a rapidly changing technological environment, the difficult task of ensuring long-term access to digital information is increasingly important. This selective bibliography presents over 650 English-language articles, books, and technical reports that are useful in understanding digital curation and preservation. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns.

Most sources have been published from 2000 through 2011; however, a limited number of key sources published prior to 2000 are also included. The bibliography includes links to freely available versions of included works, such as e-prints and open access articles.

The Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works is available as a paperback (98 pages, $9.95, ISBN 1477497692 and ISBN-13: 9781477497692) and an open access PDF file. All versions of the bibliography are available under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

| Reviews of Digital Scholarship Publications | Digital Scholarship |

The Preservation of Complex Objects. Volume 1, Visualisations and Simulations

The POCOS project has released The Preservation of Complex Objects. Volume 1, Visualisations and Simulations.

Here's an excerpt:

Let us say that there is an implication that an atomic digital object is a single file, and that this is synonymous with the notion of simplicity. But is that really the case? A single PDF file is often put forward as an exemplar of such a straightforward file, but the recent PDF 2.0 version can contain embedded 3D objects, so can it really be considered as atomic and 'simple'? So it might be a somewhat daunting task to rigidly categorize digital material past, present and future as either atomic or complex? During the symposia, the POCOS strategy was not to seek to impose definitions or standards on the proceedings, but rather to see whether any consensus emerged during the talks and breakout sessions

Survey Report on Digitisation in European Cultural Heritage Institutions 2012

The ENUMERATE project has released Survey Report on Digitisation in European Cultural Heritage Institutions 2012.

Here's an excerpt:

The ENUMERATE Survey Report on Digitisation in Cultural Heritage Institutions 2012 represents the first major study into the current state of digitisation in Europe. It is the result of a survey carried out by the ENUMERATE Thematic Network, with the help of national coordinators, in 29 European countries. About 2000 institutions answered the open call to participate between January and March 2012.

EPUB for Archival Preservation

The Koninklijke Bibliotheek has released EPUB for Archival Preservation.

Here's an excerpt:

Over the last few years, the EPUB format has become increasingly popular in the consumer market. A number of publishers have indicated their wish to use EPUB for supplying their electronic publications to the KB. In response to this, the KB's Departments of Collection and Collection Care requested an initial study to investigate the suitability of the format for archival preservation. The main questions were:

  • What are the main characteristics of EPUB?
  • What functionality does EPUB provide, and is this sufficient for representing e.g. content with sophisticated layout and typography requirements?
  • How well is the EPUB supported by software tools that are used in (pre-)ingest workflows?
  • How suitable is EPUB for archival preservation? What are the main risks?
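As a rough illustration of the kind of check a (pre-)ingest workflow might run, the Python sketch below tests two container-level requirements of the EPUB format: the ZIP archive's first entry must be an uncompressed file named `mimetype` containing `application/epub+zip`, and the archive must include `META-INF/container.xml`. This is not the KB's method and is no substitute for a full validator. The function name and the wording of the problem messages are assumptions made for this example.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_epub_container(source) -> list[str]:
    """Return a list of container-level problems; an empty list means none found.

    `source` may be a file path or a binary file-like object.
    Only the ZIP container is inspected, not the publication's content.
    """
    problems = []
    try:
        with zipfile.ZipFile(source) as zf:
            infos = zf.infolist()
            if not infos or infos[0].filename != "mimetype":
                problems.append("first entry is not 'mimetype'")
            else:
                if infos[0].compress_type != zipfile.ZIP_STORED:
                    problems.append("'mimetype' entry is compressed")
                if zf.read(infos[0]) != EPUB_MIMETYPE:
                    problems.append("unexpected 'mimetype' content")
            if "META-INF/container.xml" not in zf.namelist():
                problems.append("missing META-INF/container.xml")
    except zipfile.BadZipFile:
        problems.append("not a ZIP file")
    return problems
```

A file that passes these checks can still be an invalid EPUB; the sketch only catches files that fail at the container level before deeper validation is attempted.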

Research Data Management: Review of DCC Tools and Guidance

The REDm-MED Project has released the Review of DCC Tools and Guidance.

Here's an excerpt:

In the course of its work, the REDm-MED Project has used various tools and guidance produced by the DCC, most notably CARDIO and DMP Online, the latter in both its checklist and software forms. The Project team found CARDIO to be promising but in need of further development before being used widely. The process of setting up a DMP Online template was relatively straightforward, but unfortunately there was no opportunity to solicit feedback from researchers on using it in the context of the tool.

"Practical Limits to the Scope of Digital Preservation"

Mike Kastellec has published "Practical Limits to the Scope of Digital Preservation" in the latest issue of Information Technology and Libraries.

Here's an excerpt:

This paper examines factors that limit the ability of institutions to digitally preserve the cultural heritage of the modern era. The author takes a wide-ranging approach to shed light on limitations to the scope of digital preservation. The author finds that technological limitations to digital preservation have been addressed but still exist, and that non-technical aspects—access, selection, law, and finances—move into the foreground as technological limitations recede. The author proposes a nested model of constraints to the scope of digital preservation and concludes that costs are digital preservation’s most pervasive limitation.

Repositories for Visual Arts Research Data: Kaptur Technical Report

The KAPTUR project has released the Kaptur Technical Report.

Here's an excerpt:

This report is framed around the research question: which technical system is most suitable for managing visual arts research data? . . . .

The Technical Manager selected 17 systems to compare with the user requirement document (Appendix B). Five of the systems had similar scores so these were short-listed. The Technical Manager created an online form into which the Project Officers entered priority scores for each of the user requirements in order to calculate a more accurate score for each of the five short-listed systems (Appendix C) and this resulted in the choice of EPrints as the software for the KAPTUR project.

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works

Digital Scholarship has released the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works.

In a rapidly changing technological environment, the difficult task of ensuring long-term access to digital information is increasingly important. This selective bibliography presents over 650 English-language articles, books, and technical reports that are useful in understanding digital curation and preservation. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns.

Most sources have been published from 2000 through 2011; however, a limited number of key sources published prior to 2000 are also included. The bibliography includes links to freely available versions of included works, such as e-prints and open access articles.

The Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works is available as a paperback (98 pages, $9.95, ISBN 1477497692 and ISBN-13: 9781477497692) and an open access PDF file. All versions of the bibliography are available under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

| Digital Scholarship's Digital/Print Books | Digital Scholarship |

DCEP Final Report; Centuries of Knowledge: Graduate School of Library and Information Science Data Curation Education Program

Melissa H. Cragin et al. have self-archived the DCEP Final Report; Centuries of Knowledge: Graduate School of Library and Information Science Data Curation Education Program in IDEALS.

Here's an excerpt:

The Centuries of Knowledge grant was designed to increase educational and research capacity in data curation at the Graduate School of Library and Information Science (GSLIS) at the University of Illinois at Urbana-Champaign. We developed the Data Curation Education Program, a specialization within our Master of Science degree program, graduating 38 students to date. New courses developed for the specialization include Foundations of Data Curation, a survey course on the emerging field, and Digital Preservation. We developed the Summer Institute on Data Curation for practicing information professionals, facilitating the development of a community of practice across U.S. and Canadian academic and research organizations. Our outreach and service activities have led to a range of new partnerships that have resulted in student fieldwork opportunities, as well as new collaborative research and education activities resulting in 4 successful grant proposals.

| Digital Curation and Preservation Bibliography 2010 | Digital Scholarship |

Persistent Digital Archives and Library System: Final Project Report to the Library of Congress, April 19, 2012

The PeDALS project has released Persistent Digital Archives and Library System: Final Project Report to the Library of Congress, April 19, 2012.

Here's an excerpt:

The Persistent Digital Archives and Library System (PeDALS) research project was funded by the Library of Congress' National Digital Information Infrastructure and Preservation Program as part of its Preserving State Government Information initiative. The project explored the development of a curatorial rationale to support an automated workflow to process collections of digital publications and records, specifically using Microsoft BizTalk Server middleware to manage the collections and rules-based processes for their ingest. PeDALS also examined the practicality of Stanford University's LOCKSS, or Lots of Copies Keeps Stuff Safe, storage networks as an effective and inexpensive method of distributed preservation. In addition to those technical goals, PeDALS worked at building a community of shared practice among its partner states in the hopes that shared software development and best practices would foster a system that could be applied to a variety of repositories.

"REDDNET and Digital Preservation in the Open Cloud: Research at Texas Tech University Libraries on Long-Term Archival Storage"

James Brewer, Tracy Popp, and Joy Perrin have published "REDDNET and Digital Preservation in the Open Cloud: Research at Texas Tech University Libraries on Long-Term Archival Storage" in the latest issue of the Journal of Digital Information.

Here's an excerpt:

In open cloud systems users can develop their own software and data management, control access, and purchase their own hardware while running securely in the cloud environment. . . . It is in this context that REDDnet (Research and Education Data Depot network) is presented as the place where the Texas Tech University (TTU) Libraries have been conducting research on long-term digital archival storage. The REDDnet network by year's end will be at 1.2 petabytes (PB) with an additional 1.4 PB for a related project. . . additionally there are over 200 TB of tape storage. These numbers exclude any disk space which TTU will be purchasing during the year. National Science Foundation (NSF) funding covering REDDnet and CMS-HI was in excess of $850,000 with $850,000 earmarked toward REDDnet. In the terminology we used above, REDDnet is an open cloud system that invited TTU Libraries to participate. This means that we run software which fits the REDDnet structure. We are beginning to complete the final design of our system, and starting to move into the first stages of construction. And we have made a decision to move forward and purchase one-half petabyte of disk storage in the initial phase. The concerns, deliberations and testing are presented here along with our initial approach.

| Digital Curation and Preservation Bibliography 2010: "If you're looking for a reading list that will keep you busy from now until the end of time, this is your one-stop shop for all things digital preservation." — "Digital Preservation Reading List," Preservation Services at Dartmouth College weblog, February 21, 2012. | Digital Scholarship |

Report on Peer Review of Digital Repositories

The Alliance for Permanent Access to the Records of Science Network has released the Report on Peer Review of Digital Repositories.

Here's an excerpt:

This document reports on the work which has been undertaken in support of the European Framework for Audit and Certification of Digital Repositories which was initiated by the European Commission's unit which funds APARSEN. . . .

The main part of this report provides details of the test audits which were carried out, the problems encountered and the lessons learned. The European repositories were the Deutsche Nationalbibliothek (DNB), Koninklijke Nederlandse Akademie van Wetenschappen Data Archiving and Networked Services (DANS), UK Data Archive (UKDA), Centre Informatique National de l'Enseignement Supérieur: Département Archivage et Diffusion (CINES-DAD) and in addition, in the USA, the Socioeconomic Data and Applications Center (SEDAC) at the Center for Earth Science Information, the National Space Science Data Center (NSSDC) and the Kentucky Department for Libraries and Archives (KDLA).

| Institutional Repository and ETD Bibliography 2011 | Digital Scholarship |