Preserving Mixed Analog/Digital AV Archives: PrestoSpace Project Case Study

The Digital Curation Centre has published DCC Case Study—PrestoSpace: Preservation towards Storage and Access. Standardised Practices for Audiovisual Contents in Europe.

Here's the "Executive Summary":

Explicit strategies are needed to manage 'mixed' audio visual (AV) archives that contain both analogue and digital materials. The PrestoSpace Project brings together industry leaders, research institutes, and other stakeholders at a European level, to provide products and services for effective automated preservation and access solutions for diverse AV collections. The Project’s main objective is to develop and promote flexible, integrated and affordable services for AV preservation, restoration, and storage with a view to enabling migration to digital formats in AV archives.

Presentations from the Open Access Collections Workshop Now Available

Presentations from the Australian Partnership for Sustainable Repositories' Open Access Collections workshop, held in association with the Queensland University Libraries Office of Cooperation and the University of Queensland Library, are now available in HTML/PDF, MP3, and digital video formats.

Planets Project Releases White Paper: Representation Information Registries

The Planets (Preservation and Long-term Access through Networked Services) project has released White Paper: Representation Information Registries.

Here's the "Executive Summary":

This document is a report on the state-of-the-art in the field of Representation Information Registries (RIRs). RIRs are widely recognised as a critical component of digital preservation architecture in general, and a number of such registries are being developed as part of the Planets architecture in particular. This document discusses the development of the concept of representation information, and of the use of registries as a means of exposing that information for use by digital preservation services; it describes the RIR implementations which currently exist or are under development globally; it assesses planned and potential future developments in this area; it discusses the role which RIRs play within the Planets project, and concludes with recommendations for future areas of research within Planets and beyond.

Dealing with Research Data in a Federated Digital Repository: Oxford University Planning Document Released

The Oxford e-Research Centre has released Scoping Digital Repository Services for Research Data Management, a project plan for determining the requirements for handling data in a federated digital repository at Oxford University.

Here's an excerpt from the "Aims and Objectives" section:

Objectives:

  • Capture and document researchers’ requirements for digital repository services to handle research data.
  • Participate actively in the development of an interoperability framework for the federated digital repository at Oxford.
  • Make recommendations to improve and coordinate the provision of digital repository services for research data.
  • Initiate and develop collaborations with the different repository activities already occurring to ensure that communication takes place in between them.
  • Raise awareness at Oxford of the importance and advantages of the active management of research data.
  • Communicate significant national and international developments in repositories to relevant Oxford stakeholders, in order to stimulate the adoption of best practices.

Essays from the Core Functions of the Research Library in the 21st Century Meeting

The Council on Library and Information Resources has released essays from its recent Core Functions of the Research Library in the 21st Century meeting.

Here's an excerpt from the meeting home page that lists the essays:

"The Future of the Library in the Research University," by Paul Courant

"Accelerating Learning and Discovery: Refining the Role of Academic Librarians," by Andrew Dillon

"A New Value Equation Challenge: The Emergence of eResearch and Roles for Research Libraries," by Richard E. Luce

"Co-teaching: The Library and Me," by Stephen G. Nichols

"Groundskeepers to Gatekeepers: How to Change Faculty Perceptions of Librarians and Ensure the Future of the Research Library," by Daphnee Rentfrow

"The Research Library in the 21st Century: Collecting, Preserving, and Making Accessible Resources for Scholarship," by Abby Smith

"The Role of the Library in 21st Century Scholarly Publishing," by Kate Wittenberg

"Leveraging Digital Technologies in Service to Culture and Society: The Role of Libraries as Collaborators," by Lee Zia

SEASR (Software Environment for the Advancement of Scholarly Research)

The Andrew W. Mellon Foundation-funded SEASR (Software Environment for the Advancement of Scholarly Research) project is building digital humanities cyberinfrastructure.

Here's an excerpt about the project from its home page:

What can SEASR do for scholars?

  • help scholars to access existing large data stores more readily
  • provide scholars with enhanced data synthesis and query analysis: from focused data retrieval and data integration, to intelligent human-computer interactions for knowledge access, to semantic data enrichment, to entity and relationship discovery, to knowledge discovery and hypothesis generation
  • empower collaboration among scholars by enhancing and innovating virtual research environments

What kind of innovations does SEASR provide for the humanities?

  • a complete, fully integrated, state-of-the-art software environment for managing structured and unstructured data and analyzing digital libraries, repositories and archives, as well as educational platforms
  • an open source, end-to-end software system that enables researchers to develop, evolve, and maintain data interoperability, evaluation, analysis, and visualization

Read more about it at "Placing SEASR within the Digital Library Movement."

Helping Researchers Understand and Label Article Versions: VERSIONS Toolkit Released

The VERSIONS (Versions of Eprints—A User Requirements Study and Investigation Of the Need for Standards) project has released the VERSIONS Toolkit.

Here's an excerpt from the "Introduction":

If you are an experienced researcher you are likely to be disseminating your work on a personal website, in a subject archive, or in an institutional repository already. This toolkit aims to:

  • provide peer-to-peer advice about managing personal versions and revisions in order to keep your options open for future use of your work
  • clarify areas of uncertainty among researchers about agreements with publishers and how these relate to different versions of research outputs
  • suggest ways to identify your work clearly when placing it on the web in order to guide your readers to the latest and best versions of your work
  • direct you to further resources about making versions of your work openly accessible

The toolkit draws on the results of a survey of researchers’ attitudes and current practice when creating, storing and disseminating different versions of their research. As such the guidance in the toolkit represents the views of active researchers. Survey respondents were predominantly from economics and related disciplines.

Sun Centre of Excellence for Libraries to Be Created in Alberta

Sun Microsystems has announced that it is partnering with the University of Alberta Libraries and the Alberta Library to create the Sun Centre of Excellence for Libraries.

Here's an excerpt from the press release:

Sun Microsystems of Canada Inc., the University of Alberta Libraries (UAL) and The Alberta Library (TAL) today announced the creation of a new Sun Centre of Excellence for Libraries (COE). The initiative will enhance and support respective organizational projects, as well as an extensive, province-wide, multi-faceted digital library. As part of the COE the participants intend to provide a seamless search and retrieval experience resulting in unprecedented access to information for students, faculty and the public, as well as creating an enduring preservation environment.

"This initiative will facilitate new levels of access to a tremendous amount of unique information that hasn’t been widely available," said Ernie Ingles, Vice Provost and Chief Librarian, University of Alberta. "It will further our goal to act as a trusted regional repository for digital materials by facilitating approaches to the discovery, storage, and archival preservation of digital resources that will benefit all Canadians." The University of Alberta Libraries, the second largest academic library system in Canada, has more than one million unique digitized pages of content in four major collections to contribute to the new digital library.

Using a range of Sun systems, software and thin client technologies, The Alberta Library (TAL) will integrate current digital collections and electronic information resources from the Lois Hole Campus Alberta Digital Library, an Alberta Government initiative that is providing post-secondary students, faculty and researchers in every corner of the province with access to vast holdings of digital resources. The digital library currently contains more than 4.5 million licensed items, including academic journals, encyclopedias, magazine and newspaper articles, literary criticisms and video clips from 35 post-secondary institutions. The COE will also help TAL improve province-wide access to library catalogues and secure information-sharing. . . .

The COE will support distance learning and research within e-learning environments by providing access to digital collections preserved by Alberta university libraries, archives and museums. It will also yield solutions for long-term archiving of digital resources, and digital rights management. The support and technology provided by Sun will ensure the infrastructure can evolve to meet future needs and continue to support research, collaborative learning and general discovery. . . .

The Centre of Excellence for Libraries is expected to be operational by summer 2008.

JHOVE 1.1 Released: Identification, Validation, and Characterization of Digital Objects

Version 1.1 of the open-source JHOVE (JSTOR/Harvard Object Validation Environment) software has been released.

Here's an excerpt from the project home page that describes JHOVE:

JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.

  • Format identification is the process of determining the format to which a digital object conforms; in other words, it answers the question: "I have a digital object; what format is it?"
  • Format validation is the process of determining the level of compliance of a digital object to the specification for its purported format, e.g.: "I have an object purportedly of format F; is it? . . . ."
  • Format characterization is the process of determining the format-specific significant properties of an object of a given format, e.g.: "I have an object of format F; what are its salient properties?"
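
To make the three functions concrete, here is a minimal Python sketch of the same ideas. It is not JHOVE's own Java API, and the PDF checks (magic number, %%EOF marker) are deliberately simplistic stand-ins for a real format module:

    # Conceptual sketch only -- JHOVE itself is a Java tool; this toy example
    # just illustrates identification, validation, and characterization for a
    # hypothetical PDF check based on the magic number and trailer marker.

    def identify(path):
        """Identification: to which format does this object conform?"""
        with open(path, "rb") as f:
            head = f.read(5)
        return "PDF" if head == b"%PDF-" else "unknown"

    def validate(path):
        """Validation: does the object comply with its purported format?
        Here, a minimal check that a PDF ends with the %%EOF marker."""
        with open(path, "rb") as f:
            data = f.read()
        return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]

    def characterize(path):
        """Characterization: report format-specific significant properties."""
        with open(path, "rb") as f:
            head = f.read(16)
        version = head[5:8].decode("ascii", "replace") if head.startswith(b"%PDF-") else None
        return {"format": identify(path), "version": version, "valid": validate(path)}

    print(characterize("example.pdf"))

The real tool goes much deeper, with per-format modules (such as PDF-hul) and configurable text or XML output handlers.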

Repository Presentations from the DataShare Project

The DataShare project has released two recent presentations about its activities: "Data Documentation Initiative (DDI)" and "Guidelines and Tools for Repository Planning and Assessment." A recent briefing paper, The Data Documentation Initiative (DDI) and Institutional Repositories, is also available.

Here's a description of the DataShare project from its home page:

DISC-UK DataShare, led by EDINA, arises from an existing UK consortium of data support professionals working in departments and academic libraries in universities (Data Information Specialists Committee-UK), and builds on an international network with a tradition of data sharing and data archiving dating back to the 1960s in the social sciences. By working together across four universities and internally with colleagues already engaged in managing open access repositories for e-prints, this partnership will introduce and test a new model of data sharing and archiving to UK research institutions. By supporting academics within the four partner institutions who wish to share datasets on which written research outputs are based, this network of institution-based data repositories develops a niche model for deposit of 'orphaned datasets' currently filled neither by centralised subject-domain data archives/centres/grids nor by e-print based institutional repositories (IRs).

Muradora Version 1.2.1 Released: Federated Identity and Authorization for Fedora

The DRAMA (Digital Repository Authorization Middleware Architecture) team has released version 1.2.1 of Muradora.

Here's an excerpt from the Muradora home page that describes Muradora:

Muradora is an easy to use repository application that supports federated identity (via Shibboleth authentication) and flexible authorization (using XACML). Muradora leverages the modularity, flexibility and scalability of the well-known Fedora repository.

Muradora's unique vision is one where Fedora forms the core back-end repository, while different front-end applications (such as portlets or standalone web interfaces) can all talk to the same instance of Fedora, and yet maintain a consistent approach to access control.
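
As a rough illustration of that "one back end, many front ends" pattern, the toy Python sketch below mimics an XACML-style policy decision point shared by every front end; the roles, resource names, and rules are all hypothetical and are not Muradora's actual policies or code:

    # Toy illustration (not Muradora code): several front ends consult one
    # policy decision point, so access-control decisions stay consistent,
    # in the spirit of XACML's subject/resource/action model.

    POLICIES = [
        # (role, resource prefix, action, decision) -- hypothetical rules
        ("staff",   "demo:",   "modify", "Permit"),
        ("student", "demo:",   "read",   "Permit"),
        ("anyone",  "public:", "read",   "Permit"),
    ]

    def decide(role, resource, action):
        """Return Permit or Deny for a request; deny by default."""
        for rule_role, prefix, rule_action, decision in POLICIES:
            if (rule_role in (role, "anyone")
                    and resource.startswith(prefix)
                    and action == rule_action):
                return decision
        return "Deny"

    # Any front end (portlet, web UI, harvester) asks the same question:
    print(decide("student", "demo:article-42", "read"))    # Permit
    print(decide("student", "demo:article-42", "modify"))  # Deny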

Read more about it at "Muradora 1.2.1 Release."

VALA 2008 Presentations

Presentations from the VALA 2008 conference are now available.

Michelle McLean has blogged a number of VALA 2008 sessions in Connecting Librarian postings.

iRODS Version 1.0: Data Grids, Digital Libraries, Persistent Archives, and Real-Time Data Systems

The Data-Intensive Computing Environments group at the San Diego Supercomputer Center has released version 1.0 of the open-source iRODS (Integrated Rule-Oriented Data System) system, which can be used to support data grids, digital libraries, persistent archives, and real-time data systems.

Here's an excerpt from the press release:

"iRODS is an innovative data grid system that incorporates and moves beyond ten years of experience in developing the widely used Storage Resource Broker (SRB) technology," said Reagan Moore, director of the DICE group at SDSC. "iRODS equips users to handle the full range of distributed data management needs, from extracting descriptive metadata and managing their data to moving it efficiently, sharing data securely with collaborators, publishing it in digital libraries, and finally archiving data for long-term preservation. . . ."

"You can start using it as a single user who only needs to manage a small stand-alone data collection," said Arcot Rajasekar, who leads the iRODS development team. "The same system lets you grow into a very large federated collaborative system that can span dozens of sites around the world, with hundreds or thousands of users and numerous data collections containing millions of files and petabytes of data—it’s a true full-scale distributed data system." A petabyte is one million gigabytes, about the storage capacity of 10,000 of today’s PCs. . . .

Version 1.0 of iRODS is supported on Linux, Solaris, Macintosh, and AIX platforms, with Windows coming soon. The iRODS Metadata Catalog (iCAT) will run on either the open source PostgreSQL database (which can be installed via the iRODS install package) or Oracle. And iRODS is easy to install—just answer a few questions and the install package automatically sets up the system.

Under the hood, the iRODS architecture stores data on one or more servers, which may be widely separated geographically; keeps track of system and user-defined information describing the data with the iRODS Metadata Catalog (iCAT); and offers users access through clients (currently a command line interface and Web client, with more to come). As directed by iRODS rules, the system can process data where it is stored using applications called "micro-services" executed on the remote server, making possible smaller and more targeted data transfers.
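
The "micro-services" idea is easiest to see in miniature. The following Python toy is not iRODS code (iRODS rules are written in its own server-side rule language); it simply shows why executing a small service where the data lives keeps transfers small. The server name and catalogue are invented:

    # Toy illustration (not iRODS code): a rule triggers a "micro-service"
    # on the server that already holds the data, so only the small result
    # -- not the large file -- crosses the network.

    SERVERS = {
        "srv-a.example.org": {"survey.dat": b"x" * 10_000_000},  # hypothetical catalogue
    }

    def run_microservice_where_stored(filename, microservice):
        """Find the server holding the file and apply the micro-service there."""
        for host, store in SERVERS.items():
            if filename in store:
                result = microservice(store[filename])   # executes "remotely"
                return host, result                      # only the result moves
        raise FileNotFoundError(filename)

    checksum_service = lambda data: hash(data)           # stand-in for a real checksum
    host, digest = run_microservice_where_stored("survey.dat", checksum_service)
    print(f"checksum computed on {host}: {digest}")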

Summa: A Federated Search System

Statsbiblioteket is developing Summa, a federated search system.

Birte Christensen-Dalsgaard, Director of Development, discusses Summa and other topics in a new podcast (CNI Podcast: An Interview with Birte Christensen-Dalsgaard, Director of Development at the State and University Library, Denmark).

Here's an excerpt from the podcast abstract:

Summa is an open source system implementing a modular, service-based architecture. It is based on the fundamental idea "free the content from the proprietary library systems," where the discovery layer is separated from the business layer. In doing so, any Internet technology can be used without the limitations traditionally set by proprietary library systems, and there is the flexibility to integrate or to be integrated into other systems. A first version of a Fedora—Summa integration has been developed.

A white paper is available that examines the system in more detail.

E-Print Preservation: SHERPA DP: Final Report of the SHERPA DP Project

JISC has released SHERPA DP: Final Report of the SHERPA DP Project.

Here's an excerpt from the "Executive Summary":

The SHERPA DP project (2005–2007) investigated the preservation of digital resources stored by institutional repositories participating in the SHERPA project. An emphasis was placed on the preservation of e-prints—research papers stored in an electronic format, with some support for other types of content, such as electronic theses and dissertations.

The project began with an investigation of the methods by which institutional repositories, as Content Providers, may interact with Service Providers. The resulting model, framed around the OAIS, established a Co-operating archive relationship, in which data and metadata are transferred into a preservation repository after being made available. . . .

The Arts & Humanities Data Service produced a demonstrator of a Preservation Service, to investigate the operation of the preservation service and accepted responsibility for the preservation of the digital objects for a three-year period (two years of project funding, plus one year).

The most notable development of the Preservation Service demonstrator was the creation of a reusable service framework that allows the integration of a disparate collection of software tools and standards. The project adopted Fedora as the basis for the preservation repository and built a technical infrastructure necessary to harvest metadata, transfer data, and perform relevant preservation activities. Appropriate software tools and standards were selected, including JHOVE and DROID as software tools to validate data objects; METS as a packaging standard; and PREMIS as a basis on which to create preservation metadata. . . .

A number of requirements were identified that were essential for establishing a disaggregated service for preservation, most notably some method of interoperating with partner institutions and the establishment of appropriate preservation policies. . . . In its role as a Preservation Service, the AHDS developed a repository-independent framework to support the EPrints and DSpace-based repositories, using OAI-PMH as a common method of connecting to partner institutions and extracting digital objects.
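
For readers unfamiliar with the harvesting step, the sketch below shows what a bare-bones OAI-PMH ListRecords request looks like in Python. The endpoint URL is a placeholder, and this illustrates the protocol generally rather than the SHERPA DP/AHDS implementation:

    # Minimal OAI-PMH harvest sketch (not the SHERPA DP code): issue a
    # ListRecords request and print each record's identifier and datestamp.
    # The base URL is a placeholder for a partner repository's OAI endpoint.
    from urllib.request import urlopen
    from urllib.parse import urlencode
    import xml.etree.ElementTree as ET

    BASE_URL = "https://repository.example.ac.uk/oai"   # placeholder endpoint
    OAI = "{http://www.openarchives.org/OAI/2.0/}"

    params = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    with urlopen(f"{BASE_URL}?{params}") as response:
        tree = ET.parse(response)

    for record in tree.iter(f"{OAI}record"):
        header = record.find(f"{OAI}header")
        print(header.findtext(f"{OAI}identifier"),
              header.findtext(f"{OAI}datestamp"))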

JISC Programme Synthesis Study: Supporting Digital Preservation and Asset Management in Institutions

JISC has published JISC Programme Synthesis Study: Supporting Digital Preservation and Asset Management in Institutions: A Review of the 4-04 Programme on Digital Preservation and Asset Management in Institutions for the JISC Information Environment: Part II: Programme Synthesis. The report covers a number of projects, including LIFE, MANDATE, PARADIGM, PRESERV, and SHERPA DP.

Here's an excerpt from UKOLN News:

Written by Maureen Pennock, DCC researcher at UKOLN, the study provides a comprehensive and categorised overview of the outputs from the entire programme. Categories include training, costs and business models, life cycles, repositories, case studies, and assessment and surveys. Each category includes detailed information on project outputs and references a number of re-usable project-generated tools that range from software services to checklists and guidance.

Stewardship of Digital Research Data: A Framework of Principles and Guidelines

The Research Information Network (RIN) has published Stewardship of Digital Research Data: A Framework of Principles and Guidelines: Responsibilities of Research Institutions and Funders, Data Managers, Learned Societies and Publishers.

Here's an excerpt from the Web page describing the document:

Research data are an increasingly important and expensive output of the scholarly research process, across all disciplines. . . . But we shall realise the value of data only if we move beyond research policies, practices and support systems developed in a different era. We need new approaches to managing and providing access to research data.

In order to address these issues, the RIN established a group to produce a framework of key principles and guidelines, and we consulted on a draft document in 2007. The framework is founded on the fundamental policy objective that ideas and knowledge, including data, derived from publicly-funded research should be made available for public use, interrogation, and scrutiny, as widely, rapidly and effectively as practicable. . . .

The framework is structured around five broad principles which provide a guide to the development of policy and practice for a range of key players: universities, research institutions, libraries and other information providers, publishers, and research funders as well as researchers themselves. Each of these principles serves as a basis for a series of questions which serve a practical purpose by pointing to how the various players might address the challenges of effective data stewardship.