Preserving Mixed Analog/Digital AV Archives: PrestoSpace Project Case Study

The Digital Curation Centre has published DCC Case Study—PrestoSpace: Preservation towards Storage and Access. Standardised Practices for Audiovisual Contents in Europe.

Here's the "Executive Summary":

Explicit strategies are needed to manage 'mixed' audio visual (AV) archives that contain both analogue and digital materials. The PrestoSpace Project brings together industry leaders, research institutes, and other stakeholders at a European level, to provide products and services for effective automated preservation and access solutions for diverse AV collections. The Project’s main objective is to develop and promote flexible, integrated and affordable services for AV preservation, restoration, and storage with a view to enabling migration to digital formats in AV archives.

Presentations from the Open Access Collections Workshop Now Available

Presentations from the Australian Partnership for Sustainable Repositories' Open Access Collections workshop, held in association with the Queensland University Libraries Office of Cooperation and the University of Queensland Library, are now available in HTML/PDF, MP3, and digital video formats.

Planets Project Releases White Paper: Representation Information Registries

The Planets (Preservation and Long-term Access through Networked Services) project has released White Paper: Representation Information Registries.

Here's the "Executive Summary":

This document is a report on the state-of-the-art in the field of Representation Information Registries (RIRs). RIRs are widely recognised as a critical component of digital preservation architecture in general, and a number of such registries are being developed as part of the Planets architecture in particular. This document discusses the development of the concept of representation information, and of the use of registries as a means of exposing that information for use by digital preservation services; it describes the RIR implementations which currently exist or are under development globally; it assesses planned and potential future developments in this area; it discusses the role which RIRs play within the Planets project, and concludes with recommendations for future areas of research within Planets and beyond.

Dealing with Research Data in a Federated Digital Repository: Oxford University Planning Document Released

The Oxford e-Research Centre has released Scoping Digital Repository Services for Research Data Management, a project plan for determining the requirements for handling data in a federated digital repository at Oxford University.

Here's an excerpt from the "Aims and Objectives" section:

Objectives:

  • Capture and document researchers’ requirements for digital repository services to handle research data.
  • Participate actively in the development of an interoperability framework for the federated digital repository at Oxford.
  • Make recommendations to improve and coordinate the provision of digital repository services for research data.
  • Initiate and develop collaborations with the different repository activities already occurring to ensure that communication takes place in between them.
  • Raise awareness at Oxford of the importance and advantages of the active management of research data.
  • Communicate significant national and international developments in repositories to relevant Oxford stakeholders, in order to stimulate the adoption of best practices.

Essays from the Core Functions of the Research Library in the 21st Century Meeting

The Council on Library and Information Resources has released essays from its recent Core Functions of the Research Library in the 21st Century meeting.

Here's an excerpt from the meeting home page that lists the essays:

"The Future of the Library in the Research University," by Paul Courant

"Accelerating Learning and Discovery: Refining the Role of Academic Librarians," by Andrew Dillon

"A New Value Equation Challenge: The Emergence of eResearch and Roles for Research Libraries," by Richard E. Luce

"Co-teaching: The Library and Me," by Stephen G. Nichols

"Groundskeepers to Gatekeepers: How to Change Faculty Perceptions of Librarians and Ensure the Future of the Research Library," by Daphnee Rentfrow

"The Research Library in the 21st Century: Collecting, Preserving, and Making Accessible Resources for Scholarship," by Abby Smith

"The Role of the Library in 21st Century Scholarly Publishing," by Kate Wittenberg

"Leveraging Digital Technologies in Service to Culture and Society: The Role of Libraries as Collaborators," by Lee Zia

SEASR (Software Environment for the Advancement of Scholarly Research)

The Andrew W. Mellon Foundation-funded SEASR (Software Environment for the Advancement of Scholarly Research) project is building digital humanities cyberinfrastructure.

Here's an excerpt about the project from its home page:

What can SEASR do for scholars?

  • help scholars to access existing large data stores more readily
  • provide scholars with enhanced data synthesis and query analysis: from focused data retrieval and data integration, to intelligent human-computer interactions for knowledge access, to semantic data enrichment, to entity and relationship discovery, to knowledge discovery and hypothesis generation
  • empower collaboration among scholars by enhancing and innovating virtual research environments

What kind of innovations does SEASR provide for the humanities?

  • a complete, fully integrated, state-of-the-art software environment for managing structured and unstructured data and analyzing digital libraries, repositories and archives, as well as educational platforms
  • an open source, end-to-end software system that enables researchers to develop, evolve, and maintain data interoperability, evaluation, analysis, and visualization

Read more about it at "Placing SEASR within the Digital Library Movement."

Helping Researchers Understand and Label Article Versions: VERSIONS Toolkit Released

The VERSIONS (Versions of Eprints—A User Requirements Study and Investigation Of the Need for Standards) project has released the VERSIONS Toolkit.

Here's an excerpt from the "Introduction":

If you are an experienced researcher you are likely to be disseminating your work on a personal website, in a subject archive, or in an institutional repository already. This toolkit aims to:

  • provide peer-to-peer advice about managing personal versions and revisions in order to keep your options open for future use of your work
  • clarify areas of uncertainty among researchers about agreements with publishers and how these relate to different versions of research outputs
  • suggest ways to identify your work clearly when placing it on the web in order to guide your readers to the latest and best versions of your work
  • direct you to further resources about making versions of your work openly accessible

The toolkit draws on the results of a survey of researchers’ attitudes and current practice when creating, storing and disseminating different versions of their research. As such the guidance in the toolkit represents the views of active researchers. Survey respondents were predominantly from economics and related disciplines.

Sun Centre of Excellence for Libraries to Be Created in Alberta

Sun Microsystems has announced that it is partnering with the University of Alberta Libraries and the Alberta Library to create the Sun Centre of Excellence for Libraries.

Here's an excerpt from the press release:

Sun Microsystems of Canada Inc., the University of Alberta Libraries (UAL) and The Alberta Library (TAL) today announced the creation of a new Sun Centre of Excellence for Libraries (COE). The initiative will enhance and support respective organizational projects, as well as an extensive, province-wide, multi-faceted digital library. As part of the COE the participants intend to provide a seamless search and retrieval experience resulting in unprecedented access to information for students, faculty and the public, as well as creating an enduring preservation environment.

"This initiative will facilitate new levels of access to a tremendous amount of unique information that hasn’t been widely available," said Ernie Ingles, Vice Provost and Chief Librarian, University of Alberta. "It will further our goal to act as a trusted regional repository for digital materials by facilitating approaches to the discovery, storage, and archival preservation of digital resources that will benefit all Canadians." The University of Alberta Libraries, the second largest academic library system in Canada, has more than one million unique digitized pages of content in four major collections to contribute to the new digital library.

Using a range of Sun systems, software and thin client technologies, The Alberta Library (TAL) will integrate current digital collections and electronic information resources from the Lois Hole Campus Alberta Digital Library, an Alberta Government initiative that is providing post-secondary students, faculty and researchers in every corner of the province with access to vast holdings of digital resources. The digital library currently contains more than 4.5 million licensed items, including academic journals, encyclopedias, magazine and newspaper articles, literary criticisms and video clips from 35 post-secondary institutions. The COE will also help TAL improve province-wide access to library catalogues and secure information-sharing. . . .

The COE will support distance learning and research within e-learning environments by providing access to digital collections preserved by Alberta university libraries, archives and museums. It will also yield solutions for long-term archiving of digital resources, and digital rights management. The support and technology provided by Sun will ensure the infrastructure can evolve to meet future needs and continue to support research, collaborative learning and general discovery. . . .

The Centre of Excellence for Libraries is expected to be operational by summer 2008.

JHOVE 1.1 Released: Identification, Validation, and Characterization of Digital Objects

Version 1.1 of the open-source JHOVE (JSTOR/Harvard Object Validation Environment) software has been released.

Here's an excerpt from the project home page that describes JHOVE:

JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.

  • Format identification is the process of determining the format to which a digital object conforms; in other words, it answers the question: "I have a digital object; what format is it?"
  • Format validation is the process of determining the level of compliance of a digital object to the specification for its purported format, e.g.: "I have an object purportedly of format F; is it? . . . ."
  • Format characterization is the process of determining the format-specific significant properties of an object of a given format, e.g.: "I have an object of format F; what are its salient properties?"
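
To make the three functions concrete, here is a minimal Python sketch of the same ideas. It is not JHOVE's own Java API, and the PDF checks (magic number, %%EOF marker) are deliberately simplistic stand-ins for a real format module:

    # Conceptual sketch only -- JHOVE itself is a Java tool; this toy example
    # just illustrates identification, validation, and characterization for a
    # hypothetical PDF check based on the magic number and trailer marker.

    def identify(path):
        """Identification: to which format does this object conform?"""
        with open(path, "rb") as f:
            head = f.read(5)
        return "PDF" if head == b"%PDF-" else "unknown"

    def validate(path):
        """Validation: does the object comply with its purported format?
        Here, a minimal check that a PDF ends with the %%EOF marker."""
        with open(path, "rb") as f:
            data = f.read()
        return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]

    def characterize(path):
        """Characterization: report format-specific significant properties."""
        with open(path, "rb") as f:
            head = f.read(16)
        version = head[5:8].decode("ascii", "replace") if head.startswith(b"%PDF-") else None
        return {"format": identify(path), "version": version, "valid": validate(path)}

    print(characterize("example.pdf"))

The real tool goes much deeper, with per-format modules (such as PDF-hul) and configurable text or XML output handlers.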

Repository Presentations from the DataShare Project

The DataShare project has released two recent presentations about its activities: "Data Documentation Initiative (DDI)" and "Guidelines and Tools for Repository Planning and Assessment." A recent briefing paper, The Data Documentation Initiative (DDI) and Institutional Repositories, is also available.

Here's a description of the DataShare project from its home page:

DISC-UK DataShare, led by EDINA, arises from an existing UK consortium of data support professionals working in departments and academic libraries in universities (Data Information Specialists Committee-UK), and builds on an international network with a tradition of data sharing and data archiving dating back to the 1960s in the social sciences. By working together across four universities and internally with colleagues already engaged in managing open access repositories for e-prints, this partnership will introduce and test a new model of data sharing and archiving to UK research institutions. By supporting academics within the four partner institutions who wish to share datasets on which written research outputs are based, this network of institution-based data repositories develops a niche model for deposit of 'orphaned datasets' currently filled neither by centralised subject-domain data archives/centres/grids nor by e-print based institutional repositories (IRs).

Muradora Version 1.2.1 Released: Federated Identity and Authorization for Fedora

The DRAMA (Digital Repository Authorization Middleware Architecture) team has released version 1.2.1 of Muradora.

Here's an excerpt from the Muradora home page that describes Muradora:

Muradora is an easy to use repository application that supports federated identity (via Shibboleth authentication) and flexible authorization (using XACML). Muradora leverages the modularity, flexibility and scalability of the well-known Fedora repository.

Muradora's unique vision is one where Fedora forms the core back-end repository, while different front-end applications (such as portlets or standalone web interfaces) can all talk to the same instance of Fedora, and yet maintain a consistent approach to access control.
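
As a rough illustration of that "one back end, many front ends" pattern, the toy Python sketch below mimics an XACML-style policy decision point shared by every front end; the roles, resource names, and rules are all hypothetical and are not Muradora's actual policies or code:

    # Toy illustration (not Muradora code): several front ends consult one
    # policy decision point, so access-control decisions stay consistent,
    # in the spirit of XACML's subject/resource/action model.

    POLICIES = [
        # (role, resource prefix, action, decision) -- hypothetical rules
        ("staff",   "demo:",   "modify", "Permit"),
        ("student", "demo:",   "read",   "Permit"),
        ("anyone",  "public:", "read",   "Permit"),
    ]

    def decide(role, resource, action):
        """Return Permit or Deny for a request; deny by default."""
        for rule_role, prefix, rule_action, decision in POLICIES:
            if (rule_role in (role, "anyone")
                    and resource.startswith(prefix)
                    and action == rule_action):
                return decision
        return "Deny"

    # Any front end (portlet, web UI, harvester) asks the same question:
    print(decide("student", "demo:article-42", "read"))    # Permit
    print(decide("student", "demo:article-42", "modify"))  # Deny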

Read more about it at "Muradora 1.2.1 Release."

VALA 2008 Presentations

Presentations from the VALA 2008 conference are now available.

Michelle McLean has blogged a number of VALA 2008 sessions in Connecting Librarian postings.

iRODS Version 1.0: Data Grids, Digital Libraries, Persistent Archives, and Real-Time Data Systems

The Data-Intensive Computing Environments group at the San Diego Supercomputer Center has released version 1.0 of the open-source iRODS (Integrated Rule-Oriented Data System) system, which can be used to support data grids, digital libraries, persistent archives, and real-time data systems.

Here's an excerpt from the press release:

"iRODS is an innovative data grid system that incorporates and moves beyond ten years of experience in developing the widely used Storage Resource Broker (SRB) technology," said Reagan Moore, director of the DICE group at SDSC. "iRODS equips users to handle the full range of distributed data management needs, from extracting descriptive metadata and managing their data to moving it efficiently, sharing data securely with collaborators, publishing it in digital libraries, and finally archiving data for long-term preservation. . . ."

"You can start using it as a single user who only needs to manage a small stand-alone data collection," said Arcot Rajasekar, who leads the iRODS development team. "The same system lets you grow into a very large federated collaborative system that can span dozens of sites around the world, with hundreds or thousands of users and numerous data collections containing millions of files and petabytes of data—it’s a true full-scale distributed data system." A petabyte is one million gigabytes, about the storage capacity of 10,000 of today’s PCs. . . .

Version 1.0 of iRODS is supported on Linux, Solaris, Macintosh, and AIX platforms, with Windows coming soon. The iRODS Metadata Catalog (iCAT) will run on either the open source PostgreSQL database (which can be installed via the iRODS install package) or Oracle. And iRODS is easy to install—just answer a few questions and the install package automatically sets up the system.

Under the hood, the iRODS architecture stores data on one or more servers, which may be widely separated geographically; keeps track of system and user-defined information describing the data with the iRODS Metadata Catalog (iCAT); and offers users access through clients (currently a command line interface and Web client, with more to come). As directed by iRODS rules, the system can process data where it is stored using applications called "micro-services" executed on the remote server, making possible smaller and more targeted data transfers.
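
The "micro-services" idea is easiest to see in miniature. The following Python toy is not iRODS code (iRODS rules are written in its own server-side rule language); it simply shows why executing a small service where the data lives keeps transfers small. The server name and catalogue are invented:

    # Toy illustration (not iRODS code): a rule triggers a "micro-service"
    # on the server that already holds the data, so only the small result
    # -- not the large file -- crosses the network.

    SERVERS = {
        "srv-a.example.org": {"survey.dat": b"x" * 10_000_000},  # hypothetical catalogue
    }

    def run_microservice_where_stored(filename, microservice):
        """Find the server holding the file and apply the micro-service there."""
        for host, store in SERVERS.items():
            if filename in store:
                result = microservice(store[filename])   # executes "remotely"
                return host, result                      # only the result moves
        raise FileNotFoundError(filename)

    checksum_service = lambda data: hash(data)           # stand-in for a real checksum
    host, digest = run_microservice_where_stored("survey.dat", checksum_service)
    print(f"checksum computed on {host}: {digest}")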

Summa: A Federated Search System

Statsbiblioteket is developing Summa, a federated search system.

Birte Christensen-Dalsgaard, Director of Development, discusses Summa and other topics in a new podcast (CNI Podcast: An Interview with Birte Christensen-Dalsgaard, Director of Development at the State and University Library, Denmark).

Here's an excerpt from the podcast abstract:

Summa is an open source system implementing a modular, service-based architecture. It is based on the fundamental idea "free the content from the proprietary library systems," where the discovery layer is separated from the business layer. In doing so, any Internet technology can be used without the limitations traditionally set by proprietary library systems, and there is the flexibility to integrate or to be integrated into other systems. A first version of a Fedora—Summa integration has been developed.

A white paper is available that examines the system in more detail.

E-Print Preservation: SHERPA DP: Final Report of the SHERPA DP Project

JISC has released SHERPA DP: Final Report of the SHERPA DP Project.

Here's an excerpt from the "Executive Summary":

The SHERPA DP project (2005–2007) investigated the preservation of digital resources stored by institutional repositories participating in the SHERPA project. An emphasis was placed on the preservation of e-prints—research papers stored in an electronic format, with some support for other types of content, such as electronic theses and dissertations.

The project began with an investigation of the methods by which institutional repositories, as Content Providers, may interact with Service Providers. The resulting model, framed around the OAIS, established a Co-operating archive relationship, in which data and metadata are transferred into a preservation repository after being made available. . . .

The Arts & Humanities Data Service produced a demonstrator of a Preservation Service, to investigate the operation of the preservation service and accepted responsibility for the preservation of the digital objects for a three-year period (two years of project funding, plus one year).

The most notable development of the Preservation Service demonstrator was the creation of a reusable service framework that allows the integration of a disparate collection of software tools and standards. The project adopted Fedora as the basis for the preservation repository and built a technical infrastructure necessary to harvest metadata, transfer data, and perform relevant preservation activities. Appropriate software tools and standards were selected, including JHOVE and DROID as software tools to validate data objects; METS as a packaging standard; and PREMIS as a basis on which to create preservation metadata. . . .

A number of requirements were identified that were essential for establishing a disaggregated service for preservation, most notably some method of interoperating with partner institutions and the establishment of appropriate preservation policies. . . . In its role as a Preservation Service, the AHDS developed a repository-independent framework to support the EPrints and DSpace-based repositories, using OAI-PMH as a common method of connecting to partner institutions and extracting digital objects.
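
For readers unfamiliar with the harvesting step, the sketch below shows what a bare-bones OAI-PMH ListRecords request looks like in Python. The endpoint URL is a placeholder, and this illustrates the protocol generally rather than the SHERPA DP/AHDS implementation:

    # Minimal OAI-PMH harvest sketch (not the SHERPA DP code): issue a
    # ListRecords request and print each record's identifier and datestamp.
    # The base URL is a placeholder for a partner repository's OAI endpoint.
    from urllib.request import urlopen
    from urllib.parse import urlencode
    import xml.etree.ElementTree as ET

    BASE_URL = "https://repository.example.ac.uk/oai"   # placeholder endpoint
    OAI = "{http://www.openarchives.org/OAI/2.0/}"

    params = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    with urlopen(f"{BASE_URL}?{params}") as response:
        tree = ET.parse(response)

    for record in tree.iter(f"{OAI}record"):
        header = record.find(f"{OAI}header")
        print(header.findtext(f"{OAI}identifier"),
              header.findtext(f"{OAI}datestamp"))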

JISC Programme Synthesis Study: Supporting Digital Preservation and Asset Management in Institutions

JISC has published JISC Programme Synthesis Study: Supporting Digital Preservation and Asset Management in Institutions: A Review of the 4-04 Programme on Digital Preservation and Asset Management in Institutions for the JISC Information Environment: Part II: Programme Synthesis. The report covers a number of projects, including LIFE, MANDATE, PARADIGM, PRESERV, and SHERPA DP.

Here's an excerpt from UKOLN News:

Written by Maureen Pennock, DCC researcher at UKOLN, the study provides a comprehensive and categorised overview of the outputs from the entire programme. Categories include training, costs and business models, life cycles, repositories, case studies, and assessment and surveys. Each category includes detailed information on project outputs and references a number of re-usable project-generated tools that range from software services to checklists and guidance.

Stewardship of Digital Research Data: A Framework of Principles and Guidelines

The Research Information Network (RIN) has published Stewardship of Digital Research Data: A Framework of Principles and Guidelines: Responsibilities of Research Institutions and Funders, Data Managers, Learned Societies and Publishers.

Here's an excerpt from the Web page describing the document:

Research data are an increasingly important and expensive output of the scholarly research process, across all disciplines. . . . But we shall realise the value of data only if we move beyond research policies, practices and support systems developed in a different era. We need new approaches to managing and providing access to research data.

In order to address these issues, the RIN established a group to produce a framework of key principles and guidelines, and we consulted on a draft document in 2007. The framework is founded on the fundamental policy objective that ideas and knowledge, including data, derived from publicly-funded research should be made available for public use, interrogation, and scrutiny, as widely, rapidly and effectively as practicable. . . .

The framework is structured around five broad principles which provide a guide to the development of policy and practice for a range of key players: universities, research institutions, libraries and other information providers, publishers, and research funders as well as researchers themselves. Each of these principles serves as a basis for a series of questions which serve a practical purpose by pointing to how the various players might address the challenges of effective data stewardship.