OAI-ORE – DigitalKoans

"Experimental DML over Digital Repositories in Japan"

Takao Namiki, Hiraku Kuroda, and Shunsuke Naruse have self-archived "Experimental DML over Digital Repositories in Japan" in arXiv.org.

Here's an excerpt:

In this paper the authors show an overview of Virtual Digital Mathematics Library in Japan (DML-JP), contents of which consist of metadata harvested from institutional repositories in Japan and digital repositories in the world. DML-JP is, in a sense, a subject specific repository which collaborate with various digital repositories. Beyond portal website, DML-JP provides subject-specific metadata through OAI-ORE. By the schema it is enabled that digital repositories can load the rich metadata which were added by mathematicians.

“Adding eScience Assets to the Data Web”

Herbert Van de Sompel, Carl Lagoze, Michael L. Nelson, Simeon Warner, Robert Sanderson, and Pete Johnston have self-archived "Adding eScience Assets to the Data Web" on arXiv.org.

Here's an excerpt:

Aggregations of Web resources are increasingly important in scholarship as it adopts new methods that are data-centric, collaborative, and networked-based. The same notion of aggregations of resources is common to the mashed-up, socially networked information environment of Web 2.0. We present a mechanism to identify and describe aggregations of Web resources that has resulted from the Open Archives Initiative – Object Reuse and Exchange (OAI-ORE) project. The OAI-ORE specifications are based on the principles of the Architecture of the World Wide Web, the Semantic Web, and the Linked Data effort. Therefore, their incorporation into the cyberinfrastructure that supports eScholarship will ensure the integration of the products of scholarly research into the Data Web.

“An Overview of the OAI Object Reuse and Exchange Interoperability Framework” Presentation

Herbert Van de Sompel has made his recent "An Overview of the OAI Object Reuse and Exchange Interoperability Framework" presentation available on Slideshare. (Thanks to Pintiniblog.)

Indiana University Digital Library Program Releases IN Harmony Sheet Music Cataloging Tool

The Indiana University Digital Library Program has released the IN Harmony Sheet Music Cataloging Tool.

Here's an excerpt from the tool's page:

The IN Harmony Sheet Music Cataloging Tool is an open source tool developed by the Indiana University Digital Library Program with funding from the Institute of Museum and Library Services as part of the IN Harmony: Sheet Music From Indiana project. This tool has been designed to assist libraries, archives, museums, and individual collectors describe their sheet music collections in a robust and standards-based way. This is a production system of the Indiana University Digital Library Program and was used to catalog more than 10,000 pieces of sheet music for the IN Harmony project.

The tool collects descriptive metadata about sheet music and exports it in the MODS, simple Dublin Core, and OAI-PMH Static Repository formats.

“Aligning METS with the OAI-ORE Data Model”

Jerome P. McDonough has made "Aligning METS with the OAI-ORE Data Model" available in IDEALS.

Here's an excerpt:

(OAI-ORE) specifications provide a flexible set of mechanisms for transferring complex data objects between different systems. In order to serve as an exchange syntax, OAI-ORE must be able to support the import of information from localized data structures serving various communities of practice. In this paper, we examine the Metadata Encoding & Transmission Standard (METS) and the issues that arise when trying to map from a localized structural metadata schema into the OAI-ORE data model and serialization syntaxes.

“Using OAI-ORE to Transform Digital Repositories into Interoperable Storage and Services Applications”

The latest issue of The Code4Lib Journal includes "Using OAI-ORE to Transform Digital Repositories into Interoperable Storage and Services Applications."

Here's an excerpt:

In the digital age libraries are required to manage large numbers of diverse objects. One advantage of digital objects over fixed physical objects is the flexibility of ‘binding’ them into publications or other useful aggregated intellectual entities while retaining the ability to reuse them independently in other contexts. An emerging framework for managing flexible aggregations of digital objects is provided by the Open Archives Initiative (OAI) with its work on Object Reuse and Exchange (ORE). This paper will show how OAI-ORE is being used to manage content in digital repositories, in particular institutional repositories, and has the potential ultimately to transform the conception of digital repositories.

Herbert Van de Sompel et al. on “Adding eScience Assets to the Data Web”

Herbert Van de Sompel et al.'s paper on "Adding eScience Assets to the Data Web" is now available on the Linked Data on the Web (LDOW2009) Web site.

Here's an excerpt:

Aggregations of Web resources are increasingly important in scholarship as it adopts new methods that are data-centric, collaborative, and networked-based. The same notion of aggregations of resources is common to the mashed-up, socially networked information environment of Web 2.0. We present a mechanism to identify and describe aggregations of Web resources that has resulted from the Open Archives Initiative-Object Reuse and Exchange (OAI-ORE) project. The OAI-ORE specifications are based on the principles of the Architecture of the World Wide Web, the Semantic Web, and the Linked Data effort. Therefore, their incorporation into the cyberinfrastructure that supports eScholarship will ensure the integration of the products of scholarly research into the Data Web.

A Web-Based Resource Model for eScience: Object Reuse & Exchange

Carl Lagoze, Herbert Van de Sompel, Michael Nelson, Simeon Warner, Robert Sanderson, and Pete Johnston have deposited "A Web-Based Resource Model for eScience: Object Reuse & Exchange" in arXiv.org.

Here's the abstract:

Work in the Open Archives Initiative-Object Reuse and Exchange (OAI-ORE) focuses on an important aspect of infrastructure for eScience: the specification of the data model and a suite of implementation standards to identify and describe compound objects. These are objects that aggregate multiple sources of content including text, images, data, visualization tools, and the like. These aggregations are an essential product of eScience, and will become increasingly common in the age of data-driven scholarship. The OAI-ORE specifications conform to the core concepts of the Web architecture and the semantic Web, ensuring that applications that use them will integrate well into the general Web environment.

Production Release of Object Reuse and Exchange Specifications

The Open Archives Initiative Object Reuse and Exchange (OAI-ORE) project has released the first production version of its specifications.

Here's an excerpt from the press release:

These standards provide the foundation for applications and services that can visualize, preserve, transfer, summarize, and improve access to the aggregations that people use in their daily Web interaction: including multiple page Web documents, multiple format documents in institutional repositories, scholarly data sets, and online photo and music collections. The OAI-ORE standards leverage the core Web architecture and concepts emerging from related efforts including the semantic web, linked data, and Atom syndication. As a result, they integrate both with the emerging machine-readable web, Web 2.0, and the future evolution of networked information. . . .

The documents in the release describe a data model to introduce aggregations as resources with URIs on the web. They also detail the machine-readable descriptions of aggregations expressed in the popular Atom syndication format, in RDF/XML, and RDFa.
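
To make the data model in the excerpt a bit more concrete, here is a minimal sketch in Python of a Resource Map serialized as RDF/XML. It assumes only the `ore:describes` and `ore:aggregates` terms from the ORE vocabulary; the function name and the example.org URIs are hypothetical, and the output leaves out metadata (such as creator and modification date) that a complete Resource Map would carry.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORE_NS = "http://www.openarchives.org/ore/terms/"


def build_resource_map(rem_uri, aggregation_uri, resource_uris):
    """Return an RDF/XML string describing one aggregation (illustrative only)."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ore", ORE_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")

    # The Resource Map has its own URI and points at the aggregation it describes.
    rem = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": rem_uri}
    )
    ET.SubElement(
        rem, f"{{{ORE_NS}}}describes", {f"{{{RDF_NS}}}resource": aggregation_uri}
    )

    # The aggregation is a resource with a URI; it lists the resources it aggregates.
    agg = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": aggregation_uri}
    )
    ET.SubElement(
        agg, f"{{{RDF_NS}}}type", {f"{{{RDF_NS}}}resource": ORE_NS + "Aggregation"}
    )
    for uri in resource_uris:
        ET.SubElement(
            agg, f"{{{ORE_NS}}}aggregates", {f"{{{RDF_NS}}}resource": uri}
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical URIs, chosen only for illustration.
print(build_resource_map(
    "http://example.org/rem/1",
    "http://example.org/aggregation/1",
    ["http://example.org/article.pdf", "http://example.org/figure1.png"],
))
```

The point of the sketch is the separation the specifications draw between the aggregation (the thing being identified) and the Resource Map (the document that describes it); each gets its own URI.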

Podcast: Interview with Herbert Van de Sompel

Talis has released a podcast of an interview with Herbert Van de Sompel, Digital Library Researcher at the Research Library of the Los Alamos National Laboratory, about SFX, OAI, and digital repositories.

Foresite Project OAI-ORE Resource Maps Software

The Foresite Project has released the foresite-toolkit.

Here's an excerpt from the announcement (footnotes removed):

The Foresite project is pleased to announce the initial code of two software libraries for constructing, parsing, manipulating and serialising OAI-ORE Resource Maps. These libraries are being written in Java and Python, and can be used generically to provide advanced functionality to OAI-ORE aware applications, and are compliant with the latest release (0.9) of the specification. The software is open source, released under a BSD licence, and is available from a Google Code repository . . . .

Foresite is a JISC funded project which aims to produce a demonstrator and test of the OAI-ORE standard by creating Resource Maps of journals and their contents held in JSTOR, and delivering them as ATOM documents via the SWORD interface to DSpace. DSpace will ingest these resource maps, and convert them into repository items which reference content which continues to reside in JSTOR. The Python library is being used to generate the resource maps from JSTOR and the Java library is being used to provide all the ingest, transformation and dissemination support required in DSpace.

OAI-ORE: Object Reuse and Exchange Wiki Established

Rob Sanderson at the University of Liverpool has established the OAI-ORE: Object Reuse and Exchange Wiki, which is now open for content contributions. A preliminary content structure has been set up for users to populate.

Public Beta of Object Reuse and Exchange Specifications (OAI-ORE) Released

The Open Archives Initiative has released the public beta of Object Reuse and Exchange Specifications.

Here's an excerpt from the press release:

Over the past eighteen months the Open Archives Initiative (OAI), in a project called Object Reuse and Exchange (OAI-ORE), has gathered international experts from the publishing, web, library, and eScience community to develop standards for the identification and description of aggregations of online information resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video. The goal of these standards is to expose the rich content in these aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation. Although a motivating use case for the work is the changing nature of scholarship and scholarly communication, and the need for cyberinfrastructure to support that scholarship, the intent of the effort is to develop standards that generalize across all web-based information including the increasing popular social networks of “web 2.0”. The beta version of the OAI-ORE specifications and implementation documents are released to the public on June 2, 2008. These documents describe a data model to introduce aggregations as resources with URIs on the web. They also detail the machine-readable descriptions of aggregations expressed in the popular Atom syndication format, in RDF/XML, and RDFa.

Updated Alpha Version of ORE Specification and User Guide Released

The Open Archives Initiative's Object Reuse and Exchange project has released version 0.3 of the ORE Specification and User Guide.

Read more about it at "OAI-ORE Alpha Specifications Updated."

Two JISC Open Archives Initiative Object Reuse and Exchange Projects

JISC is funding two projects to do small-scale OAI-ORE tests:

TheOREM (Theses with ORE Metadata), at the University of Cambridge, aims to:

Test the applicability of the ORE standard in a realistic scholarly setting—thesis description, submission and publication.

Demonstrate the advantages of the ORE approach in complex object publication, by combining it with existing web-standards compliant technologies.

Provide examples to fully exercise the ORE specifications in order to provide validation and future direction.

FORESITE (Functional Object Reuse and Exchange: Supporting Information Topology Experiments) will create Resource Map descriptions of JSTOR's holdings, and then ingest them into the DSpace institutional repository system via the SWORD protocol, creating external references back to the original files. The description work will be automated, and the system for achieving this implemented at the University of Liverpool. The SWORD protocol will be implemented within DSpace by HP Labs along with other extensions necessary.

For further information, see the FORESITE proposal, A Preview of the TheOREM Project, and the TheOREM proposal.

Project Reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology Retreat

Project reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology retreat are now available.

Here are selected project briefing reports:

OAI-ORE for Fedora: Oreprovider Released

Oskar Grenholm of the National Library of Sweden has released oreprovider, an open-source Java application that "will let you disseminate digital objects stored in a Fedora repository as OAI-ORE Resource Maps."

In the announcement, he says:

The idea behind it all is that you have a Java web application (oreprovider.war) that, on the fly, will generate Resource Maps serialized as Atom feeds (using OAI4J) for objects in Fedora. All you have to do in Fedora is to add information in RELS-EXT what datastreams belongs to which Resource Map (exactly how to do this can be seen at the projects web page).

OAI4J: OAI-PMH/OAI-ORE Software

The open source OAI4J software from the National Library of Sweden "can be used to harvest metadata from OAI-PMH compliant repositories. It can also be used to create new OAI-ORE Resource Maps from scratch, to parse existing ones and to serialize them to xml." You can download the client, which is written in Java, from SourceForge.
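
For readers unfamiliar with what harvesting from an OAI-PMH repository involves, here is a rough sketch using only the Python standard library. This is not OAI4J, which is written in Java, and its API is not shown here. The function names, the base URL, and the sample response are all hypothetical; the sketch also ignores resumption tokens, which a real harvester would need for large result sets.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Extract the header identifiers from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{{{OAI_NS}}}identifier")]


# A hand-written, abbreviated response used in place of a network call.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(record_identifiers(SAMPLE))
```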

ORE Specification and User Guide Released

At a March 3rd meeting at the Johns Hopkins University, the Open Archives Initiative (OAI) introduced the Object Reuse and Exchange (ORE) specifications. The ORE Specification and User Guide was released the prior day.

Here's an excerpt from the press release about the meeting:

The ORE specifications are developed in response to a significant challenge that has emerged in eScholarship. In contrast to the paper publications of traditional scholarship, or even their digital counterparts, the artifacts of eScholarship are complex aggregations. These aggregations consist of multiple resources with varying media types, semantics types, network locations, and intra- and inter-relationships. The future scholarly communication, research, and higher education infrastructure requires standardized approaches to identify, describe, and exchange these new outputs of scholarship.

The ORE specifications address this challenge with the ORE data model that defines how to associate an identifier, a URI, with aggregations of web resources. By referring to these identifiers, aggregations can then be linked to, cited, and described with metadata, in the same manner as any web resource. The ORE data model also makes it possible to describe the structure and semantics of these aggregations. The ORE specifications define how these descriptions can then be packaged in the XML-based Atom syndication format or in RDF/XML, making them available to a variety of applications.

In addition to their utility in eScholarship, the ORE specifications also apply to our everyday web use where we often encounter aggregations such as multi-page HTML documents, and collections of multi-format images on sites like flickr. OAI-ORE descriptions of these aggregations can be used to improve search engine behavior, provide input for browser-based navigation tools, and develop automated web services to analyze and preserve this information.

Read more about it at "The Vision of ORE."

Alpha Release of the ORE Specification and User Guide

The Open Archives Initiative Object Reuse and Exchange project has released an alpha version of the ORE Specification and User Guide. Comments can be made on the OAI-ORE discussion group or via email to ore@openarchives.org.

Here's an excerpt from the introduction:

The World Wide Web is built upon the notion of atomic units of information called resources that are identified with URIs such as http://www.openarchives.org/ore/0.1/toc (this page). In addition to these atomic units, aggregations of resources are often units of information in their own right. . . .

A mechanism to associate identities with these aggregations and describe them in a machine-readable manner would make them visible to Web agents, both humans and machines. This could be useful for a number of applications and contexts. For example:

Crawler-based search engines could use such descriptions to index information and provide search results sets at the granularity of the aggregations rather than their individual parts.

Browsers could leverage them to provide users with navigation aids for the aggregated resources, in the same manner that machine-readable site maps provide navigation clues for crawlers.

Other automated agents such as preservation systems could use these descriptions as guides to understand a "whole document" and determine the best preservation strategy.

Systems that mine and analyze networked information for citation analysis/bibliometrics could achieve better accuracy with knowledge of aggregation structure contained in these descriptions.

These machine-readable descriptions could provide the foundation for advanced scholarly communication systems that allow the flexible reuse and refactoring of rich scholarly artifacts and their components [Value Chains].

Compound Information Objects: An OAI-ORE Perspective

The Open Archives Initiative Object Reuse and Exchange project has released Compound Information Objects: An OAI-ORE Perspective by Carl Lagoze and Herbert Van de Sompel.

Here’s an excerpt from the document’s "Introduction and Motivation" section:

In summary, the web architecture expresses the notion of linked URI-identified resources. Information systems can leverage this architecture to publish the components of a compound object and thereby make them available to web clients and services. But due to the absence of commonly accepted standards, the notion of an identified compound object with a distinct boundary and typed relationships among its component resources is lost.

The absence of these standards affects the functionality of a number of existing and possible web services and applications. Crawler-based search engines might be more useful if the granularity of their result sets corresponded to compound objects (a book or chapter, in this example) rather than individual resources (single pages). The ranking algorithms of these search engines might improve if the links among the components of a compound object were treated differently than links to the object as a whole, or if the number of in-links to the various component resources was accumulated to the level of the compound object instead of counted separately. Citation analysis systems would also benefit from a mechanism for citing the compound object itself, rather than arbitrary parts of the object. Finally, a standard for representing compound objects might enable a new class of "whole object" services such as "preserve a compound object".
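
The in-link idea in the excerpt can be sketched in a few lines of Python. This is only one possible reading: it assumes a mapping from component URIs to compound-object URIs is already available (which is what a standard for compound objects would supply), and it chooses to ignore links between components of the same object. The function name and the example URIs are hypothetical.

```python
from collections import Counter


def inlinks_per_compound_object(links, membership):
    """Count in-links at the level of compound objects.

    links: iterable of (source_uri, target_uri) pairs.
    membership: dict mapping a component URI to its compound object URI;
        URIs absent from the dict are treated as standalone resources.
    """
    counts = Counter()
    for source, target in links:
        source_obj = membership.get(source, source)
        target_obj = membership.get(target, target)
        if source_obj == target_obj:
            # Link between components of the same object: not an in-link.
            continue
        counts[target_obj] += 1
    return counts


# Hypothetical book with two page resources.
membership = {
    "http://example.org/book/page1": "http://example.org/book",
    "http://example.org/book/page2": "http://example.org/book",
}
links = [
    ("http://example.com/a", "http://example.org/book/page1"),
    ("http://example.com/b", "http://example.org/book/page2"),
    ("http://example.org/book/page1", "http://example.org/book/page2"),
]
print(inlinks_per_compound_object(links, membership))
```

Here the two external links are credited to the book as a whole rather than split across its pages, and the page-to-page link is left out of the count.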

Recent Object Reuse and Exchange (ORE) Documents

In a previous posting, I discussed the Open Archives Initiative’s Object Reuse and Exchange (ORE) project. ORE is worth watching closely.

Two new documents were released this January:

"Report of the January 2007 ORE-TC Meeting," which is: "A detailed report of the results of the meeting of OAI-ORE Technical Committee describing features and requirements of the ORE model and its context in the Web Architecture."
"Open Repositories 2007," which is: "A presentation describing OAI-ORE and progress based on the January 2007 ORE Technical Committee Meeting."

OAI’s Object Reuse and Exchange Initiative

The Open Archives Initiative has announced its Object Reuse and Exchange (ORE) initiative:

Object Reuse and Exchange (ORE) will develop specifications that allow distributed repositories to exchange information about their constituent digital objects. These specifications will include approaches for representing digital objects and repository services that facilitate access and ingest of these representations. The specifications will enable a new generation of cross-repository services that leverage the intrinsic value of digital objects beyond the borders of hosting repositories. . . . its real importance lies in the potential for these distributed repositories and their contained objects to act as the foundation of a new digitally-based scholarly communication framework. Such a framework would permit fluid reuse, refactoring, and aggregation of scholarly digital objects and their constituent parts—including text, images, data, and software. This framework would include new forms of citation, allow the creation of virtual collections of objects regardless of their location, and facilitate new workflows that add value to scholarly objects by distributed registration, certification, peer review, and preservation services. Although scholarly communication is the motivating application, we imagine that the specifications developed by ORE may extend to other domains.

OAI-ORE is being funded by the Andrew W. Mellon Foundation for a two-year period.

Presentations from the Augmenting Interoperability across Scholarly Repositories meeting are a good source of further information about the thinking behind the initiative, as is the "Pathways: Augmenting Interoperability across Scholarly Repositories" preprint.