Archive for the 'Metadata' Category

CrossRef’s Geoffrey Bilder on Author Identifiers

Posted in Metadata on April 15th, 2009

Gobbledygook has interviewed CrossRef's Geoffrey Bilder about author identifiers.

Here's an excerpt:

Of course, lots of the same issues can be raised with CrossRef, right? What guarantees that CrossRef won’t become evil and co-opt all of our identities? This, of course is the big fear underlining the knee-jerk reaction against "centralized systems" in favor of "distributed systems". The problem with this, as I mentioned in the FriendFeed thread is that my personal and unfashionable observation is that "distributed" begets "centralized." For every distributed service created, we’ve then had to create a centralized service to make it useable again (ICANN, Google, Pirate Bay, CrossRef, DOAJ, ticTocs, WorldCat, etc.). This gets us back to square one and makes me think the real issue is- how do you make the centralized system that eventually emerges accountable? This is, of course, a social issue more than a technical issue and involves making sure that whatever entity emerges has clearly defined data portability policies and a "living will" that attempts to guarantee that the service can be run in perpetuity- even if by another organization. For the record, I don’t think adopting the slogan "don’t be evil" is enough ;).

“On the Communication of Scientific Results: The Full-Metadata Format”

Posted in Metadata on April 14th, 2009

Moritz Riede, Rico Schueppel, Kristian O. Sylvester-Hvid, et al. have self-archived "On the Communication of Scientific Results: The Full-Metadata Format" in arXiv.

Here's an excerpt:

In this paper, we introduce a scientific format for text-based data files, which facilitates storing and communicating tabular data sets. The so-called Full-Metadata Format builds on the widely used INI-standard and is based on four principles: readable self-documentation, flexible structure, fail-safe compatibility, and searchability. As a consequence, all metadata required to interpret the tabular data are stored in the same file, allowing for the automated generation of publication-ready tables and graphs and the semantic searchability of data file collections. The Full-Metadata Format is introduced on the basis of three comprehensive examples. The complete format and syntax is given in the appendix.
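
To make the idea of an INI-style file that carries its own metadata concrete, here's a minimal Python sketch. It is not taken from the paper and does not follow the actual Full-Metadata Format syntax: the section names (`reference`, `columns`, `data`), the sample values, and the `parse_ini_table` helper are all assumptions made up for illustration.

```python
# Illustrative only: a toy INI-style layout, NOT the real Full-Metadata Format.
# Section names and sample values below are assumptions for this sketch.
SAMPLE = """\
; metadata and table live in the same text file
[reference]
title = Example measurement
creator = A. Researcher

[columns]
voltage = V [V]
current = I [mA]

[data]
0.0\t0.00
0.5\t1.25
1.0\t2.50
"""


def parse_ini_table(text, data_section="data"):
    """Split an INI-style text into metadata sections and a numeric table."""
    metadata = {}
    rows = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            if section != data_section:
                metadata[section] = {}
            continue
        if section is None:
            raise ValueError(f"content before first section: {line!r}")
        if section == data_section:
            # data rows are whitespace-separated numbers
            rows.append([float(cell) for cell in line.split()])
        else:
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected 'key = value', got: {line!r}")
            metadata[section][key.strip()] = value.strip()
    return metadata, rows


metadata, rows = parse_ini_table(SAMPLE)
print(metadata["reference"]["title"], len(rows))
```

Because the column descriptions sit next to the numbers, a script like this could label a table or plot without consulting any outside documentation, which is the kind of self-documentation the excerpt describes.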

Indiana University Digital Library Program Releases IN Harmony Sheet Music Cataloging Tool

Posted in Metadata, OAI-ORE, Open Source Software on April 14th, 2009

The Indiana University Digital Library Program has released the IN Harmony Sheet Music Cataloging Tool.

Here's an excerpt from the tool's page:

The IN Harmony Sheet Music Cataloging Tool is an open source tool developed by the Indiana University Digital Library Program with funding from the Institute of Museum and Library Services as part of the IN Harmony: Sheet Music From Indiana project. This tool has been designed to assist libraries, archives, museums, and individual collectors describe their sheet music collections in a robust and standards-based way. This is a production system of the Indiana University Digital Library Program and was used to catalog more than 10,000 pieces of sheet music for the IN Harmony project.

The tool collects descriptive metadata about sheet music and exports it in the MODS, simple Dublin Core, and OAI-PMH Static Repository formats.

“Aligning METS with the OAI-ORE Data Model”

Posted in Metadata, OAI-ORE on April 8th, 2009

Jerome P. McDonough has made "Aligning METS with the OAI-ORE Data Model" available in IDEALS.

Here's an excerpt:

The Open Archives Initiative Object Reuse and Exchange (OAI-ORE) specifications provide a flexible set of mechanisms for transferring complex data objects between different systems. In order to serve as an exchange syntax, OAI-ORE must be able to support the import of information from localized data structures serving various communities of practice. In this paper, we examine the Metadata Encoding & Transmission Standard (METS) and the issues that arise when trying to map from a localized structural metadata schema into the OAI-ORE data model and serialization syntaxes.

Special Issue of Library Trends on Institutional Repositories

Posted in Big Data, Data Curation, Open Data, and Research Data Management, Digital Curation/Digital Preservation, Digital Repositories, Institutional Repositories, Metadata on March 30th, 2009

The latest issue of Library Trends (57, no. 2, Fall 2008) is about institutional repositories.

Here are the articles (links are to article preprints):

  • "Introduction: Institutional Repositories: Current State and Future"
  • "Innkeeper at the Roach Motel"
  • "Institutional Repositories in the UK: The JISC Approach"
  • "Strategies for Institutional Repository Development: A Case Study of Three Evolving Initiatives"
  • "Perceptions and Experiences of Staff in the Planning and Implementation of Institutional Repositories"
  • "Institutional Repositories and Research Data Curation in a Distributed Environment"
  • "At the Watershed: Preparing for Research Data Management and Stewardship at the University of Minnesota Libraries"
  • "Case Study in Data Curation at Johns Hopkins University"
  • "Describing Scholarly Works with Dublin Core: A Functional Approach"
  • "The 'Wealth of Networks' and Institutional Repositories: MIT, DSpace, and the Future of the Scholarly Commons"
  • "Leveraging Short-term Opportunities to Address Long-term Obligations: A Perspective on Institutional Repositories and Digital Preservation Programs"
  • "Shedding Light on the Dark Data in the Long Tail of Science"

Andy Powell on Persistent URIs and Digital Repositories

Posted in Digital Repositories, Institutional Repositories, Metadata on March 4th, 2009

In “How Uncool? Repository URIs. . .,” Andy Powell analyzes URI structure in 107 repositories to determine whether their items’ URIs are likely to be persistent.

Here's an excerpt:

So what is an uncool URI? An uncool URI is one that is unlikely to be persistent, typically because the person who first assigned it didn’t think hard enough about likely changes in organisational structure, policy or technology and the impact that changes in those areas might have on the persistence of the URI into the future.
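
As a rough illustration of the kind of check this definition suggests, here's a small Python sketch that flags URI features tied to a particular technology. The token list, the extension list, and the `fragility_hints` function are assumptions for this example; they are not Powell's criteria, and a flagged URI is not necessarily a fragile one.

```python
from urllib.parse import urlsplit

# Hypothetical hint lists, chosen for illustration only.
TECH_TOKENS = ("dspace", "eprints", "fedora", "cgi-bin")
TECH_EXTENSIONS = (".php", ".jsp", ".asp", ".cgi")


def fragility_hints(uri):
    """Return reasons a URI might break if software or policy changes."""
    parts = urlsplit(uri)
    haystack = (parts.netloc + parts.path).lower()
    hints = []
    for token in TECH_TOKENS:
        if token in haystack:
            hints.append(f"names software or technology: {token}")
    if parts.path.lower().endswith(TECH_EXTENSIONS):
        hints.append("path ends in a technology-specific file extension")
    if parts.query:
        hints.append("depends on a query string")
    return hints


for uri in ("http://dspace.example.org/view.php?id=42",
            "http://example.org/id/1234"):
    print(uri, fragility_hints(uri))
```

The first sample URI trips all three checks, while the second, which exposes nothing about the platform behind it, trips none.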

Automatic Metadata Generation for Repositories: MetaTools: Final Report

Posted in Digital Repositories, Institutional Repositories, Metadata on February 26th, 2009

JISC has released MetaTools: Final Report.

Here's an excerpt from the announcement:

Automatic metadata generation has sometimes been posited as a solution to the 'metadata bottleneck' that repositories and portals are facing as they struggle to provide resource discovery metadata for a rapidly growing number of new digital resources. Unfortunately there is no registry or trusted body of documentation that rates the quality of metadata generation tools or identifies the most effective tool(s) for any given task.

The aim of the first stage of the project was to remedy this situation by developing a framework for evaluating tools used for the purpose of generating Dublin Core metadata. . . .

A test program was then implemented using metrics from the framework. It evaluated the quality of metadata generated from 1) Web pages (html) and 2) scholarly works (pdf) by four of the more widely-known metadata generation tools—Data Fountains, DC-dot, SamgI, and the Yahoo! Term Extractor. . . .

It was found that the output from Data Fountains was generally superior to that of the other tools that the project tested. But the output from all of the tools was considered to be disappointing and markedly inferior to the quality of metadata that Tonkin and Muller report that PaperBase has extracted from scholarly works. Over all, the prospects for generating high-quality metadata for scholarly works appear to be brighter because of their more predictable layout. . . .

In the third stage of the project SOAP and RESTful Web Service interfaces were developed for three metadata generation tools—Data Fountains, SamgI and Kea. This had a dual purpose. Firstly, the creation of an optimal metadata record usually requires the merging of output from several tools each of which, until now, had to be invoked separately because of the ad hoc nature of their interfaces. As Web services, they will be available for use in a network such as the Web with well-defined interfaces that are implementation-independent. These services will be exposed for use by clients without them having to be concerned with how the service will execute their requests. Repositories should be able to plug them into their own cataloguing environments and experiment with automatic metadata generation under more 'real-life' circumstances than hitherto. Secondly, and more importantly (in view of the relatively poor quality of current tools) they enabled the project to experiment with the use of a high-level ontology for describing metadata generation tools.

ARL Report: Ad Hoc Task Force to Review the Proposed OCLC Policy for Use and Transfer of WorldCat Records

Posted in ARL Libraries, Metadata, OCLC on February 22nd, 2009

The Association of Research Libraries has released Ad Hoc Task Force to Review the Proposed OCLC Policy for Use and Transfer of WorldCat Records: Final Report to the ARL Board.

Here's an excerpt from the press release:

OCLC's release of the policy elicited serious concern from the ARL community, as well as the broader library community. As a result, ARL established an ad hoc task force to review the policy and identify issues of particular interest to research libraries. The task force report includes a brief overview of the policy and the task force's understanding of the policy's intent. This is followed by an explication of specific issues and the task force's findings regarding both the policy itself and the implementation process. The report concludes with recommendations for OCLC and the library community.

Metadata Tool under Development: Update on Duke's Project Trident

Posted in Digital Repositories, Metadata on February 17th, 2009

Will Sexton has posted an update about Duke University Libraries' Project Trident, which is developing a metadata tool for use with its digital repository.

Here's an excerpt:

Our specifications essentially treat the Editor as a client of the Repository, and the Repository not as an implementation, but as an API with RESTful bindings. . . .

We took this approach to promote the modularity of the Editor tool. Other organizations or institutions may choose to implement the RESTful API “on top of” their local repository implementation, and adopt the Trident Editor for their own needs. We intend to implement our Repository module using Fedora, but the Editor module should be reusable with a variety of repository platforms.
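
The design described here, an editor that only knows the repository as an interface, can be sketched in a few lines of Python. Everything below is hypothetical: the `Repository` protocol, its two methods, and the in-memory stand-in are assumptions for illustration and do not reflect Project Trident's actual API or its RESTful bindings.

```python
from typing import Protocol


class Repository(Protocol):
    """Hypothetical interface the editor depends on (not Trident's real API)."""

    def get_metadata(self, item_id: str) -> dict: ...

    def put_metadata(self, item_id: str, record: dict) -> None: ...


class InMemoryRepository:
    """Stand-in backend; a real one might wrap HTTP calls to a repository."""

    def __init__(self):
        self._records = {}

    def get_metadata(self, item_id):
        # return a copy so callers cannot mutate stored state directly
        return dict(self._records.get(item_id, {}))

    def put_metadata(self, item_id, record):
        self._records[item_id] = dict(record)


class Editor:
    """Client that talks to any object satisfying the Repository interface."""

    def __init__(self, repository: Repository):
        self._repository = repository

    def set_field(self, item_id, field, value):
        record = self._repository.get_metadata(item_id)
        record[field] = value
        self._repository.put_metadata(item_id, record)
        return record


repo = InMemoryRepository()
editor = Editor(repo)
editor.set_field("item-1", "title", "Untitled photograph")
print(repo.get_metadata("item-1"))
```

Swapping `InMemoryRepository` for a class that makes the same two calls against another platform would leave `Editor` untouched, which is the modularity the excerpt is after.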

Briefing Paper on UMID—Unique Material Identifier

Posted in Digital Curation/Digital Preservation, Metadata on February 13th, 2009

DigitalPreservationEurope has released a briefing paper on UMID—Unique Material Identifier.

Here's an excerpt from the announcement:

A Unique Material Identifier (UMID) is a special code that is used to identify audiovisual (AV) materials. The UMID is a core component in tagging AV content to enable its reliable access and tracking, especially in networked storage, production and dissemination systems. Its purpose is to provide unambiguous identification of material through the production and emission chain, as well as to make possible the reliable linking of essence with its metadata. The UMID is a locally generated and globally unique identifier, standardised by the Society of Motion Picture and Television Engineers (SMPTE), and presents a key component in digital media asset management systems.

LC Releases Understanding PREMIS

Posted in Digital Curation/Digital Preservation, Metadata on February 3rd, 2009

The Library of Congress Network Development and MARC Standards Office has released Understanding PREMIS.

Here's an excerpt:

This guide is a relatively brief overview of the PREMIS preservation metadata standard. It will not give you enough information to implement PREMIS, but it will give you some idea of what PREMIS is all about. For many readers, this will be enough. For those who do need to master the PREMIS Data Dictionary for Preservation Metadata, this guide may serve as a gentle introduction that makes the larger document feel more familiar.

New from DLF: Future Directions in Metadata Remediation for Metadata Aggregators

Posted in Metadata on February 2nd, 2009

The Digital Library Federation has published Future Directions in Metadata Remediation for Metadata Aggregators.

Here's an excerpt:

With support from The Gladys Krieble Delmas Foundation, the Digital Library Federation embarked on a project to inventory existing tools and services for metadata mapping, remediation, and enhancement. Once identified, tools were evaluated for general applicability across digital library and other cultural heritage environments.

The results of the research show that a handful of tools are usable as-is, but many tools need more work to be generally applicable in a variety of environments and significant development would be required to create a robust and well-defined set of metadata remediation services. Key points of note:

  • Relatively few tools are available that can work directly on metadata records rather than full text, and those that are available need to be customized for each aggregator.
  • Workable tools are available for date normalization, and also for normalizing and matching coordinates to U.S. geographic names.
  • A statistical topic model program for subject clustering has been developed.
  • Both named entity and topical keyword extraction remain problematic, with a fairly high percentage of errors.
  • Authority files may be used to break up pre-coordinated Library of Congress subject strings into topical, name, geographic, temporal, and genre facets to improve searching.
  • Mappings between different thesauri, which should allow for better search processing in aggregations containing multiple subject vocabularies, are still under development.
  • Infrastructure for work collocation, appropriate to aggregators with significant published materials, is still underdeveloped and will probably need to wait for the widespread adoption of the new standard for resource description, Resource Description and Access (RDA).
  • Unambiguous identifiers for entities such as names and works would be useful when the community infrastructure is developed, but are not yet supported by most metadata formats.
  • Unambiguous, machine-actionable rights statements are also an area where the community infrastructure is still under development.
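
To show what the date normalization mentioned in the list above might involve, here's a minimal Python sketch that maps a few free-text date patterns to ISO 8601. It is not one of the tools the report evaluated; the `normalize_date` function and its list of accepted formats are assumptions for illustration, and the month-name patterns assume English month names (Python's default locale).

```python
from datetime import datetime

# Hypothetical set of input patterns; real aggregations see many more.
KNOWN_FORMATS = (
    "%Y-%m-%d",    # 2009-04-15
    "%B %d, %Y",   # April 15, 2009
    "%d %B %Y",    # 15 April 2009
    "%m/%d/%Y",    # 04/15/2009 (assumes month-first order)
)


def normalize_date(text):
    """Return an ISO 8601 date string, or None if no pattern matches."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next pattern
    return None


for value in ("April 15, 2009", "15 April 2009", "04/15/2009", "circa 1900"):
    print(repr(value), "->", normalize_date(value))
```

Returning `None` for values such as "circa 1900" keeps uncertain dates visible for manual review instead of guessing at them.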

DigitalKoans

Digital Scholarship

Copyright © 2005-2012 by Charles W. Bailey, Jr.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.