Linking, Linked Data, and Semantic Web – Page 2

"Linked Data Services for Theses and Dissertations"

Thomas Johnson and Michael Boock have self-archived "Linked Data Services for Theses and Dissertations" in ScholarsArchive at Oregon State University.

Here's an excerpt:

This paper details work at Oregon State University to create a Linked Dataset covering the University's theses and dissertations. Using data from existing MARC and Qualified Dublin Core records, we have established a process and model for crosswalking data from existing records into a variety of Semantic Web vocabularies. Our approach is to create basic services on a dedicated thesis and dissertation interface, incrementally extending those available through our institutional repository. We describe services implemented, those in progress and plans for continued work. We also address the limitations of our existing metadata and resulting challenges in crosswalking and interoperability.


Open Annotation Core Data Model

The Open Annotation Collaboration has released the draft "Open Annotation Core Data Model."

Here's an excerpt:

The Open Annotation Core Data Model specifies an interoperable framework for creating associations between related resources, annotations, using a methodology which conforms to the Architecture of the World Wide Web. Open Annotations can easily be shared between platforms, with sufficient richness of expression to satisfy complex requirements while remaining simple enough to also allow for the most common use cases, such as attaching a piece of text to a single web resource.

An Annotation is considered to be a set of connected resources, including a body and target, and conveys that the body is somehow about the target. The full model supports additional functionality, enabling semantic tagging, embedding content, selecting segments of resources, choosing the appropriate representation of a resource and providing styling hints for consuming clients.

See also the draft “Open Annotation Extension Specification.”
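
As a rough illustration of the body/target structure described in the excerpt, the following minimal sketch builds an annotation as three RDF-style triples and prints them as N-Triples. It is not taken from the draft itself: the `oa:` namespace shown is the one used by the later W3C Open Annotation work and is an assumption here, and every `example.org` URI is hypothetical.

```python
# Minimal sketch of an annotation as RDF-style triples (no RDF library needed).
# Assumption: the oa: terms use the later W3C Open Annotation namespace;
# all example.org URIs are hypothetical placeholders.
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def make_annotation(anno_uri, body_uri, target_uri):
    """Return the triples that connect an annotation to its body and target."""
    return [
        (anno_uri, RDF_TYPE, OA + "Annotation"),
        (anno_uri, OA + "hasBody", body_uri),      # the body is "about" the target
        (anno_uri, OA + "hasTarget", target_uri),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = make_annotation(
    "http://example.org/anno/1",
    "http://example.org/comments/42",
    "http://example.org/pages/report.html",
)
print(to_ntriples(triples))
```

Keeping the annotation, body, and target as three separate resources is what lets the body be "somehow about" the target without either one having to embed the other.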


Nature Publishing Group Launches Linked Data Platform and Puts Data in Public Domain

The Nature Publishing Group has launched a linked data platform.

Here's an excerpt from the press release:

Nature Publishing Group (NPG) today is pleased to join the linked data community by opening up access to its publication data via a linked data platform. NPG's Linked Data Platform is available at http://data.nature.com.

The platform includes more than 20 million Resource Description Framework (RDF) statements, including primary metadata for more than 450,000 articles published by NPG since 1869. In this first release, the datasets include basic citation information (title, author, publication date, etc) as well as NPG specific ontologies. These datasets are being released under an open metadata license, Creative Commons Zero (CC0), which permits maximal use/re-use of this data.


Linked Data for Libraries, Museums, and Archives: Survey and Workshop Report

The Council on Library and Information Resources has released Linked Data for Libraries, Museums, and Archives: Survey and Workshop Report.

Here's an excerpt:

In June 2011, Stanford University hosted a group of librarians and technologists to examine issues and challenges surrounding the use of linked data for library applications. This report summarizes the activities and discussions that took place during the workshop, describes what came out of the workshop, outlines next steps identified by the participants, and provides contextual and background information, including preliminary reports and biographies of workshop participants. The workshop report was produced and edited by the participants and staff at Stanford University Libraries.

As background for workshop participants, CLIR commissioned Jerry Persons, technology analyst at Knowledge Motifs and Chief Information Architect emeritus at Stanford, to produce a survey of the linked-data landscape, and the projects and individuals associated with it. The survey focuses on the practical aspects of understanding and applying linked data practices and technologies to the metadata and content of libraries, museums, and archives. There are numerous links in the report and the survey that lead readers to many other sources and examples regarding the use of linked data methods.


Library Linked Data Incubator Group Final Report

The W3C Incubator Group has released Library Linked Data Incubator Group Final Report.

Here's an excerpt:

Key recommendations of the report are:

That library leaders identify sets of data as possible candidates for early exposure as Linked Data and foster a discussion about Open Data and rights;

That library standards bodies increase library participation in Semantic Web standardization, develop library data standards that are compatible with Linked Data, and disseminate best-practice design patterns tailored to library Linked Data;

That data and systems designers design enhanced user services based on Linked Data capabilities, create URIs for the items in library datasets, develop policies for managing RDF vocabularies and their URIs, and express library data by re-using or mapping to existing Linked Data vocabularies;

That librarians and archivists preserve Linked Data element sets and value vocabularies and apply library experience in curation and long-term preservation to Linked Data datasets.


Three Persistent Identifier Studies Released

The Knowledge Exchange has released three persistent identifier studies.

Here's an excerpt from the announcement:

The studies have aimed to overcome the confusing variety of existing persistent identifier systems, by analysing the current national URN:NBN and other identifier initiatives; by providing guidelines for an international harmonized persistent identifier framework that serves the long-term preservation needs of the research and cultural heritage communities, and advise these communities about a roadmap to gain the potential benefits. This roadmap also includes a blueprint for an organisation for the distribution and maintenance of the Persistent Identifier infrastructure.


"Why Linked Data is Not Enough for Scientists"

Sean Bechhofer et al. have self-archived "Why Linked Data is Not Enough for Scientists" in the ECS EPrints Repository.

Here's an excerpt:

Scientific data stands to represent a significant portion of the linked open data cloud and science itself stands to benefit from the data fusion capability that this will afford. However, simply publishing linked data into the cloud does not necessarily meet the requirements of reuse. Publishing has requirements of provenance, quality, credit, attribution, methods in order to provide the reproducibility that allows validation of results. In this paper we make the case for a scientific data publication model on top of linked data and introduce the notion of Research Objects as first class citizens for sharing and publishing.

OpenURL Link Resolver: SFX 4.0 Released

The Ex Libris Group has released SFX 4.0.

Here's an excerpt from the press release:

Ex Libris® Group . . . is pleased to announce the general release of version 4.0 of its SFX® OpenURL link resolver, already deployed at over 1800 institutions in 53 countries. With the updated and enhanced administrative interface and the redesigned structure of the SFX KnowledgeBase, librarians benefit from streamlined workflows, new functionality, and more frequent KnowledgeBase updates for both hosted and local SFX installations.

New administrative functions—many of which are the direct result of feedback from the customer community—further emphasize the importance that libraries attribute to maintaining full control over the way in which they expose their e-collection to their users and brand the library’s scholarly services. And what’s more, SFX has been keeping up with the times. As the scholarly environment has evolved, configurations have been added to SFX to accommodate changes in library services and the development of new ones, such as the bX article recommender service.

University of Southampton’s School of Electronics and Computer Science Releases All Public Data in Open Linked Data Format

The University of Southampton's School of Electronics and Computer Science has released all of its public data in an open linked data format.

Here's an excerpt from the press release:

In what is believed also to be a world-first, ECS has become the UK’s first University department to release all its public data in open linked data format.

The School of Electronics and Computer Science (ECS) at the University of Southampton is at the forefront of the open linked data initiative through the work of its Professors Sir Tim Berners-Lee and Nigel Shadbolt.

Now, in accordance with the spirit of the initiative, ECS has released all its own data for public reuse. This includes data about research papers in the EPrints archive (announced this week in the official global rankings as one of the top ten in the world), people in the School, research groups, teaching modules, seminars and events, buildings and rooms.

All public (RDF) data from rdf.ecs.soton.ac.uk and eprints.ecs.soton.ac.uk is now available and can be reused for any legal purpose, including derivative works and commercial use. The School has opted for a creative commons public domain (CC0) license to allow the data to be reused.

Christopher Gutteridge, ECS Web Projects Manager, comments: "We believe that in the future this will become common practice for certain types of open data, and it is our responsibility to lead the way in setting the standards of best practice."

"We have decided not to make attribution of our data a legal requirement, as this makes it difficult to create large scale mashups."

"So, rather than 'MUST attribute', our policy is 'please attribute'. Obviously an attribution would be nice, but we don't want to restrict innovation by requiring it under all circumstances."

Professor Nigel Shadbolt comments: "The University of Southampton has pioneered some of the most important developments in the Semantic Web and Open Access in recent years. This announcement will ensure more data is released in the right format to enable new innovative uses of the information."

"This kind of open data policy will become the standard by which all public institutions are judged. Working with the UK government over the past year Tim Berners-Lee and I have been looking to change everyone's attitude to data. Publicly-held non-personal data is now being released all over the country and as this continues we'll see innovation to exploit it and applications that use it." . . .

More information on the available data from ECS: http://id.ecs.soton.ac.uk/docs/.

"The Semantic Web, Linked and Open Data: A Briefing Paper"

JISC CETIS has released "The Semantic Web, Linked and Open Data: A Briefing Paper."

Here's an excerpt:

This briefing paper will provide a high level overview of key concepts relating to the Semantic Web, semantic technologies, linked and open data; along with references to relevant examples and standards. The briefing is intended to provide a starting point for those within the teaching and learning community who may have come across the concept of semantic technologies and the Semantic Web but who do not regard themselves as experts and wish to learn more. The examples and links are intended as starting points for further exploration.

Knowledge = Information in Context

Europeana has released Knowledge = Information in Context.

Here's an excerpt from the announcement:

Europeana's first White Paper looks at the key role linked data will play in Europeana's development and in helping Europe's citizens make connections between existing knowledge to achieve new cultural and scientific developments. Without linked data, Europeana could be seen as a simple collection of digital objects. With linked data, the potential is far greater, as the author of the white paper, Prof. Stefan Gradmann, explains.

NISO Recommended Practice: KBART: Knowledge Bases and Related Tools

NISO has released KBART: Knowledge Bases and Related Tools (NISO RP-9-2010).

Here's an excerpt from the announcement:

UKSG and NISO are pleased to announce the first report by the KBART (Knowledge Bases and Related Tools) Working Group, a joint initiative that is exploring data problems within the OpenURL supply chain. The KBART Recommended Practice (NISO RP-9-2010) contains practical recommendations for the timely exchange of accurate metadata between content providers and knowledge base developers.

The KBART Recommended Practice, a report from Phase I of the KBART project, provides all parties in the information supply chain with straightforward guidance about the role of metadata within the OpenURL linking standard, and recommends data formatting and exchange guidelines for publishers, aggregators, agents, technology vendors, and librarians to adhere to when exchanging information about their respective content holdings.

Shared OpenURL Data Infrastructure Investigation: Final Report

JISC has released the Shared OpenURL Data Infrastructure Investigation: Final Report.

Here's an excerpt:

The project team set out to gain a good understanding of the technical, legal, and administrative challenges and opportunities related to sharing and using OpenURL link server data and to assess the relative and complementary value of data from the OpenURL router and from OpenURL resolvers within institutions by gathering and inspecting those data. We also sought to explore potential uses of these data through consultation and through manipulating the sample data available. Our conclusions are organised by four themes: (1) the level of interest and viability of services based on aggregated OpenURL data; (2) libraries' willingness to share data; (3) the availability of OpenURL resolver usage data; and (4) the value of the OpenURL Router as a source of data on which useful services may be built.

“RKBExplorer: Repositories, Linked Data and Research Support”

Hugh Glaser, Ian Millard, and Les Carr have self-archived "RKBExplorer: Repositories, Linked Data and Research Support" in the ECS EPrints Repository.

Here's an excerpt:

RKBExplorer (http://rkbexplorer.com/) is a system for publishing Linked Data to Semantic Web standards, also providing a browser that allows users to explore this interlinked Web of Data, primarily in the domain of scientific endeavour. As part of the activity, we have harvested the metadata from a number of the larger ePrints repositories into http://eprints.rkbexplorer.com, and republished it as Linked Data. This allows the RKBExplorer browser to present a unified view of these repositories and related data from other sources such as dblp and dbpedia (a Semantic Web version of Wikipedia). Users can thus investigate concepts related to the ePrints people and articles, such as related people, projects and institutions.

Podcast: Interview with Herbert van de Sompel

Talis has released a podcast of an interview with Herbert van de Sompel, Digital Library Researcher at the Research Library of the Los Alamos National Laboratory, about SFX, OAI, and digital repositories.

Handle System Workshop Presentations Available

Presentations from the Corporation for National Research Initiatives' Handle System Workshop are now available.

Here's a description of the Handle System from its home page:

The Handle System is a general purpose distributed information system that provides efficient, extensible, and secure HDL identifier and resolution services for use on networks such as the Internet. It includes an open set of protocols, a namespace, and a reference implementation of the protocols. The protocols enable a distributed computer system to store identifiers, known as handles, of arbitrary resources and resolve those handles into the information necessary to locate, access, contact, authenticate, or otherwise make use of the resources. This information can be changed as needed to reflect the current state of the identified resource without changing its identifier, thus allowing the name of the item to persist over changes of location and other related state information.
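
The persistence idea in that description can be sketched with a toy, in-memory registry: the handle stays fixed while the location stored behind it changes. This is only an illustration, not the Handle System's protocols or reference implementation; the class name, the handle `12345/thesis-001`, and the `example.org` URLs are hypothetical. The one real-world detail assumed is that handles are commonly resolved through the public proxy at `hdl.handle.net`.

```python
from urllib.parse import quote


class ToyHandleRegistry:
    """Toy stand-in for a handle service: maps a handle to its current state."""

    def __init__(self):
        self._records = {}

    def register(self, handle, url):
        # Store the initial location under the persistent identifier.
        self._records[handle] = {"URL": url}

    def update_location(self, handle, new_url):
        # The identifier is untouched; only the stored location changes.
        if handle not in self._records:
            raise KeyError(handle)
        self._records[handle]["URL"] = new_url

    def resolve(self, handle):
        # Return whatever location is current for this handle.
        return self._records[handle]["URL"]


def proxy_url(handle):
    """Build a resolver link via the public proxy (handle value is hypothetical)."""
    return "https://hdl.handle.net/" + quote(handle, safe="/")


registry = ToyHandleRegistry()
registry.register("12345/thesis-001", "https://repository.example.org/old/thesis-001")
registry.update_location("12345/thesis-001", "https://repository.example.org/new/thesis-001")
print(registry.resolve("12345/thesis-001"))
print(proxy_url("12345/thesis-001"))
```

Anything that cited the handle before the move still resolves afterwards, which is the property the description calls persistence "over changes of location."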

reSearcher: Open Source Citation Management, Federated Searching, Link Resolution, and Serials Management

Simon Fraser University Library's Linux-based reSearcher, which is widely used in Canada, is an open source software suite that includes:

Citation Manager: "Citation Manager allows faculty, students and staff to quickly and accurately capture citations or references from library resources into their own personal, online database."
CUFTS (serials management): "As a knowledgebase of over 375 fulltext resources, CUFTS provides Electronic Resource Management services, an integrated serials database, link resolving, and MARC records for your library."
dbWiz (federated searching): "dbWiz provides library users with a single interface for searching a wide range of library resources, and returns records in an integrated result listing."
GODOT (link resolution): "Launched from a link embedded in your library's citation databases or other resources, GODOT provides direct links to your fulltext collections, using the CUFTS knowledge base, and also reveals holdings in your catalogue or in other locations."

Digital Library Federation and 10 Vendors/Developers Reach Accord about ILS Basic Discovery Interfaces

Ten vendors and application developers have agreed to support standard ILS interfaces that will permit integration and interoperability with emerging discovery services. These interfaces will be developed by the Digital Library Federation's ILS-Discovery Interface Committee. The participants are AquaBrowser, BiblioCommons, California Digital Library, Ex Libris, LibLime, OCLC, Polaris Library Systems, SirsiDynix, Talis, and VTLS.

Here's an excerpt from the announcement:

On March 6, representatives of the Digital Library Federation (DLF), academic libraries, and major library application vendors met in Berkeley, California to discuss a draft recommendation from the DLF for standard interfaces for integrating the data and services of the Integrated Library System (ILS) with new applications supporting user discovery. Such standard interfaces will allow libraries to deploy new discovery services to meet ever-growing user expectations in the Web 2.0 era, take full advantage of advanced ILS data management and services, and encourage a strong, innovative community and marketplace in next-generation library management and discovery applications.

At the meeting, participants agreed to support a set of essential functions through open protocols and technologies by deploying specific recommended standards.

These functions are:

Harvesting. Functions to harvest data records for library collections, both in full, and incrementally based on recent changes. Harvesting options could include either the core bibliographic records, or those records combined with supplementary information (such as holdings or summary circulation data). Both full and differential harvesting options are expected to be supported through an OAI-PMH interface.

Availability. Real-time querying of the availability of a bibliographic (or circulating) item. This functionality will be implemented through a simple REST interface to be specified by the ILS-DI task group.

Linking. Linking in a stable manner to any item in an OPAC in a way that allows services to be invoked on it; for example, by a stable link to a page displaying the item's catalog record and providing links for requests for that item. This functionality will be implemented through a URL template defined for the OPAC as specified by the ILS-DI task group.
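
The full and differential harvesting options can be sketched as OAI-PMH request URLs. The sketch below uses the protocol's standard `ListRecords` verb with its `metadataPrefix`, `from`, and `resumptionToken` arguments; the base URL, the date, and the function name are hypothetical, and the ILS-DI recommendation may specify details that differ from this.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    Omitting from_date requests a full harvest; supplying it requests only
    records changed since that date (a differential harvest).
    """
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: it continues
        # a previous list request and is sent without the other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical ILS endpoint, used only for illustration.
BASE = "https://ils.example.org/oai"
full_harvest = list_records_url(BASE)
incremental = list_records_url(BASE, from_date="2008-01-15")
print(full_harvest)
print(incremental)
```

A discovery service would typically run the full harvest once and then schedule the incremental form, following resumption tokens until the repository stops returning one.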

Citation, Location, and Deposition in Discipline & Institutional Repositories

The JISC CLADDIER project has published Citation, Location, and Deposition in Discipline & Institutional Repositories: CLADDIER Project Report III, Recommendations for Data/Publication Linkage.

Here's an excerpt from the abstract:

A key aim of the CLADDIER project is to investigate the cross-linking and citation of resources (in particular data and their associated publications) held in institutional and subject-based repositories within the research sector. Typically traditional citations are partial in that they are "backward citations", referring to work which influenced the current research, and they only cite other formal publications, ignoring other artefacts which are the output of research, in particular research data. Online repositories storing more dynamic digital objects give the opportunity to provide a more complete picture of the relationships between them, with backward and forward citations to data and publications being propagated between repositories.

This report motivates the cross-citations of data from the CLADDIER use case example, and considers the approaches which have been implemented to harvest and propagate citation information. Most of these existing approaches depend on centralised services, which were considered unsatisfactory in an environment where independent repositories wish to maintain control of their resources and do not wish to be dependent on third-party services. Criteria are identified for building a Citation Notification Service to propagate citation references and links between repositories, including using a peer-to-peer protocol. A number of different architectures are proposed and evaluated.

The requirement for a light-weight peer-to-peer service which is as widely applicable as possible led to the selection of Linkback services, in particular Trackback which provides an existing simple specification which can be implemented quickly and adapted to the requirements of citation notification. A detailed description of the Trackback protocol is then given, together with the design of the adaptations and extensions identified as required for citation notification. This extended Trackback protocol has been implemented in the STFC ePubs institutional repository; this implementation is described and a use case is described.

Geoffrey Bilder has commented on the report in "CLADDIER Final Report."
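
To give a feel for how light-weight a Trackback-style notification is, here is a minimal sketch of a standard Trackback ping: a form-encoded HTTP POST carrying the citing resource's URL and a few optional fields. It shows the base Trackback ping only, not the CLADDIER extensions described in the report; the function name and both `example.org` URLs are hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trackback_ping(ping_url, citing_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) a standard Trackback ping request.

    The 'url' field is required by Trackback; title, excerpt, and blog_name
    are optional and are only included when non-empty.
    """
    fields = {"url": citing_url}
    if title:
        fields["title"] = title
    if excerpt:
        fields["excerpt"] = excerpt
    if blog_name:
        fields["blog_name"] = blog_name
    data = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"}
    return Request(ping_url, data=data, headers=headers, method="POST")


# Hypothetical repositories: repo-a holds a paper that cites a dataset in repo-b.
ping = build_trackback_ping(
    "https://repo-b.example.org/trackback/dataset/12",
    "https://repo-a.example.org/paper/7",
    title="Citing paper",
)
print(ping.get_method(), ping.full_url)
print(ping.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) would deliver the notification directly from one repository to the other, with no central service in between.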

OpenURL Referrer Now Available for Both Firefox and IE

OCLC now offers its OpenURL Referrer plug-in for both Firefox and Internet Explorer. It provides OpenURL links to local electronic resources for COinS-enabled (Context Objects in Spans) Web documents, Google Scholar search results, and Google News Archive search results.
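
For readers unfamiliar with COinS, the sketch below shows roughly what a plug-in like this looks for: an empty HTML span with class `Z3988` whose `title` attribute carries an OpenURL context object in key/encoded-value form. The key names follow the OpenURL (Z39.88-2004) journal format; the function name and the citation values are hypothetical, and this is not OCLC's code.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(atitle, jtitle, date, volume=None, spage=None):
    """Return a COinS span describing a journal article."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.date", date),
    ]
    if volume:
        pairs.append(("rft.volume", volume))
    if spage:
        pairs.append(("rft.spage", spage))
    # URL-encode the key/value pairs, then HTML-escape the result so the
    # ampersands are safe inside the title attribute.
    kev = urlencode(pairs)
    return f'<span class="Z3988" title="{escape(kev, quote=True)}"></span>'


# Hypothetical citation, used only for illustration.
span = coins_span("An Example Article", "Journal of Examples", "2008", volume="12", spage="34")
print(span)
```

A referrer tool can read the `title` value back out and append it to the base URL of the reader's own link resolver, which is how the same page yields links to different libraries' holdings.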

Link Resolvers and the Serials Supply Chain Report

The UK Serials Group has issued a report by James Culling titled Link Resolvers and the Serials Supply Chain: Final Report for UKSG.

Here’s a summary of major issues and barriers from the "Summary of Findings":

Whilst some content providers are very aware of the role of link resolvers and the significance of data feeds to them for driving traffic to their content, there remains a significant number that do not make their collection details available to resolver suppliers at all, simply through not realising that this is a desirable thing to do.


Whilst link resolver suppliers state that the level of co-operation from some publishers is still not all that it might be, many publishers comment that a lack of open engagement and transparency regarding knowledge base requirements from the link resolver suppliers (as a group) has been problematic for them.

Where data is provided to link resolver suppliers and libraries by content providers, a lack of understanding or appreciation as to the use to which the data will be put may be a factor in incompleteness and inaccuracy.

Most of the link resolver suppliers have separately invested much time and staff resource in working around difficulties with data from content providers, rather than trying to address the problems at source. Many have concluded that full text aggregators in particular focus their energies in other areas and metadata accuracy is never (voluntarily at least) going to be of high concern to them.

Competition between organisations in the supply chain sometimes hinders co-operation and data sharing.

There is a lack of clarity and transparency in the supply chain regarding: standards for data formats, expected frequency of data updates, construction of inbound linking syntaxes and OpenURL support. These issues hinder broader adoption and limit the pace of information transfer through the supply chain, restricting the potential of link resolver systems.

Whilst the community’s attention has been mostly focused on what it means to be OpenURL compliant, a code of practice and information standards to ensure optimal knowledge base compliance have been sorely absent and overlooked.