Australian National University's Harvester Service Released

The Australian National University has released its Harvester Service.

Here's an excerpt from the announcement:

The Harvester Service is a proxy harvester for processing and routing OAI-PMH Data Provider responses to various applications. It is intended to be used for integration with other applications requiring a harvesting service.
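For readers unfamiliar with what a harvester asks of an OAI-PMH Data Provider, here is a minimal sketch of one request/response step. It is not the ANU Harvester Service's code; the base URL `http://example.org/oai` and the function names are hypothetical, and only the standard `ListRecords` verb, `metadataPrefix`, and `resumptionToken` arguments from OAI-PMH 2.0 are assumed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    When a resumption token is supplied it is the only argument sent
    besides the verb, as the protocol requires.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# A hand-written sample response, not output from a real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_list_records_url("http://example.org/oai"))
    print(parse_list_records(SAMPLE))
```

A proxy harvester would presumably repeat this step until no resumption token is returned, then route the collected records to the consuming application.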

EM-Loader: Making Self-Archiving Easier

Building on the work of the SWORD Project, the EM-Loader project will build software that allows authors to use the metadata from PublicationsList.org to deposit works in the Depot.

Here's an excerpt from the announcement:

We will show proof of concept at an early stage by building a web service module that connects two existing services: the Depot, the JISC repository for researchers who do not have other provision; and PublicationsList.org, a service for researchers to build a web page listing their publications. Instead of recreating interoperability standards from scratch, the project has adopted and expanded the SWORD Deposit API.

In our revised approach we suggest that depositing papers into repositories can be made easier and more rewarding for researchers by concentrating initially on compiling a personal publications list with complete metadata and then performing a batch submission to the repository.

Traditionally stage 1—compiling a personal bibliography—is by manual entry, but this can be made much easier with batch search and select of items from citation databases such as Web of Science and PubMed, and import from personal bibliography tools such as BibTeX, EndNote and Reference Manager. Full text of papers can be uploaded and attached to metadata in stage 2 (typically in PDF or DOC formats).

Functionality for stages 1 and 2 already exists and is provided to this project through PublicationsList.org. The main focus of our project activity is to build the workflow to enable all the structured metadata to be forwarded to the appropriate repository, alongside the associated digital object (full text) where available.

Read more about it at: EM-Loader and the EM-Loader proposal.

(Thanks to Open Access News.)

D-NET Version 1.0 Released by Digital Repository Infrastructure Vision for European Research

DRIVER (Digital Repository Infrastructure Vision for European Research) has released version 1.0 of D-NET.

Here's an excerpt from the announcement:

The first of its kind, this open source software offers a tool-box for deploying a customizable distributed system featuring tools for harvesting and aggregating heterogeneous data sources. A variety of end-user functionalities are applied over this integration, ranging from search, recommendation, collections, profiling to innovative tools for repository manager users. . . .

The DRIVER D-NET v. 1.0 software is released under the Open Source Apache license with accompanying documentation, and with (limited to capacity) technical support by the DRIVER Consortium technical partners. . . .

In particular, the DRIVER software can be used for two main reasons:

  • Deploying new services on top of an operational DRIVER infrastructure: Running instances of the DRIVER Infrastructure can be enriched at any moment with new service instances so as to empower or expand the available functionalities. Examples are:
    1. Deployment and configuration of customized portals for designated communities over the aggregated data (e.g. a portal over national repositories or over subject-driven content, such as RECOLECTA and DART Europe DEEP above);
    2. Deployment of new aggregation services so as to distribute and delegate harvesting and aggregating activities to specialized DRIVER National or Community Correspondents, carrying out their tasks over an assigned selection of repositories.
  • Deploying a new DRIVER infrastructure to serve other service providers and communities

CNI Spring 2008 Task Force Meeting Presentations

Presentations and project briefings from the CNI Spring 2008 Task Force Meeting are available. Podcast interviews with a few attendees are also available.

Here's a selection of project briefings:

Digital Research Data Curation: Overview of Issues, Current Activities, and Opportunities for the Cornell University Library

Cornell University Library's Data Working Group has deposited its Digital Research Data Curation: Overview of Issues, Current Activities, and Opportunities for the Cornell University Library report in the eCommons@Cornell repository.

Here's the abstract:

Advances in computational capacity and tools, coupled with the accelerating collection and accumulation of data in many disciplines, are giving rise to new modes of conducting research. Infrastructure to promote and support the curation of digital research data is not yet fully-developed in all research disciplines, scales, and contexts. Organizations of all kinds are examining and staking out their potential roles in the areas of cyberinfrastructure development, data-driven scholarship, and data curation. The purpose of the Cornell University Library's (CUL) Data Working Group (DaWG) is to exchange information about CUL activities related to data curation, to review and exchange information about developments and activities in data curation in general, and to consider and recommend strategic opportunities for CUL to engage in the area of data curation. This white paper aims to fulfill this last element of the DaWG's charge.

Solr Search Engine Plug-In for Fedora Released

The DRAMA team has released a Solr plug-in for Fedora.

Here's a description of Solr from its home page:

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.
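To make the "XML/HTTP and JSON APIs" concrete, the sketch below builds a Solr search URL that asks for a JSON response with hit highlighting. It is not part of the DRAMA plug-in; the base URL, the `title` field, and the helper name are assumptions, while `q`, `wt`, `rows`, `hl`, and `hl.fl` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_solr_query_url(select_url, query, highlight_field=None, rows=10):
    """Build a Solr select URL (hypothetical helper).

    select_url is the address of a Solr select handler; its exact path
    depends on how the Solr instance is deployed.
    """
    params = [("q", query), ("wt", "json"), ("rows", rows)]
    if highlight_field:
        # Ask Solr to return highlighted snippets for the given field.
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    return select_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local deployment; adjust to the real select handler address.
    print(build_solr_query_url("http://localhost:8983/solr/select",
                               "title:fedora", highlight_field="title"))
```

Fetching the resulting URL with any HTTP client would return the search results as JSON, assuming a Solr server is running at that address and indexes a `title` field.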

CIC Shared Digital Repository Project Update

A recently updated description of the Committee on Institutional Cooperation's Shared Digital Repository Project is available at Indiana University's Project: Shared Digital Repository page.

Here's an excerpt:

Description: The Shared Digital Repository (SDR) leverages the tradition of leadership in collaboration among the institutions of the Committee on Institutional Cooperation (CIC). The SDR operates under the leadership of the Repository Administrators (Indiana University and the University of Michigan), which also provide a large part of the funding. Additional governance and financial support are provided by the charter participating libraries of the CIC, and by other libraries and library consortia wishing to archive digital content.

Outcome: The SDR offers persistent and high-availability storage for digitized book and journal content, beginning with the Google content from the CIC members and later extending to other digitized content. The SDR will leverage technology investments and developments at the University of Michigan to build (through IU/UM collaboration) more generalized versions of Michigan's services and gain efficiencies from Michigan's investments. . . .

Milestones and status:

As of April 11, 2008, the SDR contains:

  • 1,122,007 volumes
  • 791,460 titles
  • approximately 393 million pages
  • 213,379 individual volumes in the public domain (19% of the total)
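The public-domain percentage can be checked directly from the counts above. This short sketch only redoes the arithmetic; the pages-per-volume and volumes-per-title averages are derived here for context and are not figures stated in the SDR description.

```python
# Counts as reported for April 11, 2008.
volumes = 1_122_007
titles = 791_460
pages = 393_000_000  # "approximately 393 million"
public_domain = 213_379

# Share of volumes in the public domain (reported as 19% of the total).
public_domain_share = public_domain / volumes

# Derived averages, not stated in the source.
pages_per_volume = pages / volumes
volumes_per_title = volumes / titles

if __name__ == "__main__":
    print(f"Public domain share: {public_domain_share:.1%}")
    print(f"Average pages per volume: {pages_per_volume:.0f}")
    print(f"Average volumes per title: {volumes_per_title:.2f}")
```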

Timeline:

  • Early 2008: Bloomington backup storage installed
  • January-March 2008: Page turner mechanism with branding; ability to publish virtual collections (UM-specific version); assessment of global searching functionality; access mechanisms for persons with visual disabilities
  • September-December 2008: Mechanism for direct ingest of non-Google content; compliance with the required elements in the "Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist"

Aberystwyth University Launches CADAIR Institutional Repository

Aberystwyth University has launched CADAIR, its DSpace-based institutional repository.

Here's an excerpt from the press release:

The new service has been developed by the Subject Support and E-Library team in Information Services, led by Dr Talat Chaudhri and Stuart Lewis.

A successful two year pilot project, during which the team worked closely with the Departments of Computer Science and Information Studies, and the Institute of Mathematics and Physics, was concluded in early 2008. Currently the site features approximately 500 academic papers and dissertations by taught masters and PhD students.

Second Beta Version of Fedora 3.0 Released

The Fedora Commons has released the second beta version of Fedora 3.0.

Here's an excerpt from the announcement:

Fedora 3.0 features the Content Model Architecture (CMA), an integrated structure for persisting and delivering the essential characteristics of digital objects in Fedora. . . . The Fedora CMA plays a central role in the Fedora architecture, in many ways forming the over-arching conceptual framework for future development of Fedora Repositories.

Like a well-thumbed book on a shelf, digital content is stored with the expectation that intellectual works will be the same each time they are accessed, whether the content was put away yesterday, or many years ago. Fedora is a simple, flexible and evolvable approach to delivering and sharing the "essential characteristics" of enduring digital content. Librarians, archivists, records managers, media producers, authors and publishers use patterns of expression formats such as books, journals, articles, collections to convey the essential characteristics of content. The capabilities of digital tools combined with essential characteristics of digital works result in well-understood patterns of expression for different types of content models.

The software engineering community also utilizes patterns of expression for the development of complex computer systems. The same concepts that satisfy agile IT infrastructures can help provide solutions for creating, accessing and preserving content. The Fedora CMA builds on the Fedora architecture (downloaded more than 18,000 times in the last 12 months) to simplify use while unlocking potential.

Dan Davis explains the CMA in the context of Fedora 3.0, "It's a hybrid. The Fedora CMA handles content models that are used by publishers and others, and is also a computer model that describes an information representation and processing architecture." By combining these viewpoints, Fedora CMA has the potential to provide a way to build an interoperable repository for integrated information access within organizations and to provide durable access to our intellectual works.

UK ETD Support: Updated EThOS Toolkit Released

The EThOSnet Project has released an updated version of the EThOS Toolkit.

Here's an excerpt from the announcement:

In addition to full details of how your institution can participate, the interactive Toolkit provides practical information on how theses can be produced by students at your Institution so they can be accessed via EThOS and from your Institutional Repository. Accessed from its new location at http://ethostoolkit.cranfield.ac.uk the toolkit provides guidance on:

  • Putting forward the case for the importance of electronic theses (Culture Change)
  • Outlining the business case including information on which participation options suit (Business Needs)
  • Clear standards provided on technical requirements (Technical Requirements)
  • Practical materials and templates to be used for authors and supervisors in contributing to EThOS (Training and Guidance)

Presentations from APSR Workshop about Author Identity Management in Scholarly Communication Systems

The Australian Partnership for Sustainable Repositories has released presentations from its Identifying Researchers workshop. Both PDF and MP3 files are available.

Here's an excerpt from the workshop's web page:

The issue of managing researcher and author identities is a significant one that has an impact on a range of situations including, but not limited to, scholarly communications. This is an issue not only for researchers who nowadays interact with multiple identity and security systems but also for scholarly communications where the need to accurately identify authors and describe their scholarly resources is increasing in importance.

BagIt: New LC/CDL Format for Transferring Digital Content between Cultural Institutions

The Library of Congress and the California Digital Library have established a new format called BagIt for transferring large data collections between cultural institutions.

Read more about it at "The BagIt File Package Format (V0.94)" and "Library Develops Format for Transferring Digital Content."
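As a rough illustration of the idea, the sketch below writes a tiny bag (a `data/` payload directory, a `bagit.txt` declaration, and an MD5 payload manifest) and then verifies it. The layout follows the general BagIt conventions; details of V0.94 may differ, and the function names and the flat payload (file names without subdirectories) are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def write_bag(bag_dir, payload, version="0.94"):
    """Write a minimal bag. payload maps plain file names to bytes."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)
    # Bag declaration; the version string is taken on trust from the caller.
    (bag / "bagit.txt").write_text(
        f"BagIt-Version: {version}\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    lines = []
    for name in sorted(payload):
        content = payload[name]
        (data / name).write_bytes(content)
        # One manifest line per payload file: checksum, then relative path.
        lines.append(f"{hashlib.md5(content).hexdigest()}  data/{name}")
    (bag / "manifest-md5.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return bag


def verify_bag(bag_dir):
    """Return True if every file listed in the manifest matches its checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-md5.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        checksum, rel_path = line.split(maxsplit=1)
        if hashlib.md5((bag / rel_path).read_bytes()).hexdigest() != checksum:
            return False
    return True
```

The receiving institution can rerun the verification step after transfer to confirm that no payload file was altered or truncated in transit.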

Foresite Project OAI-ORE Resource Maps Software

The Foresite Project has released the foresite-toolkit.

Here's an excerpt from the announcement (footnotes removed):

The Foresite project is pleased to announce the initial code of two software libraries for constructing, parsing, manipulating and serialising OAI-ORE Resource Maps. These libraries are being written in Java and Python, and can be used generically to provide advanced functionality to OAI-ORE aware applications, and are compliant with the latest release (0.9) of the specification. The software is open source, released under a BSD licence, and is available from a Google Code repository . . . .

Foresite is a JISC funded project which aims to produce a demonstrator and test of the OAI-ORE standard by creating Resource Maps of journals and their contents held in JSTOR, and delivering them as ATOM documents via the SWORD interface to DSpace. DSpace will ingest these resource maps, and convert them into repository items which reference content which continues to reside in JSTOR. The Python library is being used to generate the resource maps from JSTOR and the Java library is being used to provide all the ingest, transformation and dissemination support required in DSpace.

Version 72, Scholarly Electronic Publishing Bibliography

Version 72 of the Scholarly Electronic Publishing Bibliography is now available from Digital Scholarship. This selective bibliography presents over 3,250 articles, books, and other digital and printed sources that are useful in understanding scholarly electronic publishing efforts on the Internet.

This version adds hundreds of links to freely available journal articles from publishers as well as to e-prints of published articles housed in disciplinary archives and institutional repositories. All article references were checked for the availability of such free content.

These links have also been added to a revised version of the Scholarly Electronic Publishing Bibliography: 2007 Annual Edition. Annual editions of the Scholarly Electronic Publishing Bibliography are PDF files designed for printing.

The bibliography has the following sections (revised sections are in italics):

1 Economic Issues
2 Electronic Books and Texts
2.1 Case Studies and History
2.2 General Works
2.3 Library Issues
3 Electronic Serials
3.1 Case Studies and History
3.2 Critiques
3.3 Electronic Distribution of Printed Journals
3.4 General Works
3.5 Library Issues
3.6 Research
4 General Works
5 Legal Issues
5.1 Intellectual Property Rights
5.2 License Agreements
6 Library Issues
6.1 Cataloging, Identifiers, Linking, and Metadata
6.2 Digital Libraries
6.3 General Works
6.4 Information Integrity and Preservation
7 New Publishing Models
8 Publisher Issues
8.1 Digital Rights Management
9 Repositories, E-Prints, and OAI
Appendix A. Related Bibliographies
Appendix B. About the Author
Appendix C. SEPB Use Statistics

Scholarly Electronic Publishing Resources includes the following sections:

Cataloging, Identifiers, Linking, and Metadata
Digital Libraries
Electronic Books and Texts
Electronic Serials
General Electronic Publishing
Images
Legal
Preservation
Publishers
Repositories, E-Prints, and OAI
SGML and Related Standards

An article about the bibliography ("Evolution of an Electronic Book: The Scholarly Electronic Publishing Bibliography") has been published in The Journal of Electronic Publishing.

DSpace Foundation and Fedora Commons Investigate Joint Collaboration

The DSpace Foundation and the Fedora Commons have recently been investigating the possibility of collaboration.

Here's an excerpt from a DSpace-General message:

Over the last few weeks, we (Michele Kimpton and Sandy Payette) have been discussing the possibilities of our organizations collaborating. . . .

Over the past couple of weeks, we have had informal discussions with members of our communities, leaders in libraries and higher education, and Board members to get initial feedback as to whether they would support collaboration and the outcomes they would like to see as a result.

This past week, we convened members of both communities during the PASIG conference to get input and ideas regarding a collaboration.

Thus far, all of the stakeholders we have had the opportunity to talk with have been extremely supportive and excited about the possibility of the Fedora and DSpace communities working together in some capacity.

As a result of these discussions, we have agreed to move forward in our exploration of collaborative possibilities. Over the next several weeks our organizations will meet to plan the next steps in the process. Our intent is to bring together the ideas and expertise within both communities to come up with the most compelling issues to work on to best serve our communities.

Sustainability and Revenue Models for Online Academic Resources: An Ithaka Report Released

The Strategic Content Alliance has released Sustainability and Revenue Models for Online Academic Resources: An Ithaka Report.

Here's an excerpt from the announcement:

This paper, commissioned by the Joint Information Systems Committee (JISC), is the first step in a three-stage process aimed at gaining a more systematic understanding of the mechanisms for pursuing sustainability in not-for-profit projects. It focuses on what we call 'online academic resources' (OARs), which are projects whose primary aim is to make content and scholarly discourse available on the web for research, collaboration, and teaching. This includes scholarly journals and monographs as well as a vast array of new formats that are emerging to disseminate scholarship, such as preprint servers and wikis. It also includes digital collections of primary source materials, datasets, and audio-visual materials that universities, libraries, museums, archives and other cultural and educational institutions are putting online.

This work is being done as part of the planning work for the Strategic Content Alliance (SCA), so it emphasises the development and maintenance of digital content useful in the networked world. In this first stage, we have conducted an initial assessment of the relevant literature focused on not-for-profit sustainability, and have compared the processes pursued in the not-for-profit and education sectors with those pursued by commercial organisations, specifically in the newspaper industry. The primary goal of this initial report is to determine to what extent it would make sense to conduct a more in-depth study of the issues surrounding sustainability.

Public Beta of Object Reuse and Exchange Specifications (OAI-ORE) Released

The Open Archives Initiative has released the public beta of Object Reuse and Exchange Specifications.

Here's an excerpt from the press release:

Over the past eighteen months the Open Archives Initiative (OAI), in a project called Object Reuse and Exchange (OAI-ORE), has gathered international experts from the publishing, web, library, and eScience community to develop standards for the identification and description of aggregations of online information resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video. The goal of these standards is to expose the rich content in these aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation. Although a motivating use case for the work is the changing nature of scholarship and scholarly communication, and the need for cyberinfrastructure to support that scholarship, the intent of the effort is to develop standards that generalize across all web-based information including the increasingly popular social networks of “web 2.0”. The beta version of the OAI-ORE specifications and implementation documents are released to the public on June 2, 2008. These documents describe a data model to introduce aggregations as resources with URIs on the web. They also detail the machine-readable descriptions of aggregations expressed in the popular Atom syndication format, in RDF/XML, and RDFa.
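The data model in the last sentences can be pictured as a handful of statements: a Resource Map describes an Aggregation, and the Aggregation aggregates its member resources. The sketch below expresses that as plain subject-predicate-object tuples. It is not the serialization defined in the specifications; the class name and all `example.org` URIs are hypothetical, and only the `ore:describes` and `ore:aggregates` relationships from the ORE vocabulary are assumed.

```python
from dataclasses import dataclass, field

# Namespace of the ORE vocabulary terms.
ORE = "http://www.openarchives.org/ore/terms/"


@dataclass
class ResourceMap:
    """A Resource Map that describes one Aggregation (illustrative only)."""
    rem_uri: str
    aggregation_uri: str
    aggregated: list = field(default_factory=list)

    def add(self, resource_uri):
        """Add one aggregated resource, identified by its URI."""
        self.aggregated.append(resource_uri)

    def triples(self):
        """Yield (subject, predicate, object) statements for the map."""
        yield (self.rem_uri, ORE + "describes", self.aggregation_uri)
        for uri in self.aggregated:
            yield (self.aggregation_uri, ORE + "aggregates", uri)


if __name__ == "__main__":
    rem = ResourceMap("http://example.org/rem/1", "http://example.org/agg/1")
    rem.add("http://example.org/article.pdf")
    rem.add("http://example.org/dataset.csv")
    for triple in rem.triples():
        print(triple)
```

A real implementation would serialize these statements in Atom, RDF/XML, or RDFa, as the press release notes.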

Muradora 1.3 Released: Web-Based GUI for Fedora

The DRAMA team at Macquarie University has released version 1.3 of Muradora.

Here's an excerpt from the announcement:

Muradora is a web-based GUI for the popular Fedora repository, built using enterprise Java Spring and Struts 2 frameworks. Alongside the common features found in a typical repository, such as search, browse, self-submission, and versioning support, Muradora enables flexible access control for end users (based on the XACML standard), inter-domain authentication and federated identity (using Shibboleth implementation of the SAML standard), and multiple metadata schema management (via W3C XForms standard).

Notable features in the 1.3 release:

  • Faceted Search: By incorporating GSearch 2.0 with Solr support, users can perform faceted searches, i.e. one can now narrow down search results based on other categories.
  • All-in-one installation: There is now an installation script for Unix/Linux systems which will install all the necessary components for Muradora. The complete package is called "muradora-allinone".
  • RSS/Atom Feeds: Users can subscribe to collections (even non-public collections) and get notifications of new objects added to those collections.
  • Thumbnail preview and gallery view: Thumbnails are now generated automatically for images. Thanks to the work by the MediaShelf team, one can browse and search using either the traditional listing view or with the gallery view.
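The faceted search feature in the list above boils down to two operations: counting how many results fall into each category, and narrowing the result set once a category is chosen. The sketch below shows both on an in-memory list. It does not reflect how Muradora, GSearch, or Solr implement faceting; the record fields and function names are invented for illustration.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records if field in r)


def narrow(records, **selected):
    """Keep only the records whose fields match every selected facet value."""
    return [r for r in records
            if all(r.get(key) == value for key, value in selected.items())]


# Hypothetical search results.
RESULTS = [
    {"title": "Campus map", "type": "image", "year": 2007},
    {"title": "Lab photo", "type": "image", "year": 2008},
    {"title": "PhD thesis", "type": "thesis", "year": 2008},
]

if __name__ == "__main__":
    print(facet_counts(RESULTS, "type"))
    print(narrow(RESULTS, type="image", year=2008))
```

A search interface would typically show the counts next to each category and apply the narrowing step each time the user clicks one.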

OAI2LODServer Version 0.2 Released

MediaSpaces has released Version 0.2 of the OAI2LODServer.

Here's a description from the software's home page:

The OAI2LOD Server exposes any OAI-PMH compliant metadata repository according to the Linked Data guidelines. This makes things and media objects accessible via HTTP URIs and queryable via the SPARQL protocol. Parts of the OAI2LOD architecture, especially the front-end, are based on the D2R Server implementation.

Further, it provides a configurable linking mechanism based on string similarity metrics. This allows the automatic linking of OAI-PMH data with other open data sets such as DBPedia or any other OAI-PMH repository exposed via the OAI2LOD Server.
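A linking step based on string similarity might look like the following sketch. It is an assumption-laden illustration, not OAI2LOD Server code: the description does not say which similarity metrics the server uses, so Python's `difflib.SequenceMatcher` stands in for them, and the identifiers, titles, and 0.85 threshold are made up.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def link_records(local, remote, threshold=0.85):
    """Link each local record to its best-matching remote record.

    local and remote map identifiers to titles. A link is emitted only
    when the best similarity score reaches the threshold.
    """
    links = []
    for local_id, local_title in local.items():
        best_id, best_score = None, 0.0
        for remote_id, remote_title in remote.items():
            score = similarity(local_title, remote_title)
            if score > best_score:
                best_id, best_score = remote_id, score
        if best_id is not None and best_score >= threshold:
            links.append((local_id, best_id, best_score))
    return links


# Hypothetical records from an OAI-PMH repository and an open data set.
LOCAL = {"oai:a:1": "Scholarly Electronic Publishing", "oai:a:2": "Aberystwyth"}
REMOTE = {"http://example.org/resource/SEP": "scholarly electronic publishing.",
          "http://example.org/resource/Berlin": "Berlin"}

if __name__ == "__main__":
    for link in link_records(LOCAL, REMOTE):
        print(link)
```

Raising the threshold trades recall for precision: fewer links are produced, but fewer of them are wrong.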

Repositories Support Project Briefings Released

The Repositories Support Project has released several new or updated briefings:

Key Services [ Paper ]

This briefing paper gives an overview of some of the key services currently available to repository managers and provides further details on how to access and use them.

Metadata [ Paper ]

This paper explores the topic of metadata in the repository and includes advice and information on metadata schemas and application profiles.

Making Effective Use of Your Repository [ Paper ]

Repositories are both part of an institution’s local information provision and part of the developing global open access information environment. This briefing paper discusses these contexts, helping the repository to serve the institution’s business needs effectively.

Repository Policy Framework – Updated [ Paper ]

Updated information about giving structure to your repository planning through the implementation of a policy framework.