University of Michigan Libraries Release the UMich OAI Toolkit

The University of Michigan Libraries have released the UMich OAI Toolkit.

Here's an excerpt from the announcement:

This toolkit contains both harvester and data provider, both written in Perl. . . .

UMHarvester is a robust tool using LWP for harvesting nigh on every OAI data provider available. It allows for incremental harvesting, has multiple re-try options, and a batch harvest tool (Batch_UMHarvest) that can automatically perform incremental harvesting.

UMProvider relies heavily on libxml (XML::LibXML) and will store the data in nearly any relational database. It functions by harvesting from a database of records, making rights determinations from a separate database, and providing the resulting set of records.

Originally, only the UMHarvester was available from UM's DLXS software site. The UMProvider tool is newly developed and takes the place of our DLXS data provider tool.
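
The incremental harvesting mentioned in the excerpt rests on the OAI-PMH `ListRecords` verb, whose optional `from` argument restricts a harvest to records changed since a given date. The following is a minimal Python sketch of how such request URLs are formed. It is not UMHarvester's code (which is written in Perl); the function name and the `example.org` base URL are hypothetical and used only for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     from_date: Optional[str] = None,
                     metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: it is sent with the
        # verb alone to fetch the next chunk of a partial response.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Incremental harvest: only records changed on or after this date.
            params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical data provider; harvest everything changed since 1 November 2007.
first_request = list_records_url("http://example.org/oai", from_date="2007-11-01")
# Follow-up request using a token returned by the provider (value made up here).
next_request = list_records_url("http://example.org/oai", resumption_token="abc123")
```

A harvester would typically store the date of its last successful run and pass it as `from_date` on the next one, repeating the follow-up request until the provider stops returning a resumption token.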

Rice University Releases Travelers in the Middle East Archive

Rice University has released the Travelers in the Middle East Archive under a Creative Commons Attribution 2.5 Generic License.

Here's an excerpt from the announcement:

TIMEA provides access to:

  • Nearly 1,000 images, including stereocards, postcards and book illustrations
  • More than 150 historical maps representing the Middle East as it was in the 19th and early 20th centuries
  • Interactive geographical information systems (GIS) maps that serve as an interface to the collection and present detailed information about features such as waterways, elevation and populated places
  • Successive editions of classic travel guides and major museum collection catalogues
  • Convenient educational modules that set materials from the collection in historical and geographic context and explore the research process

TIMEA is able to offer seamless access for researchers by providing a common user interface to digital objects housed in three repositories. Texts, historical maps and images reside in DSpace, an open-source digital repository system. Educational research modules are presented within Connexions, an open-content commons and publishing platform for educational materials. TIMEA also uses Google Maps and ESRI’s ArcIMS map server.

New Release of BioMed Central's Open Repository, a Hosted Institutional Repository Service

BioMed Central has released version 1.4.9 of Open Repository, its DSpace-based, hosted institutional repository service.

Here's an excerpt from the press release:

Open Repository version 1.4.9 has several new features that are designed to enhance the customer experience. The release offers an improved user interface, making it easier for customers to browse and submit their material online. Additionally, institutions can convert their Word, Excel, PowerPoint, Text and RTF documents to PDF format. Customers can also set up RSS feeds, and customize lists and search fields, adding value to the already robust platform.

Pitt's Libraries and University Press Establish Open Access Book Program

The University of Pittsburgh University Library System and the University of Pittsburgh Press have established the University of Pittsburgh Press Digital Editions, which offers free access to digitized versions of print books from the press.

Here's an excerpt from the press release:

The University of Pittsburgh’s University Library System (ULS) and University Press have formed a partnership to provide digital editions of press titles as part of the library system’s D-Scribe Digital Publishing Program. Thirty-nine books from the Pitt Latin American Series published by the University of Pittsburgh Press are now available online, freely accessible to scholars and students worldwide. Ultimately, most of the Press’ titles older than 2 years will be provided through this open access platform.

For the past decade, the University Library System has been building digital collections on the Web under its D-Scribe Digital Publishing Program, making available a wide array of historical documents, images and texts which can be browsed by collection and are fully searchable. The addition of the University of Pittsburgh Press Digital Editions collection marks the newest in an expanding number of digital collaborations between the University Library System and the University Press.

The D-Scribe Digital Publishing Program includes digitized materials drawn from Pitt collections and those of other libraries and cultural institutions in the region, pre-print repositories in several disciplines, the University’s mandatory electronic theses and dissertations program, and electronic journals. During the past eight years, sixty separate collections have been digitized and made freely accessible via the World Wide Web. Many of these projects have been carried out with content partners such as Pitt faculty members, other libraries and museums in the area, professional associations, and most recently, with the University of Pittsburgh Press with several professional journals and the new University of Pittsburgh Press Digital Editions. . . .

More titles will be added to the University of Pittsburgh Press Digital Editions each month until most of the current scholarly books published by the Press are available both in print and as digital editions. The collection will eventually include titles from the Pitt Series in Russian and East European Studies, the Pitt-Konstanz Series in the Philosophy and History of Science, the Pittsburgh Series in Composition, Literacy, and Culture, the Security Continuum: Global Politics in the Modern Age, the History of the Urban Environment, back issues of Cuban Studies, and numerous other scholarly titles in history, political science, philosophy, and cultural studies.

Stable Version of SPECTRa Released: Software for Depositing Chemical Data into Repositories

A stable version of SPECTRa has been released. SPECTRa is designed to facilitate the deposit of chemical data into digital repositories.

The JISC-funded SPECTRa (Submission, Preservation and Exposure of Chemistry Teaching and Research Data a Digital Repository for the Chemical Community) project's final report is also available.

Institute of Physics Launches an Open Access Earth and Environmental Science Proceedings Service

The Institute of Physics has launched the IOP Conference Series: Earth and Environmental Science, an open access proceedings service. A FAQ is available.

Here's an excerpt from the press release:

Based on IOP Publishing’s highly successful open access proceedings in physics, EES allows conference organizers to create a comprehensive record of their event and make a valuable contribution to the open access literature that will be of long-lasting benefit to their research communities.

As part of the service’s launch, EES is waiving a total of US$5000 of publication fees for a number of conferences who expect to publish their proceedings during 2008.

We are delighted to announce that the first conference to qualify for this is the 14th International Symposium for the Advancement of Boundary Layer Remote Sensing (ISARS2008) which takes place on 23–25 June 2008, Risø National Laboratory, DTU, Roskilde, Denmark.

Eduserv Releases Study about the Use of Open Content Licenses By UK Heritage Organizations

The Eduserv Foundation has released Snapshot Study on the Use of Open Content Licences in the UK Cultural Heritage Sector (Appendices).

Here's an excerpt from the "Executive Summary":

This study investigates the awareness and use of open content licences in the UK cultural heritage community by way of a survey. Open content licensing generally grants a wide range of permission in copyright for use and re-use of works such as images, sounds, video, and text, whilst retaining a relatively small set of rights: often described as a ‘some rights reserved’ approach to copyright. For those wishing to share content using this model, Creative Archive (CA) and Creative Commons (CC) represent the two main sets of open content licences available for use in the United Kingdom.

The year of this survey, 2007, marks five years from the launch of the Creative Commons licences, two years since the launch of the UK-specific CC licences and two years as well since the launch of the UK-only Creative Archive licence.

This survey targeted UK cultural heritage organisations—primarily museums, libraries, galleries, archives, and those in the media community that conduct heritage activities (such as TV and radio broadcasters and film societies). In particular, this community produces trusted and highly valued content greatly desired by the general public and the research and education sectors. They are therefore a critical source of high-demand content and thus the focus for this project. The key objective has been to get a snapshot of current licensing practices in this area in 2007 for use by the sector and funding bodies wishing to do more work in this area.

Over 100 organisations responded to this web-based survey. Of these respondents:

  • Only 4 respondents out of 107 indicated that they held content but were not making it available online nor had plans to make it available online;
  • Images and text are the two content types most likely to be made available online;
  • Sound appears to be the most held content type not currently available online and with no plans to make it available in the future;
  • Many make some part of their collection available online without having done any formal analysis of the impact this may have;
  • 59 respondents were aware of Creative Archive or Creative Commons;
  • 10 use a CA or CC licence for some of their content; and
  • 12 have plans to use a CA or CC licence in the future.

House Doesn't Override Presidential Veto of Labor-HHS Bill Which Contains NIH OA Mandate

By two votes, the House failed to override President Bush's veto of the Departments of Labor, Health and Human Services, and Education, and Related Agencies Appropriations Act, 2008, which contained the NIH open access mandate (the vote was 277-141). Bloomberg reports that Senate Democrats have a new strategy:

Senate Majority Leader Harry Reid said Democrats will combine the 11 unfinished appropriations bills still needed to fund the federal government into one measure that exceeds the administration's request by $11 billion—half the $22 billion Democrats initially supported.

However, CQPolitics reports that:

The White House brushed off Reid’s proposal Thursday, as administration officials have done previously when Democrats have said they are willing to negotiate on funding levels.

"The president has been clear that Congress should adhere to the budgetary process and pass individual funding bills at reasonable and responsible spending levels," said Sean Kevelighan, a spokesman for the White House budget office. "Perhaps [the] Democratic leadership in Congress. . . should concern itself less with capturing political news cycles and more on their fundamental responsibility to fund the federal government."

Peter Suber had this to say about the override failure:

OK, on to Plan B.  The OA mandate for the NIH is a small part of a big bill to pay for about one-thirteenth of the federal government.  Some version of the appropriation will certainly pass and get the President's signature.  You can already see the jockeying between Congressional leaders and the White House about the contours of that version.  There are four grounds for optimism:

  1. The OA mandate was approved by both houses of Congress.  The easiest provisions to delete are those approved by just one chamber and kept by the House-Senate conference committee.
  2. The OA mandate has bipartisan support in Congress and Republican friends in the Executive Branch.
  3. The President has expressed strong objection to some of the policy provisions of the bill, but his stated concern about the OA provision is very mild by comparison.  If Congress deletes some of the more sensitive provisions in the spirit of compromise, it needn't touch the OA mandate.  In fact, deleting the OA provision would do virtually nothing to ingratiate the President.
  4. To reduce overall spending levels in the bill, Congress will cut some of the appropriations.   But the OA mandate is a policy change, not an appropriation.  There's no need to cut it to satisfy the President's fiscal objections to the current bill.   Stay tuned.
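
The two-vote margin reported above follows from the constitutional requirement that an override receive two-thirds of the members voting. As a small worked example in Python (the function name is illustrative, and the tally is the 277-141 figure cited in this post):

```python
def override_threshold(yeas: int, nays: int) -> int:
    """Smallest number of yea votes that is at least two-thirds of votes cast."""
    total = yeas + nays
    # Integer ceiling of 2 * total / 3, avoiding floating-point rounding.
    return (2 * total + 2) // 3


yeas, nays = 277, 141                      # House tally cited in the post
needed = override_threshold(yeas, nays)    # two-thirds of 418 votes cast
shortfall = needed - yeas                  # how many more yeas were required
```

With 418 votes cast, two-thirds is 278.67, so 279 yeas were needed and the 277 recorded fell two short.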

ALA Urgent Call for Action about the Presidential Veto of the Labor-HHS Bill

The American Library Association has issued an urgent call for action about the presidential veto of the FY 2008 Labor, Health and Human Services, Education, and Related Agencies appropriations bill, which includes the NIH Public Access Policy mandate and essential funding for library programs.

You can easily contact your senators using the ALA Action Alert Web form.

I've created a cut-and-paste version of prior ALA/Alliance for Taxpayer Access text about the NIH open access mandate and added brief information about key library programs funded by the bill. You can use this text to simplify the process of sending an e-mail via the ALA Action Alert Web form, but personalizing this text with an added sentence or two is recommended.

National Science Digital Library Releases Initial Fedora-based NCore Components

The National Science Digital Library Core Integration team at Cornell University has released a partial version of NCore, a "general platform for building semantic and virtual digital libraries united by a common data model and interoperable applications," which is built upon Fedora.

Here's an excerpt from the NSDL posting:

The NCore platform consists of a central repository built on top of Fedora, a data model, an API, and a number of fundamental services such as full-text search or OAI-PMH. Innovative NSDL services and tools that empower users as content creators are now built on, or transitioning to, the NCore platform. These include: the Expert Voices blogging system (http://expertvoices.nsdl.org/); the NSDL Wiki (http://wiki.nsdl.org/index.php/NSDL_Wiki); the NSDL OAI-PMH metadata ingest aggregation system; the OAI-PMH service for distributing public NSDL metadata; the NSDL Collection System (NCS), derived from the DLESE Collection system (DCS); the NSDL Search service; and the OnRamp content management and distribution system (http://onramp.nsdl.org).

Because NCore is a general Fedora-based open source platform useful beyond NSDL, Core Integration developers at Cornell University have made the repository and API code components of NCore available for download at the NCore project on Sourceforge (http://sourceforge.net/projects/nsdl-core). Over the next six months, NSDL will release the code for major tools and services that comprise the full NCore suite on SourceForge.

For further information, see the NCore presentation.

President Bush Vetoes Bill Containing NIH Open Access Mandate

President Bush has vetoed the FY 2008 Labor, Health and Human Services and Education Appropriations bill, which contained the NIH open access mandate.

Here's the open access mandate in the bill:

The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine's PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication: Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law.

Here's Peter Suber's analysis of the President's veto:

  • First, don't panic.  This has been expected for months and the fight is not over.  Here's a reminder from my November newsletter:  "There are two reasons not to despair if President Bush vetoes the LHHS appropriations bill later this month.  If Congress overrides the veto, then the OA mandate language will become law.  Just like that.  If Congress fails to override the veto, and modifies the LHHS appropriation instead, then the OA mandate is likely to survive intact."  (See the rest of the newsletter for details on both possibilities.)
  • Also expected:  Bush vetoed the bill for spending more than he wants to spend, not for its OA provision.
  • Second, it's time for US citizens to contact their Congressional delegations again.  This time around, contact your Representative in the House as well as your two Senators.  The message is:  vote yes on an override of the President's veto of the LHHS appropriations bill.  (Note that the LHHS appropriations bill contains much more than the provision mandating OA at the NIH.)
  • The override votes—one in each chamber—haven't yet been scheduled.  They may come this week or they may be delayed until after Thanksgiving.  But they will come and it's not too early to contact your Congressional delegation.  For the contact info for your representatives (phone, email, fax, local offices), see CongressMerge.
  • Please spread the word!

Fedora Meets Web 2.0: Repository Redux Presentation from Access 2007

A digital video of Mark Leggott's (University Librarian, University of Prince Edward Island) presentation from Access 2007 is now available.

Here's an excerpt from the program that describes the talk:

The University of Prince Edward Island has embarked on a substantial project to support the institution's Administrative, Learning and Research communities using a Web 2.0/3.0 framework and the Fedora/Drupal/Moodle systems as the foundation. The session will describe the architecture and demo some of the core systems, such as Learn@UPEI, UPEI VRE (Virtual Research Environment) and some sample digital library collections.

Primary Research Group Publishes International Institutional Repository Survey

The Primary Research Group has published The International Survey of Institutional Digital Repositories. Paper and PDF versions are available at $89.50 each.

Here's an excerpt from the press release:

The study presents data from 56 institutional digital repositories from eleven countries, including the USA, Canada, Australia, Germany, South Africa, India, Turkey and other countries. The 121-page study presents more than 300 tables of data and commentary and is based on data from higher education libraries and other institutions involved in institutional digital repository development. . . .

Close to 41% of survey participants purchased software to develop their digital repositories. US-based institutions were much more likely than others to purchase software for this purpose. . . .

On average, slightly more than 12% of the content in the repositories came from pre-existing repositories maintained by academic departments or some other institutional unit.

A sixth of the libraries in the sample used Digital Commons software, and 28% of US-based repositories used this product. . . .

Those repositories in the sample that required less than 500 hours of labor per year had budgets of just less than $9,000 US. The largest repositories, those requiring 3,600 hours or more annually, had budgets averaging $145,444. 5.21% of the overall labor required to run the digital repositories in the sample came from academic departments not connected to the library. . . .

The mean number of journal articles held by the repositories in the sample was 772 with a median of 162. . . .

15.56% of the repositories in the sample were funded largely through grants.

Version 1.0 of SWORD, A Smart Deposit Tool for Repositories, Has Been Released

Version 1.0 of SWORD has been released. The release includes DSpace (1.5 only) and Fedora implementations, GUI/CLI clients, and the common Java library.

Here's an excerpt from the SWORD Wiki that describes the project:

SWORD (Simple Web-service Offering Repository Deposit) will take forward the Deposit protocol developed by a small working group as part of the JISC Digital Repositories Programme by implementing it as a lightweight web-service in four major repository software platforms: EPrints, DSpace, Fedora and IntraLibrary. The existing protocol documentation will be finalised by project partners and a prototype 'smart deposit' tool will be developed to facilitate easier and more effective population of repositories. The project intends to take an iterative approach to developing and revising the protocol, web-services and client implementation through evaluative testing and feedback mechanisms. Community acceptance and take-up will be sought through dissemination activities. The project is led by UKOLN, University of Bath, with partners at the University of Wales, Aberystwyth, the University of Southampton and Intrallect Ltd. The project aims to improve the efficiency and quality of repository deposit and to diversify and expedite the options for timely population of repositories with content whilst promoting a common deposit interface and supporting the Information Environment principles of interoperability.

SPARC/ACRL Explore Sustainability Issues with Three Open Access Journal Publishers

SPARC and ACRL have released podcasts/transcripts of interviews about sustainability issues with Bryan Vickery (BioMed Central), Mark Patterson (Public Library of Science), and Paul Peters (Hindawi Publishing Corporation). They have also released a matrix that analyzes the responses of these OA journal publishers about sustainability issues.

Update on the British Library/Microsoft Digitization Project

In his Information Today article "Progress Report: The British Library and Microsoft Digitization Partnership," Jim Ashling provides an update on the progress that the British Library and Microsoft have made in their project to digitize about 100,000 books for access in Live Search Books.

Here's an excerpt from the article:

Unlike previous BL digitization projects where material had been selected on an item-by-item basis, the sheer size of this project made such selectivity impossible. Instead, the focus is on English-language material, collected by the BL during the 19th century. . . .

Scanning produces high-resolution images (300 dpi) that are then transferred to a suite of 12 computers for OCR (optical character recognition) conversion. The scanners, which run 24/7, are specially tuned to deal with the spelling variations and old-fashioned typefaces used in the 1800s. The process creates multiple versions including PDFs and OCR text for display in the online services, as well as an open XML file for long-term storage and potential conversion to any new formats that may become future standards. In all, the data will amount to 30 to 40 terabytes. . . .

Obviously, then, an issue exists here for a collection of 19th-century literature when some authors may have lived beyond the late 1930s [British/EU law gives authors a copyright term of life plus 70 years]. An estimated 40 percent of the titles are also orphan works. Those two issues mean that item-by-item copyright checking would be an unmanageable task. Estimates for the total time required to check on the copyright issues involved vary from a couple of decades to a couple of hundred years. The BL’s approach is to use two databases of authors to identify those who were still living in 1936 and to remove their work from the collection before scanning. That, coupled with a wide publicity to encourage any rights holders to step forward, may solve the problem.
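
The 1936 cutoff in the excerpt can be checked with a little date arithmetic. The Python sketch below assumes the life-plus-70 term cited above and further assumes that the term runs to the end of the calendar year, so protection lapses on 1 January of the year after the 70th anniversary of the author's death. The function names are illustrative, not part of any BL system.

```python
TERM_YEARS = 70  # life plus 70 years, as cited in the excerpt


def first_public_domain_year(death_year: int, term_years: int = TERM_YEARS) -> int:
    """First calendar year in which an author's works are out of copyright.

    Assumes the term is counted to the end of the calendar year.
    """
    return death_year + term_years + 1


def in_copyright(death_year: int, current_year: int,
                 term_years: int = TERM_YEARS) -> bool:
    """True if works by an author who died in death_year are still protected."""
    return current_year < first_public_domain_year(death_year, term_years)


# Checked against 2007, the year of the article.
died_1936 = in_copyright(1936, 2007)   # protection ended 31 December 2006
died_1937 = in_copyright(1937, 2007)   # protection runs through 31 December 2007
```

Under these assumptions, works by an author who died in 1936 were free to scan in 2007, while works by anyone who died in 1937 or later were still protected, which is consistent with screening on whether an author was still living in 1936.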

Boston Public Library/Open Content Alliance Contract Made Public

Boston Public Library has made public its digitization contract with the Open Content Alliance.

Some of the most interesting provisions include the intent of the Internet Archive to provide perpetual free and open access to the works, the digitization cost arrangements (BPL pays for transport and provides bibliographic metadata, the Internet Archive pays for digitization-related costs), the specification of file formats (e.g., JPEG 2000, color PDF, and various XML files), the provision of digital copies to BPL (copies are available immediately after digitization for BPL to download via FTP or HTTP within 3 months), and use of copies (any use by either party as long as provenance metadata and/or bookplate data is not removed).

Yale Will Work with Microsoft to Digitize 100,000 Books

The Yale University Library and Microsoft will work together to digitize 100,000 English-language out-of-copyright books, which will be made available via Microsoft’s Live Search Books.

Here’s an excerpt from the press release:

The Library and Microsoft have selected Kirtas Technologies to carry out the process based on their proven excellence and state-of-the-art equipment. The Library has successfully worked with Kirtas previously, and the company will establish a digitization center in the New Haven area. . . .

The project will maintain rigorous standards established by the Yale Library and Microsoft for the quality and usability of the digital content, and for the safe and careful handling of the physical books. Yale and Microsoft will work together to identify which of the approximately 13 million volumes held by Yale’s 22 libraries will be digitized. Books selected for digitization will remain available for use by students and researchers in their physical form. Digital copies of the books will also be preserved by the Yale Library for use in future academic initiatives and in collaborative scholarly ventures.

Open-Source IRStats Released: Use Statistics for EPrints and DSpace

Eprints.org has released IRStats, an open source use statistics analysis package that analyzes both EPrints (versions 2 and 3) and DSpace (beta functionality) logs. The software is under a BSD license, and it requires Perl, awstats, MySQL, Maxmind Organisation Database, ChartDirector, and a CGI-capable Web server.

A description of IRStats features is available as well as examples of its use. For additional information on the project, see "Introduction to IRS."

Creative Commons Seeks Feedback from Librarians about LiveDVD

Timothy Vollmer has announced on Lita-L (10/28/07 message) that the Creative Commons is looking for feedback about its LiveDVD for libraries, which is part of its LiveContent project.

Here's an excerpt from the message:

Creative Commons is working with Fedora on creating a LiveDVD for libraries that contains free, open source software (like OpenOffice, The Gimp, Inkscape, Firefox) and open content, including CC-licensed media such as audio, video, photographs, text and open educational resources. . . .

The next iteration we're working on is a LiveDVD for libraries, providing an informational resource and creative tool that would allow library patrons to test open source software, view (and rip, remix, reuse) open content, and even create new content with the software contained on the disc. . . .

We want to get some more feedback/comments/suggestions on the project and are also looking to identify librarians and interested groups to test out the LiveDVD!

DSpace 1.5 Alpha Released

The 1.5 alpha version of the popular DSpace repository software has been released.

Here's an excerpt from "DSpace 1.5 Alpha with Experimental Binary Distribution" by Richard Jones:

There are big changes in this code base, both in terms of functionality and organisation. First, we are now using Maven to manage our build process, and have carved the application into a set of core modules which can be used to assemble your desired DSpace instance. . . .

The second big and most exciting thing is that Manakin is now part of our standard distribution, and we want to see it taking over from the JSP UI over the next few major releases. . . .

In addition to this, we have an Event System which should help us start to decouple tightly integrated parts of the repository. . . . Browsing is now done with a heavily configurable system . . . . Tim Donohue's much desired Configurable Submission system is now integrated with both JSP and Manakin interfaces and is part of the release too.

Further to this we have a bunch of other functionality including: IP Authentication, better metadata and schema registry import, move items from one collection to another, metadata export, configurable multilingualism support, Google and HTML sitemap generator, Community and Sub-Communities as OAI Sets, and Item metadata in XHTML head <meta> elements.