Two JISC Open Archives Initiative Object Reuse and Exchange Projects

JISC is funding two projects to do small-scale OAI-ORE tests:

TheOREM (Theses with ORE Metadata), at the University of Cambridge, aims to:

  • Test the applicability of the ORE standard in a realistic scholarly setting—thesis description, submission and publication.
  • Demonstrate the advantages of the ORE approach in complex object publication, by combining it with existing web-standards compliant technologies.
  • Provide examples to fully exercise the ORE specifications in order to provide validation and future direction.

FORESITE (Functional Object Reuse and Exchange: Supporting Information Topology Experiments) will create Resource Map descriptions of JSTOR's holdings, and then ingest them into the DSpace institutional repository system via the SWORD protocol, creating external references back to the original files. The description work will be automated, and the system for doing so will be implemented at the University of Liverpool. HP Labs will implement the SWORD protocol within DSpace, along with any other necessary extensions.

For further information, see the FORESITE proposal, A Preview of the TheOREM Project, and the TheOREM proposal.

Isilon's IQ Clustered Storage System Chosen by Michigan and Rice for Digital Repository Storage

Isilon Systems has announced that its IQ Clustered Storage System will be used to support the Michigan Digitization Project and the Rice Digital Scholarship Archive.

Here's an excerpt from the press release about Michigan:

Isilon Systems . . . today announced that the University of Michigan (U-M) has selected Isilon's IQ clustered storage system as the primary repository for its Michigan Digitization Project. In partnership with Google, the University of Michigan and its Michigan Digitization Project are digitizing more than 7.5 million books, ensuring these valuable resources are available to the public into perpetuity. This enormous undertaking includes the storage of digital copies of all unique books within the libraries of the entire Big-Ten Conference and directly supports Google Book Search, which aims to create a single, comprehensive, searchable, virtual card catalog of all books in all languages. The University of Michigan, in partnership with Indiana University (IU), is leveraging Isilon's IQ clustered storage system to create a Shared Digital Repository (SDR) of the universities' published library materials. Using Isilon IQ, U-M and IU are able to consolidate digital copies of millions of books into one, single, shared pool of storage to meet the rapidly growing storage demand of its massive book digitization project. . . .

In conjunction with the Committee for Institutional Cooperation (CIC), an academic partnership formed by the universities of the Big-Ten Conference and the University of Chicago, the University of Michigan and Indiana University are working to create a Shared Digital Repository (SDR) which will mirror the content from U-M and the CIC libraries found in Google Book Search. Using Isilon IQ clustered storage, featuring its OneFS® operating system software, U-M has eliminated disparate data silos to create a shared pool of storage for the digitization efforts of these partner institutions. Each digitized book is approximately 55 MB in size, downloading at a rate of 3 MB/second, 24 hours a day, 7 days a week, for the entire six year duration of the project. Isilon IQ reduces storage management time, enabling U-M to accelerate the book scanning process, preserve valuable materials, and ultimately expand the research and learning capabilities for millions of users across the globe.
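The throughput figures in the excerpt can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal megabytes, 365-day years, and a constant 3 MB/second rate, none of which the press release states, and it works out how much data and how many 55 MB books that rate would imply over six years.

```python
# Back-of-the-envelope check of the figures quoted in the press release.
# Assumptions (not from the source): 365-day years, decimal units, constant rate.
RATE_MB_PER_S = 3      # quoted transfer rate
BOOK_SIZE_MB = 55      # quoted approximate size of one digitized book
YEARS = 6              # quoted project duration

seconds = YEARS * 365 * 24 * 60 * 60
total_mb = RATE_MB_PER_S * seconds
total_tb = total_mb / 1_000_000          # 1 TB = 1,000,000 MB (decimal)
books = total_mb / BOOK_SIZE_MB

print(f"{total_tb:.0f} TB transferred, roughly {books / 1_000_000:.1f} million books")
```

Under these assumptions the rate works out to about 568 TB and roughly 10.3 million books, which is the same order of magnitude as the "more than 7.5 million books" cited in the release.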

Here's an excerpt from the press release about Rice:

Isilon . . . today announced that Rice University has selected Isilon's IQ clustered storage system as its central repository for digital multimedia, including video of selected speeches by international dignitaries and musical performances from the Shepherd School of Music. In an effort to preserve the many historic events held at these prestigious venues and ensure the productions are available to the public into perpetuity, Rice has deployed Isilon clustered storage to consolidate hundreds of recorded musical performances and keynote speeches into a single, highly scalable and reliable shared pool of storage for the Rice Digital Scholarship Archive, an institutional repository based on the DSpace software platform. . . .

Through a cooperative effort between Rice University's Digital Library Initiative, Fondren Library and Central IT department, the university has created a central repository for all its critical multi-media content, enabling a variety of departments to execute on vital, content-driven projects simultaneously, activity that was impossible with traditional storage. Prior to using Isilon IQ, Rice's storage management for the Digital Scholarship archiving system was unable to effectively support management of large digital video and audio files that required streaming for delivery. These assets, therefore, were stored on a variety of streaming servers by various groups across campus, creating multiple access bottlenecks that led to inefficient storage management and undue IT cost and complexity. By unifying all of its digital content onto one, easy to use, "pay as you grow" clustered storage system, Rice University has removed costly data access and management barriers and dramatically simplified its storage architecture. Additionally, using Isilon's SmartQuotas provisioning and quota management software application, Rice is also storing its Language Center's multi-media course work and its Central IT department's webcasts on Isilon IQ, delivering immediate, concurrent data access to multiple users and user groups, further reducing storage management costs to maximize system efficiency.

Rice University will stream its collection of musical performances from the Shepherd School, as well as its video library of the many world leaders and dignitaries that have spoken at the Baker Institute, to thousands of users online. This operation necessitates the use of multiple media servers, using Windows, Quicktime and Real Player formats. Isilon clustered storage communicates natively over CIFS, NFS, FTP, and HTTP, as well as interoperating with Windows, Mac and Linux environments, enabling seamless integration with Rice's variety of server formats and enabling all content to be streamed from one, central, easily and immediately accessible storage system. With Isilon IQ, Rice's entire collection of multi-media is accessible to all its servers 24x7x365, ensuring that the media streaming operations are not only efficient and cost-effective, but prepared to meet high user demand.

Summary of Experiences with E-Journal Publishing Software and Institutional Repositories

Sunny Yoon, Digital Resources Coordinator at the City University of New York, posted a query on the CODE4LIB list about the use of e-journal publishing software and its integration into institutional repositories.

She has now posted an interesting summary of responses to her query.

You can also read the replies that were posted to the list under the heading "e-journal publishing software."

Open Repositories 2008 Presentations

Presentations from the Open Repositories 2008 conference are available in the OR08 Publications repository.

The easiest way to find presentations is to use the Browse by Subject capability; however, both simple and advanced search functions are available as well.

Currently, the repository holds over 90 documents. You can track new additions at the Latest Additions to OR08 Publications page (RSS feed). It's anticipated that all documents will be available by April 13, 2008.

Here's a brief selection of available presentations:

Project Reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology Retreat

Project reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology retreat are now available.

Here are selected project briefing reports:

Weblog Reports from Open Repositories 2008

Below are selected Weblog reports from Open Repositories 2008.

ARL Publishes Research Library Publishing Services: New Options for University Publishing

The Association of Research Libraries has published Research Library Publishing Services: New Options for University Publishing by Karla L. Hahn.

Here's an excerpt from the "Executive Summary":

To foster a deeper understanding of an emerging research library role as publishing service provider, in late 2007 the Association of Research Libraries surveyed its membership to gather data on the publishing services they were providing. Following the survey, publishing program managers at ten institutions participated in semi-structured interviews to delve more deeply into several aspects of service development: the sources and motivations for service launch, the range of publishing services, and relationships with partners.

The survey verified that research libraries are rapidly developing publishing services. By late 2007, 44% of the 80 responding ARL member libraries reported they were delivering publishing services and another 21% were in the process of planning publishing service development. Only 36% of responding institutions were not active in this arena.

These libraries are publishing many kinds of works, but the main focus is journals; 88% of publishing libraries reported publishing journals compared to 79% who publish conference papers and proceedings, and 71% who publish monographs. Established journal titles dominate this emerging publishing sector and are the main drivers of service development, although new titles are also being produced. Although the numbers of titles reported represent a very thin slice of the scholarly publishing pie, the survey respondents work with 265 titles: 131 are established titles, 81 are new titles, and 53 were under development at the time of the survey. On average, these libraries work with 7 or 8 titles with 6 currently available. . . .

Peer reviewed works dominate library publishing programs and editors or acquisitions committees typically maintain their traditional roles in identifying quality content. Libraries often provide technical support for streamlined peer review workflows, but they are not providing peer review itself. The manuscript handling services provided by some publishing programs were a significant attraction to the editors of established publications.

Library publishing program managers report substantial demand for hosting services. Libraries increasingly are positioned to provide at least basic hosting services. Open source software such as the Public Knowledge Project’s Open Journal Systems and DPubs along with new commercial services such as those offered by The Berkeley Electronic Press (bepress) through Digital Commons allows libraries to support basic journal hosting relatively easily.

Advice and consulting regarding a variety of publishing practices and decisions are perhaps even more popular services. There are pressing demands for information and advice about issues such as moving print publications into electronic publishing, discontinuing print in favor of electronic alternatives, publishing works with limited revenue-generating capability, revenue generation, standards of various sorts, markup and encoding, metadata generation, preservation, contracting with service providers, and copyright management.
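The survey figures in the excerpt can be cross-checked against one another. The sketch below assumes that the "7 or 8 titles" average is taken over the libraries already delivering publishing services (the 44% group); the summary does not say which denominator was used, so treat that as an assumption.

```python
# Figures quoted in the ARL executive summary.
respondents = 80
share_publishing = 0.44   # libraries already delivering publishing services
titles = {"established": 131, "new": 81, "under development": 53}

# Assumption (not stated in the source): the average is over publishing libraries only.
publishing_libraries = round(respondents * share_publishing)
total_titles = sum(titles.values())
titles_per_library = total_titles / publishing_libraries

print(publishing_libraries, total_titles, round(titles_per_library, 1))
```

Under that assumption, about 35 libraries account for the 265 titles, or roughly 7.6 titles each, which is consistent with the "7 or 8 titles" in the summary.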

Ball State University Libraries Move Ahead with Ambitious Digital Initiative Program

The Ball State Libraries have nurtured an ambitious digital initiatives program that has established an institutional repository, a CONTENTdm system for managing digital assets, a Digital Media Repository with over 102,000 digital objects, a Digitization Center and Mobile Digitization Unit, an e-Archives for university records, and a virtual press (among other initiatives). Future goals are equally ambitious.

Read more about it at "Goals for Ball State University Libraries' Digital Initiative."

Tracking Deposit Growth: UK Repository Records Statistics

Chris Keene, Technical Development Manager at the University of Sussex Library, has released UK Repository Records Statistics, which provides U.K. institutional repository record growth data from July 2006 onwards based on ROAR statistics. For example, the site has a table showing monthly record totals.

Repository Planning Checklist and Guidance Released: Presents Planning Tool for Trusted Electronic Repositories (PLATTER)

DigitalPreservationEurope has released Repository Planning Checklist and Guidance.

Here's an excerpt from the "Executive Summary and Introduction to Platter":

The purpose of this document is to present a tool, the Planning Tool for Trusted Electronic Repositories (PLATTER) which provides a basis for a digital repository to plan the development of its goals, objectives and performance targets over the course of its lifetime in a manner which will contribute to the repository establishing trusted status amongst its stakeholders. PLATTER is not in itself an audit or certification tool but is rather designed to complement existing audit and certification tools by providing a framework which will allow new repositories to incorporate the goal of achieving trust into their planning from an early stage. A repository planned using PLATTER will find itself in a strong position when it subsequently comes to apply one of the existing auditing tools to confirm the adequacy of its procedures for maintaining the long term usability of and access to its material. . . .

The PLATTER process is centred around a group of Strategic Objective Plans (SOPs) through which a repository specifies its current objectives, targets, or key performance indicators in those areas which have been identified as central to the process of establishing trust. In the future, PLATTER can and should be used as the basis for an electronic tool in which repositories will be able to compare their targets with those adopted by other similar (suitably anonymised) repositories. The intention is that the SOPs should be living documents which evolve with the repository, and PLATTER therefore defines a planning cycle through which the SOPs can develop symbiotically with the repository organisation.

OAI-ORE for Fedora: Oreprovider Released

Oskar Grenholm of the National Library of Sweden has released oreprovider, an open-source Java application that "will let you disseminate digital objects stored in a Fedora repository as OAI-ORE Resource Maps."

In the announcement, he says:

The idea behind it all is that you have a Java web application (oreprovider.war) that, on the fly, will generate Resource Maps serialized as Atom feeds (using OAI4J) for objects in Fedora. All you have to do in Fedora is to add information in RELS-EXT what datastreams belongs to which Resource Map (exactly how to do this can be seen at the projects web page).
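To make the idea of a Resource Map "serialized as an Atom feed" more concrete, here is a deliberately simplified sketch in Python. It is not oreprovider's code or output, which is Java and built on OAI4J. The function name, the URIs, and the choice of one Atom entry per datastream are illustrative assumptions, and the sketch omits the ORE-specific terms and several elements that a valid Atom feed requires (such as `updated` and `author`).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace


def build_feed(object_uri, datastream_uris):
    """Return a simplified Atom feed listing an object's datastreams.

    Hypothetical helper for illustration only; not part of oreprovider.
    """
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = object_uri
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = f"Resource Map for {object_uri}"
    for uri in datastream_uris:
        # One entry per datastream that belongs to this Resource Map.
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = uri
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"rel": "alternate", "href": uri})
    return ET.tostring(feed, encoding="unicode")


# Hypothetical object and datastream URIs.
xml_text = build_feed(
    "http://repository.example.org/objects/demo-1",
    [
        "http://repository.example.org/objects/demo-1/PDF",
        "http://repository.example.org/objects/demo-1/DC",
    ],
)
print(xml_text)
```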

DSpace Version 1.5 Released

Version 1.5 of DSpace, which is a major upgrade, has been released.

Here's an excerpt from the announcement:

The DSpace community is pleased to announce the release of DSpace 1.5! This is an important release of DSpace with many new features, including a completely new theme-able Manakin user interface, SWORD integration, many new configurable options, and scalability improvements. . . .

New Features:

  • Maven DSpace 1.5 introduces a new Maven-based build system. Maven is a software tool from Apache that allows developers to compile and distribute software projects. Maven also enables DSpace to be more modular by arranging the software into sub-components. In addition, it makes customizations easier by giving developers the tools to maintain customizations, and provides the ability to manage new features as DSpace continues its accelerating growth rate. . . .
  • Manakin Customize your repository look-and-feel with the new Manakin theme-able user interface. Manakin introduces a new modular framework, enabling an institution to customize their interface according to the specific needs of the particular repository, community, or collection. . . .
  • Light Network Interface Integrate DSpace with legacy or local systems that need to manage content in the repository through the new Light Network Interface. This interface provides a programmatic mechanism to manage content within the repository through a WebDAV or SOAP based protocol. . . .
  • SWORD Integrate with the new SWORD (Simple Web-service Offering Repository Deposit) protocol. Based upon the Atom Publishing Protocol, this interface allows for cross-repository deposit of new content. This protocol may enable future tools that will provide for 'one click' deposit. . . .
  • Browsing The browsing system has been completely re-implemented to provide improved scalability and configuration. The new browsing system enables administrators to easily create new browse indexes. . . .
  • Submissions The item submission system is now more configurable by managing the steps a user follows when submitting a new item to the repository. The new submission system allows for these steps to be rearranged, removed, and even allows for new steps to be added. . . .
  • Events Another under-the-hood improvement introduced in DSpace 1.5 is the event system, which improves scalability and modularity by introducing an event model to the architecture. This feature will allow future add-ons to automatically manage content in the repository based upon when an object has been added, modified, or removed from the system.
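Because SWORD builds on the Atom Publishing Protocol, a deposit is essentially an HTTP POST of a package to a collection URL. The sketch below only constructs such a request with Python's standard library and does not send it. The collection URL, the file name, and the helper function are hypothetical, the exact header conventions are an assumption rather than something taken from the announcement, and authentication and SWORD-specific headers are left out.

```python
import urllib.request


def build_deposit_request(collection_url, package_bytes, filename):
    """Build (but do not send) an AtomPub-style POST that deposits a package.

    Hypothetical helper for illustration; not DSpace's or SWORD's own API.
    """
    return urllib.request.Request(
        collection_url,
        data=package_bytes,
        method="POST",
        headers={
            "Content-Type": "application/zip",
            # Assumed convention for naming the deposited file.
            "Content-Disposition": f"filename={filename}",
        },
    )


# Placeholder bytes stand in for a real zip package.
req = build_deposit_request(
    "http://repository.example.org/sword/deposit/example-collection",
    b"placeholder package bytes",
    "thesis.zip",
)
print(req.get_method(), req.full_url)
```

Nothing goes over the network here; a real client would pass the request to `urllib.request.urlopen` with suitable credentials and then read the Atom entry that the repository returns.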

Microsoft to Unveil Research-Output Repository Platform at Open Repositories 2008

Microsoft will unveil its Windows-based research-output repository platform in early April at Open Repositories 2008. Initially, the software will be used internally to support a repository for Microsoft Research. At a later date, it will be made available for public download, possibly as open-source software.

Here's an excerpt from "Microsoft and 'Research-Output' Repositories":

The platform has a "semantic computing" flavor. The concepts of "resource" and "relationship" are first-class citizens in our platform API. We do offer a number of "research-output"-related entities for those who want to use them (e.g. "technical report", "thesis", "book", "software download", "data", etc.), all of which inherit from "resource". However, new entities can be introduced into the system (even programmatically) while the existing ones can be further extended through the addition of properties. . . .

We are already well into the process of developing a collection of tools and interfaces on top of the platform as tangible examples of how to use it. We already have implementations of OAI-PMH, BibTeX import/export, customized feed syndication service, ASP.NET controls providing access to the repository, and working on Search and a simple Web UI. We are also working on WPF and Silverlight tools for visualizing the relationships between the resources within our repository. . . .

At the Open Repositories 2008 conference, we will formally unveil our work in advance of its official release and initiate interactions/exchanges with the DSpace, EPrints, Fedora, and other players in the repository community. This is crucial to us because—like every other project our group undertakes—we are intensely focused on interoperability.

I want to be very transparent here: our effort is intended to provide a repository option to those institutions/organizations that already license or have access to Microsoft software (including the free versions of the products, like SQL Server Express). Our platform is intended to sit on top of the existing Microsoft "stack". By providing this new research-output repository platform at no cost, we can offer added value for our existing (and future) customers in the academic and research space. It is critical to point out that we are making every effort to ensure our platform is optimized to make the best use of Microsoft technologies AND to also interoperate with all other existing systems and platforms in the repository ecosystem. We are actively seeking engagement and feedback from the community!
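The "resource" and "relationship" model described in the excerpt can be pictured with a small data-structure sketch. This is not Microsoft's API: every class and field name below is hypothetical, and the code only illustrates entities that inherit from a common resource type, can be extended with extra properties, and are linked by explicit relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Base entity; every other entity inherits from it (hypothetical names)."""
    title: str
    properties: dict = field(default_factory=dict)  # open-ended extension point


@dataclass
class Thesis(Resource):
    pass


@dataclass
class TechnicalReport(Resource):
    pass


@dataclass
class Relationship:
    """A first-class link between two resources."""
    subject: Resource
    predicate: str
    obj: Resource


thesis = Thesis("Example thesis")
report = TechnicalReport("Example report")
thesis.properties["degree"] = "PhD"              # extend an entity with a new property
link = Relationship(report, "cites", thesis)     # relate two resources

print(link.subject.title, link.predicate, link.obj.title)
```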

Read more about it at “Microsoft Famulus: New IR Software.”

Microsoft Developing Authoring Add-in for Microsoft Office Word 2007 with NLM DTD Support

Microsoft is developing an Article Authoring Add-in for Microsoft Office Word 2007, which will support the NLM DTD. A Technology Preview of the Add-in is available.

Here's an excerpt from the Technical Computing @ Microsoft—Scholarly Publishing page:

In support of the increased emphasis on electronic publishing and archiving of scholarly articles, Microsoft has developed the Article Authoring Add-in for Microsoft Office Word 2007. This add-in will support the XML format from the National Library of Medicine (NLM), which is commonly used in the scientific, technical, and medical (STM) publishing market as part of the publishing workflow and as the format used for the archiving of articles. Pre-release versions of this add-in will target the staff at STM journals and publishers, at information repositories, and in-house and commercial software developers supporting the STM market.

The Article Authoring Add-in for Word 2007 will enable or simplify a number of activities that are part of the authoring and scholarly publishing process, such as:

  • gathering information about the authors and article content at the time the article is written;
  • enabling journals to provide authors with templates containing the structure for articles, and information for self-classification of the articles by the authors;
  • enabling access to the authors and article metadata contained in the Word file through the use of the NLM format and OpenXML document structure;
  • enabling the editorial staff to have access to the article and journal metadata directly within Word; and
  • enabling two-way conversion between Office OpenXML and the NLM format.

Greg Tananbaum consulted with Microsoft on the development of the tool.

Presentations from the Open Access Collections Workshop Now Available

Presentations from the Australian Partnership for Sustainable Repositories' Open Access Collections workshop are now available. Presentations are in HTML/PDF, MP3, and digital video formats. The workshop was held in association with the Queensland University Libraries Office of Cooperation and the University of Queensland Library.

Iowa Provost Issues Statement about Open Access MFA Theses Dust-Up

MFA students at the University of Iowa have been upset about a requirement that would make their theses available as open access documents either immediately or in two years (if they ask for an extension). A number of student blog postings have protested this requirement. Part of the problem is that MFA theses can be creative works (or other types of works, such as nonfiction works) that may have commercial potential. Peter Suber has analyzed the situation in his "Controversy over OA for Fine Arts Theses and Dissertations" posting.

The Interim Provost, Lola Lopes, has now issued a statement about the conflict.

Here's an excerpt from that statement:

For some time now our library, like most major academic research libraries, has been exploring ways to make its collections more accessible by digitizing some materials. As part of that process, there has been discussion about the possibility of making graduate student dissertations and theses available in electronic format. But any such process must be preceded by developing policies and procedures that allow authors to decide whether and when to allow distribution.

On Monday, March 17, I will begin pulling together a working group with representatives from the Graduate College, University Libraries, our several writing programs, and all other constituencies who wish to be part of the process. Under the leadership of Carl Seashore in 1922, Iowa became the first university in the United States to award MFA degrees based on creative projects. Although this has been a rocky start, I like to think that Iowa will again lead the way by developing policies and procedures that safeguard intellectual property rights while preserving materials for the use of scholars in generations to come.

Read more about it at "Iowa's 'Open Access' Policy Is Nothing but a Trojan Horse"; "Students, UI Grapple over Online Publishing"; "Thesis Policy Sparks Uproar"; "U. of Iowa Writing Students Revolt Against a Plan They Say Would Give Away Their Work on the Web"; and "Writing Students Want UI Not to Give Away Their Work."

Gordon Tibbitts Named as Berkeley Electronic Press CEO

Berkeley Electronic Press, a low-cost scholarly journal publisher whose Digital Commons institutional repository software is widely used, has named Gordon Tibbitts, former President of Blackwell Publishing, as its Chief Executive Officer.

Here's an excerpt from the press release:

Tibbitts comes to bepress after seven years as President of Blackwell Publishing, where he grew the company into the world's leading society publisher, and led the effort to develop an online platform for Blackwell journals. Tibbitts first entered the publishing field in 1980 as Director of Information Systems at Aster Publishing (later Advanstar), before moving to the Thomson Corporation in 1993, where he served as a vice-president until 1999. He holds a BS degree in Computer Science and an MBA from the University of Oregon.

"Gordon Tibbitts is a great match for Berkeley Electronic Press," said Chairman and Co-founder Aaron Edlin. "The past years have seen some great successes at bepress, and we are poised for substantial growth. Gordon is the right person to make it happen—a dynamic, energetic leader with valuable technical and publishing experience and vision."

In addition to his 25 years of experience at major publishing firms, Tibbitts is a founder and board chair of CLOCKSS and board member of LOCKSS, and has served on the Google publishing advisory board and as an advisor to ScholarOne and Atypon Systems, Inc. He frequently speaks and moderates at publishing, library, and technology meetings.

Dealing with Research Data in a Federated Digital Repository: Oxford University Planning Document Released

The Oxford e-Research Centre has released Scoping Digital Repository Services for Research Data Management, a project plan for determining the requirements for handling data in a federated digital repository at Oxford University.

Here's an excerpt from the "Aims and Objectives" section:

Objectives:

  • Capture and document researchers’ requirements for digital repository services to handle research data.
  • Participate actively in the development of an interoperability framework for the federated digital repository at Oxford.
  • Make recommendations to improve and coordinate the provision of digital repository services for research data.
  • Initiate and develop collaborations with the different repository activities already occurring to ensure that communication takes place in between them.
  • Raise awareness at Oxford of the importance and advantages of the active management of research data.
  • Communicate significant national and international developments in repositories to relevant Oxford stakeholders, in order to stimulate the adoption of best practices.

The Texas Digital Library Repository Is Live

Although there appears to have been no formal public announcement about its rollout, the DSpace-based Texas Digital Library Repository is available.

The TDL Repository contains some initial materials (mainly ETDs and Seventeenth-Century News) from three of the four founding TDL members (Texas A&M University at College Station, Texas Tech University at Lubbock, and the University of Texas at Austin; there are no materials from the University of Houston) as well as from the University of Texas at Arlington.

Using Open Journal Systems, TDL also provides access to the Journal of Digital Information, which is supported by the Texas A&M University Libraries.

The Texas Digital Library Shibboleth Federation has made progress in providing Shibboleth access to TDL for three of the four founding members (as of August 2007, the status was: Texas A&M University at College Station, fully deployed; Texas Tech University at Lubbock, agreement reached; and the University of Texas at Austin, fully deployed; there was no activity at the University of Houston). Progress was also being made for Shibboleth access for Baylor University, Texas State University, and the University of North Texas.

Helping Researchers Understand and Label Article Versions: VERSIONS Toolkit Released

The VERSIONS (Versions of Eprints—A User Requirements Study and Investigation Of the Need for Standards) project has released the VERSIONS Toolkit.

Here's an excerpt from the "Introduction":

If you are an experienced researcher you are likely to be disseminating your work on a personal website, in a subject archive, or in an institutional repository already. This toolkit aims to:

  • provide peer-to-peer advice about managing personal versions and revisions in order to keep your options open for future use of your work
  • clarify areas of uncertainty among researchers about agreements with publishers and how these relate to different versions of research outputs
  • suggest ways to identify your work clearly when placing it on the web in order to guide your readers to the latest and best versions of your work
  • direct you to further resources about making versions of your work openly accessible

The toolkit draws on the results of a survey of researchers’ attitudes and current practice when creating, storing and disseminating different versions of their research. As such the guidance in the toolkit represents the views of active researchers. Survey respondents were predominantly from economics and related disciplines.