JISC Digital Repositories and Archives Inventory Project Catalogs 3,707 Free Digital Collections

With the completion of phase two of the project, the JISC Digital Repositories and Archives Inventory project has cataloged 3,707 free digital collections. The phase two records will be added to the JISC Information Environment Service Registry (IESR), which already contains the phase one records.

Here's an excerpt from the announcement:

The brief of the inventory was to identify all the repositories and archives in the UK that are relevant to UK higher education and are free at point of use. For the purposes of this project a very loose definition of repositories and archives was used. The only sites that were excluded were those that restricted access and those with little or no structure.

Phase 1 of the project discovered 1,924 resources and phase 2 discovered 1,783. The records from phase 1 are already in the IESR and records from phase 2 will be added soon.

Phase 2 also enriched the metadata collected about all the resources and contacted resource owners to approve or extend the data collected about their resources. This produced a very positive response with approximately 800 resource owners providing extra information about their collections.

The project has released its final report, JISC Final Report—Digital Repositories and Archives Inventory Project.

Fedora Commons Wiki Re-Launched

The Fedora Commons Wiki has been re-launched using Atlassian’s Confluence enterprise Wiki software.

Here's an excerpt from the announcement:

The new Fedora Commons Wiki provides a stable environment for developing Fedora software, documentation and communities. The new wiki features additional personalization and development tools for communication and tracking including a feature long-requested by the community–automated account registration with "Capcha." So it’s easy to join our community while making it difficult for spammers.

Please note that you do not need an account to read the Fedora Commons wiki—it’s open to everyone. You must register at the new wiki, however, if you want to add comments, articles, ask for help, or participate with other members of the community.

Ithaka’s 2006 Studies of Key Stakeholders in the Digital Transformation in Higher Education

Ithaka Harbors has published Ithaka’s 2006 Studies of Key Stakeholders in the Digital Transformation in Higher Education.

Here's an excerpt from the Faculty and Librarian Surveys Web page:

Our 2006 survey of faculty members sought to determine their attitudes related to online resources, electronic archiving, teaching and learning and related subjects. This study affords the opportunity to develop trend analysis of many measurements that we collected in the 2003 and 2000 faculty surveys. . . . In 2006, for the first time, we are also able to offer extensive comparison with the attitudes and perspectives of academic librarians on the perceived roles of the library and librarian on campuses; the impact of transitioning to electronic material on library practices; the place of digital repositories in the campus information-services landscape; and the future plans of academic libraries. Librarians surveyed include both directors and collection development leaders from a wide variety of 4-year academic institutions across the United States.

Repositories Support Project Launches RSP Blog Directory

The Repositories Support Project has launched the RSP Blog Directory.

Here's an excerpt from the announcement:

It provides a list of recommended and informative blogs regarding the repository scene from around the globe. Listed blogs include personal creations from those with first hand experience of repository management and/or technical development of repository software; blogs for specific repositories, projects and software developers; as well as blogs for groups and societies with an interest in the open access movement and digital curation.

Presentations/Reports from the JISC/CNI Meeting on Transforming the User Experience

Presentations are available from the JISC/CNI meeting on Transforming the User Experience.

Here's a selection:

Helen Aguera, Senior Program Officer at the National Endowment for the Humanities, has also reported on the conference in a series of Weblog postings:

Fedora 2.2.3 Released: Important Security Fix

Fedora Commons has released version 2.2.3 of Fedora, which contains an important security fix.

Here's an excerpt from the press release:

Today Fedora Commons released version 2.2.3 of the popular Fedora software that includes the repair of a serious security defect and several bug fixes. Dan Davis, Chief Software Architect, Fedora Commons, explained, "Every installation of Fedora 2 should update to 2.2.3 due to the security update. There have been no exploits that have been discovered but it is important to maintain repositories at the latest security update level." Fedora 2.2.3 is strictly a maintenance update; new features may be found in Fedora 3.0, which was released for general availability on July 29th. Also, the license has been changed to the familiar Apache License 2.0 for Fedora 2.2.3. Fedora 2 will be maintained until August 2009 and thereafter be placed in an "end of life" status. At least one more release of Fedora 2 is planned though there may be additional releases to fix critical defects.

A Look at the Development and Future of Scholarly Communication in High Energy Physics

Robert Aymar, Director-General of CERN, has deposited an e-print of "Scholarly Communication in High-Energy Physics: Past, Present and Future Innovations" in the CERN Document Server.

Here's an excerpt from the abstract:

Unprecedented technological advancements have radically changed the way we communicate and, at the same time, are effectively transforming science into e-Science. In turn, this transformation calls for an evolution in scholarly communication. This review describes several innovations, spanning the last decades of scholarly communication in High Energy Physics: the first repositories, their interaction with peer-reviewed journals, a proposed model for Open Access publishing and a next-generation repository for the field.

Of particular interest is his description of the INSPIRE Project, "a fully integrated HEP information platform for the future," that will have "text- and data-mining applications, citation analysis and other tools, and Web 2.0 features."

For further information about INSPIRE, see "Information Systems in HEP get INSPIREd" and the INSPIRE Wiki.

Responses to Chris Rusbridge's Proposed Research Repository System

Chris Rusbridge, Director of the Digital Curation Centre, has summarized responses that he has received to his proposed Research Repository System.

Here's an excerpt from "Negative Click, Positive Value Research Repository Systems," where he outlined the key features of the system.

The main elements that I think the RRS should support are (not in any particular order):

Major Upgrade: Fedora 3.0 Released

Fedora Commons has released version 3.0 of Fedora, which "completes all general release features."

Here's an excerpt from the press release:

Dan Davis, Chief Software Architect, Fedora Commons, explained, "We are pleased to offer a Fedora 3.0 that is a foundational step towards a model-driven content architecture." He went on to say, "Users will find it simpler to maintain and operate their repositories with version 3.0-it's more scalable and fits better into the Web."

Fedora 3.0 features the Content Model Architecture (CMA), an integrated structure for persisting and delivering the essential characteristics of digital objects in Fedora. The software is available at http://www.fedora-commons.org/ and at http://sourceforge.net/projects/fedora-commons. The Fedora CMA plays a central role in the Fedora architecture, and in many ways forms the over-arching conceptual framework for future development of Fedora Repositories. Fedora 3.0 features include:

Overview of new Features in Fedora 3.0 Release

  • Content Model Architecture—Provides a model-driven approach for persisting and delivering the essential characteristics of digital content in Fedora
  • Fedora REST API—A new API that exposes a subset of the Access and Management API using a RESTful Web interface contributed by MediaShelf
  • Mulgara Support—Fedora supports the Mulgara 2.0 Semantic Triplestore replacing Kowari
  • Migration Utility—Provides an update utility to convert existing collections for Content Model Architecture compatibility
  • Relational Index Simplification—The Fedora schema was simplified making changes easier without having to reload the database and significantly increasing scalability
  • Dynamic Behaviors—Objects may be added or removed dynamically from the system moving system checks into run-time errors
  • Error Reporting—Provides improved run-time error details
  • Multiple Owner as a CSV String—Enables using a CSV string as ownerID and in XACML policies
  • Java 6 Compatibility—Fedora may be optionally compiled using Java 6 while retaining support for Java Enterprise Edition 1.5 deployments
  • Relationships API—API-M has been extended to enable adding, removing, and discovering RDF relations between Fedora objects
  • Revised Fedora Object XML Schemas—The new schemas are simpler, supporting the CMA and removing Disseminators
  • Atom Support—Fedora objects can now be imported and exported in the Atom format
  • Messaging Support—Integrates JMS messaging for sending notification of important events
  • Validation Framework—Provides system operators a way to validate all or part of their repository, based on content models
  • 3.0-Compatible Service Releases—New versions of the OAI Provider and GSearch services are compatible with Fedora 3.0. The GSearch release also enables messaging support for GSearch, which allows for more robust and seamless integration with the Fedora repository.
  • Many new enhancements—see the Release Notes:
    http://www.fedora-commons.org/documentation/3.0b2/userdocs/distribution/release-notes.html

New CONTENTdm Add-on: OCLC Web Harvester

OCLC has announced the availability of Web Harvester, which allows CONTENTdm sites to import Web content into their systems.

Here's an excerpt from the press release:

OCLC's Web Harvester evolved from collaboration with several state libraries, state archives and universities over a period of seven years. Participants emphasized the increasing importance of collecting and managing Web-based content as information resources move online yet remain within libraries' and archives' collection scopes.

The Web Harvester is integrated into library workflows, allowing library staff to capture content as part of the cataloging process. The captured content is then sent to the organization's digital collections where it can be managed with other CONTENTdm digital content. . . .

The Web Harvester is accessed via the Connexion client, OCLC's powerful cataloging service, and captures content ranging from single, Web-based documents to entire Web sites. Once retrieved, users can review the captured Web content and add it to a collection managed by OCLC's CONTENTdm software, a complete solution for storing, managing and delivering a library's digital collections to the Web. Once in CONTENTdm, the Web content can be accessed and managed in conjunction with other digital collections. Harvested items are discoverable from WorldCat.org, WorldCat Local and the CONTENTdm Web interface.

For additional security, master files of the captured content also can be ingested to the OCLC Digital Archive, the service for long-term storage of originals and master files from libraries' digital collections.

OpenDOAR/Google Maps Mashup

OpenDOAR is mapping repository data using Google Maps.

Here's an excerpt from the announcement:

SHERPA is pleased to announce the addition of a Google Maps extension to OpenDOAR, its directory of open access repositories (http://www.opendoar.org/find). Just run any search of the directory, and then change the output format from "Summaries" to "Google Map".

Here are a few examples:

1. http://www.opendoar.org/find?format=gmap&cID=jp
—Repositories in Japan . . .

3. http://www.opendoar.org/find?format=gmap&cID=us&ctID=6
—United States repositories holding theses & dissertations

4. http://www.opendoar.org/find?format=gmap&search=Nottingham
—Keyword search for "Nottingham"

5. http://www.opendoar.org/find?format=gmap&rSoftWareName=CONTENTdm
—Repositories using CONTENTdm software
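
The example URLs above share one pattern: the directory's find page plus a query string in which format=gmap selects the map output and the remaining parameters narrow the search. As a minimal sketch of that pattern, the following Python builds the same URLs. The helper name gmap_url is hypothetical, the parameter names (cID, ctID, search, rSoftWareName) are copied from the examples in the announcement, and nothing here is checked against the live service.

```python
from urllib.parse import urlencode

# Base search URL quoted in the announcement.
BASE_URL = "http://www.opendoar.org/find"


def gmap_url(**filters):
    """Build a map-view search URL (hypothetical helper, not an OpenDOAR API).

    The filter names are taken from the announcement's example URLs;
    whether other parameters exist is not covered here.
    """
    params = {"format": "gmap", **filters}
    return f"{BASE_URL}?{urlencode(params)}"


# Rebuild the examples listed in the announcement.
print(gmap_url(cID="jp"))                   # repositories in Japan
print(gmap_url(cID="us", ctID=6))           # US theses and dissertations
print(gmap_url(search="Nottingham"))        # keyword search
print(gmap_url(rSoftWareName="CONTENTdm"))  # repositories using CONTENTdm
```

Using urlencode rather than string concatenation means a keyword containing spaces or ampersands is escaped correctly.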

DSpace Foundation and Fedora Commons Announce Decision to Collaborate

The DSpace Foundation and Fedora Commons have announced that they will collaborate on future digital repository initiatives.

Here's an excerpt from the press release:

Today two of the largest providers of open source software for managing and providing access to digital content, the DSpace Foundation and Fedora Commons, announced plans to combine strengths to work on joint initiatives that will more closely align their organizations' goals and better serve both open source repository communities in the coming months. . . .

The collaboration is expected to benefit over 500 organizations from around the world who are currently using either DSpace (examples include MIT, Rice University, Texas Digital Library and University of Toronto) or Fedora (examples include the National Library of France, New York Public Library, Encyclopedia of Chicago and eSciDoc) open source software to create repositories for a wide variety of purposes. . . .

The decision to collaborate came out of meetings held this spring where members of DSpace and Fedora Commons communities discussed multiple dimensions of cooperation and collaboration between the two organizations. Ideas included leveraging the power and reach of open source knowledge communities by using the same services and standards in the future. The organizations will also explore opportunities to provide new capabilities for accessing and preserving digital content, developing common web services, and enabling interoperability across repositories.

In the spirit of advancing open source software, Fedora Commons and DSpace will look at ways to leverage and incubate ideas, community and culture to:

  1. Provide the best technology and services to open source repository framework communities.
  2. Evaluate and synchronize, where possible, both organizations' technology roadmaps to enable convergence and interoperability of key architectural components.
  3. Demonstrate how the DSpace and Fedora open source repository frameworks offer a unique value proposition compared to proprietary solutions.

The announcement came on the heels of an event sponsored by the Joint Information Systems Committee's (JISC) Common Repository Interface Group (CRIG) held at the Library of Congress. The event, known as "RepoCamp," was a forum where developers gathered to discuss innovative approaches to improving interoperability and web-orientation for digital repositories. Sandy Payette, Executive Director of Fedora Commons, and Michele Kimpton, Executive Director of the DSpace Foundation, reiterated their commitment to collaboration and encouraged input and participation from both communities as work gets underway.

Oxford Releases Report on Digital Repository Services for Research Data Management

The Oxford University Office of the Director of IT has released Findings of the Scoping Study Interviews and the Research Data Management Workshop: Scoping Digital Repository Services for Research Data Management.

Here's an excerpt from the report's Web page:

The scoping study interviews aimed to document data management practices from Oxford researchers as well as to capture their requirements for services to help them manage their data more effectively. In order to do this, 37 face-to-face interviews were conducted between May and June with researchers from 27 colleges, departments and faculties. In addition to this, the Research Data Management Workshop was organised to complement the findings of the scoping study interviews.

APSR Releases Investigating Data Management Practices in Australian Universities

The Australian Partnership for Sustainable Repositories has released Investigating Data Management Practices in Australian Universities.

Here's an excerpt from the report's Web page:

In late 2007, The University of Queensland undertook a survey of data management practices among the university’s researchers. This was done in response to the increasing realisation that repositories need to include research data, in addition to the research outputs in print form already included, and to provide information which would enhance the support provided for those engaged in eResearch.

The survey was carried out using the Apollo software developed at The Australian National University and adapted by APSR. Two other universities, The University of Melbourne and the Queensland University of Technology, have now replicated the survey among their own communities, while adding some questions of local interest.

The survey covers questions such as the types of digital data being created (spreadsheets, documents, experimental data, images, fieldwork data, etc), the size of the data collection, software used for data analysis, data storage and backup, application of a data management plan, roles and responsibilities around data management, copyright frameworks, usage of high capacity computing, and much more.

Microsoft’s Free Digital Tools for Scholars

At the ninth annual Microsoft Research Faculty Summit, Tony Hey, Corporate Vice President of Microsoft’s External Research Division, discussed a variety of digital tools for scholars.

Here's an excerpt from the press release:

Add-ins. The Article Authoring Add-in for Word 2007 enables metadata to be captured at the authoring stage to preserve document structure and semantic information throughout the publishing process, which is essential for enabling search, discovery and analysis in subsequent stages of the life cycle. The Creative Commons Add-in for Office 2007 allows authors to embed Creative Commons licenses directly into an Office document (Word, Excel or PowerPoint) by linking to the Creative Commons site via a Web service.

The Microsoft e-Journal Service. This offering provides a hosted, full-service solution that facilitates easy self-publishing of online-only journals to facilitate the availability of conference proceedings and small and medium-sized journals.

Research Output Repository Platform. This platform helps capture and leverage semantic relationships among academic objects—such as papers, lectures, presentations and video—to greatly facilitate access to these items in exciting new ways.

The Research Information Centre. In close partnership with the British Library, this collaborative workspace will be hosted via Microsoft Office SharePoint Server 2007 and will allow researchers to collaborate throughout the entire research project workflow, from seeking research funding to searching and collecting information, as well as managing data, papers and other research objects throughout the research process.

Here's a list that indicates availability.

  • Article Authoring Add-in version 1.0 for Microsoft Office Word 2007 (download)
  • Creative Commons Add-in version 1.0 for Microsoft Office (download)
  • Microsoft Math Add-in for Microsoft Office Word 2007 (download)
  • Microsoft eJournal Service (alpha preview)
  • Research Output Repository Platform ("Currently in a limited alpha release, an open beta version will be available later in 2008.")
  • Research Information Centre ("This service is currently in beta testing. Microsoft intends to share the code widely by the end of the year.")

DSpace Can Support One Million Items

A paper by researchers from the National Library of Medicine ("Testing the Scalability of a DSpace-based Archive") finds that DSpace can support an archive with a million items. The tested system "is built upon MIT's DSpace software (Version 1.4), with some modifications and enhancements to better facilitate batch based processing."

Here's an excerpt from the conclusion:

We conclude that the version of DSpace used in SPER (with MySQL database) shows acceptable ingest performance for a million-item archive. . . .

The experimental results shown here pertain to items with mostly one or two monochrome TIFF images, though a few items have up to 100 images. However, a number of inferences may be derived from these results.

  • No real problems were found in ingesting a million items to the archive, using a Sun X4500 server machine, in terms of either performance or reliability of the SPER/DSpace software architecture and implementation. . . .
  • With the increase in archive size, the average ingest time of an item increases in a smooth and predictable way.
  • With increasing number of TIFF images, the ingest time (per item) increases by three to four percent for each additional image.
  • If color TIFF images were used, the ingest times would increase slightly due to the overhead of copying additional data to the upload area, and to the archive's asset storage. However, other archival overheads should not change.
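
To get a rough feel for the per-image figure quoted above, here is a small Python sketch. It assumes the three to four percent increase is linear, that is, each additional image adds that fraction of the one-image ingest time. The excerpt does not say whether the growth is linear or compounding, so this is only one possible reading, and the function name is hypothetical.

```python
def relative_ingest_time(num_images, per_image_increase):
    """Ingest time relative to a one-image item.

    Assumption (not stated in the paper excerpt): growth is linear, so
    each image beyond the first adds per_image_increase of the baseline.
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return 1.0 + per_image_increase * (num_images - 1)


# Compare the low (3%) and high (4%) ends of the quoted range.
for n in (1, 2, 10, 100):
    low = relative_ingest_time(n, 0.03)
    high = relative_ingest_time(n, 0.04)
    print(f"{n:>3} images: {low:.2f}x to {high:.2f}x the one-image ingest time")
```

Under this reading, an item at the excerpt's upper bound of 100 images would take roughly four to five times as long to ingest as a single-image item.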

Texas Digital Library Hosts Second E-Journal

The Texas Digital Library is hosting the Journal of Virtual Worlds Research. The first issue is now available.

Articles in the Journal of Virtual Worlds Research are freely available in PDF format and are licensed under a Creative Commons Attribution-No Derivative Works 3.0 United States License.

The journal is edited by Jeremiah Spence, a doctoral student at the University of Texas at Austin's College of Communication.

The Texas Digital Library also hosts the Journal of Digital Information. Articles in the Journal of Digital Information are freely available in the PDF or HTML formats, and authors retain the copyright to them. Supported by the Texas A&M University Libraries, it is edited by Cliff McKnight, Professor of Information Studies at Loughborough University, and Scott Phillips, Research and Development Coordinator at the Texas A&M University Libraries' Digital Initiatives department.