Interview with Microsoft's Pablo Fernicola about Article Authoring Add-in for Microsoft Office Word 2007

Jon Udell has posted an interview ("Word for Scientific Publishing") with Pablo Fernicola, a Microsoft Group Manager, about the Article Authoring Add-in for Microsoft Office Word 2007 (see my prior posting "Microsoft Developing Authoring Add-in for Microsoft Office Word 2007 with NLM DTD Support"). (Warning: there is a very annoying Silverlight download pop-up that obscures part of the post.)

Udell has also posted a screencast of Fernicola demonstrating the add-in ("Pablo Fernicola Demonstrates the Word Add-In for Scientific Authors").

JorumOpen, UK Repository for Creative Commons Licensed Educational Materials, Announced

JISC has announced JorumOpen, a national repository of open access educational materials under Creative Commons licenses.

Here's an excerpt from the announcement:

It was announced today that Jorum, the UK national repository for learning and teaching materials funded by JISC, is to offer open educational resources. This will make it easier for lecturers and teaching staff to share and re-use each other's teaching resources. JorumOpen—as it will be called—will also provide a showcase for UK universities and colleges on the international stage. . . .

Jorum is managed jointly by EDINA and Mimas, the two National Academic Data Centres funded by JISC at the Universities of Edinburgh and Manchester. During the first phase of Jorum's development, the focus has been on building a system that safeguards investment in digital learning resources and offers controlled access to licensed materials. The result is a service that supports access to over 2,500 learning resources for download for direct use in the classroom and within virtual learning environments (VLEs).

Through the development of JorumOpen, lecturers and teachers will be able to share materials under the Creative Commons licence framework: this makes sharing easier, granting users greater rights for use and re-use of online content and easier to understand. Importantly, it does not require prior registration. As a result availability is global as well as across UK universities and colleges. JorumOpen will run alongside a 'members only' facility, JorumEducationUK, that will support sharing of material just within the UK educational sector; this will be available only to registered users and contributors, as is currently the case.

OCLC Announces Digital Archive Service

OCLC has announced the availability of a Digital Archive service.

Here's an excerpt from the press release:

The service provides a secure storage environment for libraries to easily manage and monitor master files and digital originals. The importance of preserving master files grows as a library's digital collections grow. Libraries need a workflow for capturing and managing master files that finds a balance between the acquisition of both digitized and born-digital content while not outpacing a library's capability to manage these large files. . . .

The Digital Archive service is a specially designed system in a controlled operating environment dedicated to the ongoing managed storage of digital content. OCLC has developed specific systems processes and procedures for the service tuned to the management of data for the long term.

From the time content arrives, the Digital Archive systems begin inspecting it to ensure continuity. OCLC systems perform quality checks and record the results in a "health record" for each file. Automated systems revisit these quality checks periodically so libraries receive up-to-date reports on the health of the collection. OCLC provides monthly updated information for all collections on the personal archive report portal.

For users of CONTENTdm, OCLC's digital collection management software for libraries and other cultural heritage institutions, the Digital Archive service is an optional capability integrated with various workflows for building collections. Master files are secured for ingest to the Digital Archive service using the CONTENTdm Acquisition Station, the Connexion digital import capability and the Web Harvesting service.

For users of other content management systems, the Digital Archive service provides a low-overhead mechanism for safely storing master files.
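The "health record" idea in the excerpt above can be illustrated with a checksum-based fixity check. This is only a minimal sketch: the press release does not say which quality checks OCLC runs, so the use of SHA-256 and the names `sha256_of` and `check_file` are assumptions made for illustration.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: Path, expected: str) -> dict:
    """Build one hypothetical 'health record' entry for a master file."""
    actual = sha256_of(path)
    return {
        "file": path.name,
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "expected": expected,
        "actual": actual,
        "healthy": actual == expected,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        master = Path(tmp) / "master.tif"
        master.write_bytes(b"original scan data")
        baseline = sha256_of(master)  # recorded when the file arrives

        print(check_file(master, baseline)["healthy"])  # unchanged file

        master.write_bytes(b"corrupted scan data")
        print(check_file(master, baseline)["healthy"])  # altered file
```

Rerunning `check_file` on a schedule and keeping each result would give the kind of periodic report the excerpt describes.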

Repositories Support Project Releases Briefing Papers: Open Archives Initiative-Protocol for Metadata Harvesting and Workflows

The Repositories Support Project has released two briefing papers: Open Archives Initiative-Protocol for Metadata Harvesting and Workflows (i.e., digital repository submission workflows). Each paper gives a succinct introduction to its topic.

Two JISC Open Archives Initiative Object Reuse and Exchange Projects

JISC is funding two projects to do small-scale OAI-ORE tests:

TheOREM (Theses with ORE Metadata), at the University of Cambridge, aims to:

  • Test the applicability of the ORE standard in a realistic scholarly setting—thesis description, submission and publication.
  • Demonstrate the advantages of the ORE approach in complex object publication, by combining it with existing web-standards compliant technologies.
  • Provide examples to fully exercise the ORE specifications in order to provide validation and future direction.

FORESITE (Functional Object Reuse and Exchange: Supporting Information Topology Experiments) will create Resource Map descriptions of JSTOR's holdings, and then ingest them into the DSpace institutional repository system via the SWORD protocol, creating external references back to the original files. The description work will be automated, and the system for achieving this implemented at the University of Liverpool. The SWORD protocol will be implemented within DSpace by HP Labs along with other extensions necessary.

For further information, see the FORESITE proposal, A Preview of the TheOREM Project, and the TheOREM proposal.

Isilon's IQ Clustered Storage System Chosen by Michigan and Rice for Digital Repository Storage

Isilon Systems has announced that its IQ Clustered Storage System will be used to support the Michigan Digitization Project and the Rice Digital Scholarship Archive.

Here's an excerpt from the press release about Michigan:

Isilon Systems . . . today announced that the University of Michigan (U-M) has selected Isilon's IQ clustered storage system as the primary repository for its Michigan Digitization Project. In partnership with Google, the University of Michigan and its Michigan Digitization Project are digitizing more than 7.5 million books, ensuring these valuable resources are available to the public into perpetuity. This enormous undertaking includes the storage of digital copies of all unique books within the libraries of the entire Big-Ten Conference and directly supports Google Book Search, which aims to create a single, comprehensive, searchable, virtual card catalog of all books in all languages. The University of Michigan, in partnership with Indiana University (IU), is leveraging Isilon's IQ clustered storage system to create a Shared Digital Repository (SDR) of the universities' published library materials. Using Isilon IQ, U-M and IU are able to consolidate digital copies of millions of books into one, single, shared pool of storage to meet the rapidly growing storage demand of its massive book digitization project. . . .

In conjunction with the Committee for Institutional Cooperation (CIC), an academic partnership formed by the universities of the Big-Ten Conference and the University of Chicago, the University of Michigan and Indiana University are working to create a Shared Digital Repository (SDR) which will mirror the content from U-M and the CIC libraries found in Google Book Search. Using Isilon IQ clustered storage, featuring its OneFS® operating system software, U-M has eliminated disparate data silos to create a shared pool of storage for the digitization efforts of these partner institutions. Each digitized book is approximately 55 MB in size, downloading at a rate of 3 MB/second, 24 hours a day, 7 days a week, for the entire six year duration of the project. Isilon IQ reduces storage management time, enabling U-M to accelerate the book scanning process, preserve valuable materials, and ultimately expand the research and learning capabilities for millions of users across the globe.
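As a rough sanity check on the figures quoted above (55 MB per book, 3 MB/second, six years), the arithmetic can be worked through in a few lines. This is a back-of-the-envelope sketch: it assumes a truly continuous transfer, 365-day years, and decimal units (1 TB = 1,000,000 MB), none of which the press release states.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365        # assumption: leap days ignored
RATE_MB_PER_SECOND = 3     # from the press release
BOOK_SIZE_MB = 55          # from the press release
PROJECT_YEARS = 6          # from the press release


def total_megabytes(rate_mb_s: float, years: float) -> float:
    """Megabytes moved at a constant rate over the given number of years."""
    return rate_mb_s * SECONDS_PER_DAY * DAYS_PER_YEAR * years


def books_transferred(total_mb: float, book_size_mb: float = BOOK_SIZE_MB) -> float:
    """Number of average-sized books that fit in the given volume."""
    return total_mb / book_size_mb


total = total_megabytes(RATE_MB_PER_SECOND, PROJECT_YEARS)
print(f"{total / 1_000_000:.1f} TB")              # 567.6 TB
print(f"{books_transferred(total):,.0f} books")   # 10,320,873 books
```

Under those assumptions the quoted rate works out to roughly 568 TB and about 10.3 million books, the same order of magnitude as the "more than 7.5 million books" cited in the excerpt.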

Here's an excerpt from the press release about Rice:

Isilon . . . today announced that Rice University has selected Isilon's IQ clustered storage system as its central repository for digital multimedia, including video of selected speeches by international dignitaries and musical performances from the Shepherd School of Music. In an effort to preserve the many historic events held at these prestigious venues and ensure the productions are available to the public into perpetuity, Rice has deployed Isilon clustered storage to consolidate hundreds of recorded musical performances and keynote speeches into a single, highly scalable and reliable shared pool of storage for the Rice Digital Scholarship Archive, an institutional repository based on the DSpace software platform. . . .

Through a cooperative effort between Rice University's Digital Library Initiative, Fondren Library and Central IT department, the university has created a central repository for all its critical multi-media content, enabling a variety of departments to execute on vital, content-driven projects simultaneously, activity that was impossible with traditional storage. Prior to using Isilon IQ, Rice's storage management for the Digital Scholarship archiving system was unable to effectively support management of large digital video and audio files that required streaming for delivery. These assets, therefore, were stored on a variety of streaming servers by various groups across campus, creating multiple access bottlenecks that led to inefficient storage management and undue IT cost and complexity. By unifying all of its digital content onto one, easy to use, "pay as you grow" clustered storage system, Rice University has removed costly data access and management barriers and dramatically simplified its storage architecture. Additionally, using Isilon's SmartQuotas provisioning and quota management software application, Rice is also storing its Language Center's multi-media course work and its Central IT department's webcasts on Isilon IQ, delivering immediate, concurrent data access to multiple users and user groups, further reducing storage management costs to maximize system efficiency.

Rice University will stream its collection of musical performances from the Shepherd School, as well as its video library of the many world leaders and dignitaries that have spoken at the Baker Institute, to thousands of users online. This operation necessitates the use of multiple media servers, using Windows, Quicktime and Real Player formats. Isilon clustered storage communicates natively over CIFS, NFS, FTP, and HTTP, as well as interoperating with Windows, Mac and Linux environments, enabling seamless integration with Rice's variety of server formats and enabling all content to be streamed from one, central, easily and immediately accessible storage system. With Isilon IQ, Rice's entire collection of multi-media is accessible to all its servers 24x7x365, ensuring that the media streaming operations are not only efficient and cost-effective, but prepared to meet high user demand.

Summary of Experiences with E-Journal Publishing Software and Institutional Repositories

Sunny Yoon, Digital Resources Coordinator at the City University of New York, posted a query on the CODE4LIB list about the use of e-journal publishing software and its integration into institutional repositories.

She has now posted an interesting summary of responses to her query.

You can also read the replies that were posted to the list under the heading "e-journal publishing software."

Repository Interface for Overlaid Journal Archives: Results from an Online Questionnaire Survey

The RIOJA project has released Repository Interface for Overlaid Journal Archives: Results from an Online Questionnaire Survey.

Here's an excerpt from the "Introduction":

The Repository Interface for Overlaid Journal Archives (RIOJA) project (http://www.ucl.ac.uk/ls/rioja) is an international partnership of members of academic staff, librarians and technologists from UCL (University College London), the University of Cambridge, the University of Glasgow, Imperial College London and Cornell University. It aims to address some of the issues around the development and implementation of a new publishing model, that of the overlay journal – defined, for the purposes of the project, as a quality-assured journal whose content is deposited to and resides in one or more open access repositories. The project is funded by the Joint Information Systems Committee (JISC, http://www.jisc.ac.uk/) and runs from April 2007 to June 2008.

The RIOJA project will create an interoperability toolkit to enable the overlay of certification onto papers housed in subject repositories. The intention is that the tool will be generic, helping any repository to realise its potential to act as a more complete scholarly resource. The project will also create a demonstrator overlay journal, using the arXiv repository and OJS software, with interaction between the two facilitated by the RIOJA toolkit.

To inform and shape the project, a survey of Astrophysics and Cosmology researchers has been conducted. The findings from that survey form the basis of this report.

The project team will also undertake formal and informal discussion with publishers and with academic and managing members of editorial boards. The survey and supplementary discussions will help to ensure that the RIOJA outputs address the needs and expectations of the research community. Finally, the overall long-term sustainability of a repository-overlay journal will be assessed. The project will examine the costs of adding peer review to arXiv deposits, of implementing and maintaining the functionality which the survey shows to be most valued by researchers, and of providing long-term preservation of content, and will aim to identify and appraise possible cost-recovery business models.

Open Repositories 2008 Presentations

Presentations from the Open Repositories 2008 conference are available in the OR08 Publications repository.

The easiest way to find presentations is to use the Browse by Subject capability; however, both simple and advanced search functions are available as well.

Currently, the repository holds over 90 documents. You can track new additions at the Latest Additions to OR08 Publications page (RSS feed). It's anticipated that all documents will be available by 4/13/08.

Project Reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology Retreat

Project reports from the Andrew W. Mellon Foundation's 2008 Research in Information Technology retreat are now available.

Weblog Reports from Open Repositories 2008

Below are selected Weblog reports from Open Repositories 2008.

Ball State University Libraries Move Ahead with Ambitious Digital Initiative Program

The Ball State Libraries have nurtured an ambitious digital initiatives program that has established an institutional repository, a CONTENTdm system for managing digital assets, a Digital Media Repository with over 102,000 digital objects, a Digitization Center and Mobile Digitization Unit, an e-Archives for university records, and a virtual press (among other initiatives). Future goals are equally ambitious.

Read more about it at "Goals for Ball State University Libraries' Digital Initiative."

Tracking Deposit Growth: UK Repository Records Statistics

Chris Keene, Technical Development Manager at the University of Sussex Library, has released UK Repository Records Statistics, which provides UK institutional repository record growth data from July 2006 onwards based on ROAR statistics. For example, the site has a table showing monthly record totals.

Repository Planning Checklist and Guidance Released: Presents Planning Tool for Trusted Electronic Repositories (PLATTER)

DigitalPreservationEurope has released Repository Planning Checklist and Guidance.

Here's an excerpt from the "Executive Summary and Introduction to Platter":

The purpose of this document is to present a tool, the Planning Tool for Trusted Electronic Repositories (PLATTER) which provides a basis for a digital repository to plan the development of its goals, objectives and performance targets over the course of its lifetime in a manner which will contribute to the repository establishing trusted status amongst its stakeholders. PLATTER is not in itself an audit or certification tool but is rather designed to complement existing audit and certification tools by providing a framework which will allow new repositories to incorporate the goal of achieving trust into their planning from an early stage. A repository planned using PLATTER will find itself in a strong position when it subsequently comes to apply one of the existing auditing tools to confirm the adequacy of its procedures for maintaining the long term usability of and access to its material. . . .

The PLATTER process is centred around a group of Strategic Objective Plans (SOPs) through which a repository specifies its current objectives, targets, or key performance indicators in those areas which have been identified as central to the process of establishing trust. In the future, PLATTER can and should be used as the basis for an electronic tool in which repositories will be able to compare their targets with those adopted by other similar (suitably anonymised) repositories. The intention is that the SOPs should be living documents which evolve with the repository, and PLATTER therefore defines a planning cycle through which the SOPs can develop symbiotically with the repository organisation.

OAI-ORE for Fedora: Oreprovider Released

Oskar Grenholm of the National Library of Sweden has released oreprovider, an open-source Java application that "will let you disseminate digital objects stored in a Fedora repository as OAI-ORE Resource Maps."

In the announcement, he says:

The idea behind it all is that you have a Java web application (oreprovider.war) that, on the fly, will generate Resource Maps serialized as Atom feeds (using OAI4J) for objects in Fedora. All you have to do in Fedora is to add information in RELS-EXT what datastreams belongs to which Resource Map (exactly how to do this can be seen at the projects web page).
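To give a feel for what a Resource Map serialized in Atom might look like, here is a small sketch that lists a Fedora object's datastreams as aggregated resources. It does not reproduce oreprovider's or OAI4J's actual output: the element layout follows the `aggregates` link convention of the ORE Atom serialization, the function name `build_resource_map` is hypothetical, and the example URIs are invented.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Link relation used by the ORE Atom serialization for aggregated resources.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


def build_resource_map(aggregation_uri: str, title: str, datastream_uris: list) -> str:
    """Return a simplified Atom entry listing the aggregated datastreams."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = aggregation_uri
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    for uri in datastream_uris:
        ET.SubElement(
            entry,
            f"{{{ATOM_NS}}}link",
            {"rel": ORE_AGGREGATES, "href": uri},
        )
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical Fedora object with two datastreams.
    print(
        build_resource_map(
            "http://example.org/fedora/demo:1#aggregation",
            "Demo object",
            [
                "http://example.org/fedora/demo:1/PDF",
                "http://example.org/fedora/demo:1/THUMBNAIL",
            ],
        )
    )
```

In the real application, the list of datastreams would come from the RELS-EXT information the announcement mentions rather than from a hard-coded list.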

DSpace Version 1.5 Released

Version 1.5 of DSpace, which is a major upgrade, has been released.

Here's an excerpt from the announcement:

The DSpace community is pleased to announce the release of DSpace 1.5! This is an important release of DSpace with many new features, including a completely new theme-able Manakin user interface, SWORD integration, many new configurable options, and scalability improvements. . . .

New Features:

  • Maven: DSpace 1.5 introduces a new Maven-based build system. Maven is a software tool from Apache that allows developers to compile and distribute software projects. Maven also enables DSpace to be more modular by arranging the software into sub-components. In addition, it makes customizations easier by giving developers the tools to maintain customizations, and provides the ability to manage new features as DSpace continues its accelerating growth rate. . . .
  • Manakin: Customize your repository look-and-feel with the new Manakin theme-able user interface. Manakin introduces a new modular framework, enabling an institution to customize their interface according to the specific needs of the particular repository, community, or collection. . . .
  • Light Network Interface: Integrate DSpace with legacy or local systems that need to manage content in the repository through the new Light Network Interface. This interface provides a programmatic mechanism to manage content within the repository through a WebDAV or SOAP based protocol. . . .
  • SWORD: Integrate with the new SWORD (Simple Web-service Offering Repository Deposit) protocol. Based upon the Atom Publishing Protocol, this interface allows for cross-repository deposit of new content. This protocol may enable future tools that will provide for 'one click' deposit. . . .
  • Browsing: The browsing system has been completely re-implemented to provide improved scalability and configuration. The new browsing system enables administrators to easily create new browse indexes. . . .
  • Submissions: The item submission system is now more configurable by managing the steps a user follows when submitting a new item to the repository. The new submission system allows for these steps to be rearranged, removed, and even allows for new steps to be added. . . .
  • Events: Another under-the-hood improvement introduced in DSpace 1.5 is the event system, which improves scalability and modularity by introducing an event model to the architecture. This feature will allow future add-ons to automatically manage content in the repository based upon when an object has been added, modified, or removed from the system.
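The event model described in the last item above is, in general terms, a publish/subscribe arrangement: add-ons register interest in an event type and are called when it occurs. The sketch below illustrates that general pattern only. It is written in Python rather than DSpace's Java, and `EventDispatcher`, `subscribe`, and `publish` are hypothetical names, not the DSpace 1.5 API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventDispatcher:
    """Minimal publish/subscribe hub keyed by event type."""

    def __init__(self) -> None:
        self._consumers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, consumer: Callable[[str], None]) -> None:
        """Register a consumer to be called for one event type."""
        self._consumers[event_type].append(consumer)

    def publish(self, event_type: str, object_id: str) -> None:
        """Notify every consumer registered for this event type."""
        for consumer in self._consumers.get(event_type, []):
            consumer(object_id)


if __name__ == "__main__":
    search_index = set()  # stand-in for an add-on such as a search indexer

    dispatcher = EventDispatcher()
    dispatcher.subscribe("added", search_index.add)
    dispatcher.subscribe("removed", search_index.discard)

    dispatcher.publish("added", "item-42")
    print(search_index)   # contains item-42
    dispatcher.publish("removed", "item-42")
    print(search_index)   # empty again
```

The point of the design is that the code adding or removing an item does not need to know which add-ons are listening.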

Microsoft to Unveil Research-Output Repository Platform at Open Repositories 2008

Microsoft will unveil its Windows-based research-output repository platform in early April at Open Repositories 2008. Initially, the software will be used internally to support a repository for Microsoft Research. At a later date, it will be made available for public download, possibly as open-source software.

Here's an excerpt from "Microsoft and 'Research-Output' Repositories":

The platform has a "semantic computing" flavor. The concepts of "resource" and "relationship" are first-class citizens in our platform API. We do offer a number of "research-output"-related entities for those who want to use them (e.g. "technical report", "thesis", "book", "software download", "data", etc.), all of which inherit from "resource". However, new entities can be introduced into the system (even programmatically) while the existing ones can be further extended through the addition of properties. . . .

We are already well into the process of developing a collection of tools and interfaces on top of the platform as tangible examples of how to use it. We already have implementations of OAI-PMH, BibTeX import/export, customized feed syndication service, ASP.NET controls providing access to the repository, and working on Search and a simple Web UI. We are also working on WPF and Silverlight tools for visualizing the relationships between the resources within our repository. . . .

At the Open Repositories 2008 conference, we will formally unveil our work in advance of its official release and initiate interactions/exchanges with the DSpace, EPrints, Fedora, and other players in the repository community. This is crucial to us because—like every other project our group undertakes—we are intensely focused on interoperability.

I want to be very transparent here: our effort is intended to provide a repository option to those institutions/organizations that already license or have access to Microsoft software (including the free versions of the products, like SQL Server Express). Our platform is intended to sit on top of the existing Microsoft "stack". By providing this new research-output repository platform at no cost, we can offer added value for our existing (and future) customers in the academic and research space. It is critical to point out that we are making every effort to ensure our platform is optimized to make the best use of Microsoft technologies AND to also interoperate with all other existing systems and platforms in the repository ecosystem. We are actively seeking engagement and feedback from the community!
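The "resource" and "relationship" concepts in the excerpt can be pictured with a small data-model sketch: typed entities that inherit from a common resource, open-ended properties, and relationships linking two resources. This is an illustration of the idea only, written in Python; every class and function name here is hypothetical and says nothing about Microsoft's actual platform API.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Base entity; every other entity type inherits from it."""
    title: str
    properties: dict = field(default_factory=dict)  # extensible key/value properties


@dataclass
class TechnicalReport(Resource):
    """A predefined 'research-output' entity with one extra field."""
    report_number: str = ""


@dataclass
class Relationship:
    """A named link between two resources."""
    source: Resource
    predicate: str
    target: Resource


def define_entity(name: str) -> type:
    """Introduce a new entity type programmatically (hypothetical helper)."""
    return type(name, (Resource,), {})


if __name__ == "__main__":
    report = TechnicalReport(title="Repository design notes", report_number="TR-1")

    Dataset = define_entity("Dataset")       # new entity type created at run time
    data = Dataset(title="Survey results")
    data.properties["format"] = "CSV"        # extend an entity with a property

    link = Relationship(report, "is-supported-by", data)
    print(link.source.title, link.predicate, link.target.title)
```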

Read more about it at “Microsoft Famulus: New IR Software.”

Microsoft Developing Authoring Add-in for Microsoft Office Word 2007 with NLM DTD Support

Microsoft is developing an Article Authoring Add-in for Microsoft Office Word 2007, which will support the NLM DTD. A Technology Preview of the Add-in is available.

Here's an excerpt from the Technical Computing @ Microsoft—Scholarly Publishing page:

In support of the increased emphasis on electronic publishing and archiving of scholarly articles, Microsoft has developed the Article Authoring Add-in for Microsoft Office Word 2007. This add-in will support the XML format from the National Library of Medicine (NLM), which is commonly used in the scientific, technical, and medical (STM) publishing market as part of the publishing workflow and as the format used for the archiving of articles. Pre-release versions of this add-in will target the staff at STM journals and publishers, at information repositories, and in-house and commercial software developers supporting the STM market.

The Article Authoring Add-in for Word 2007 will enable or simplify a number of activities that are part of the authoring and scholarly publishing process, such as:

  • gathering information about the authors and article content at the time the article is written;
  • enabling journals to provide authors with templates containing the structure for articles, and information for self-classification of the articles by the authors;
  • enabling access to the authors and article metadata contained in the Word file through the use of the NLM format and OpenXML document structure;
  • enabling the editorial staff to have access to the article and journal metadata directly within Word; and
  • enabling two-way conversion between Office OpenXML and the NLM format.

Greg Tananbaum consulted with Microsoft on the development of the tool.

Preserving Mixed Analog/Digital AV Archives: PrestoSpace Project Case Study

The Digital Curation Centre has published DCC Case Study—PrestoSpace: Preservation towards Storage and Access. Standardised Practices for Audiovisual Contents in Europe.

Here's the "Executive Summary":

Explicit strategies are needed to manage 'mixed' audio visual (AV) archives that contain both analogue and digital materials. The PrestoSpace Project brings together industry leaders, research institutes, and other stakeholders at a European level, to provide products and services for effective automated preservation and access solutions for diverse AV collections. The Project’s main objective is to develop and promote flexible, integrated and affordable services for AV preservation, restoration, and storage with a view to enabling migration to digital formats in AV archives.