Report of the Sustainability Guidelines for Australian Repositories Project (SUGAR)

The Australian Partnership for Sustainable Repositories (APSR) has released Report of the Sustainability Guidelines for Australian Repositories Project (SUGAR).

Here’s an excerpt from the report:

The Sustainability Guidelines for Australian Repositories service (SUGAR) was intended to support people working in tertiary education institutions whose activities do not focus on digital preservation. The target community creates and digitises content for a range of purposes to support learning, teaching and research. While some have access to technical and administrative support, many others may not be aware of what they need to know. The typical SUGAR user may have little interest in discussions surrounding metadata, interoperability or digital preservation, and may simply want to know the essential steps involved in achieving the task at hand.

A key challenge for SUGAR was to provide a suitable level and amount of information to meet the immediate focus of the user and their level of expertise while introducing and encouraging consideration of issues of digital sustainability. SUGAR was also intended to stand alone as an online service unsupported by a helpdesk.

Towards an Open Source Repository and Preservation System

The UNESCO Memory of the World Programme, with the support of the Australian Partnership for Sustainable Repositories, has published Towards an Open Source Repository and Preservation System: Recommendations on the Implementation of an Open Source Digital Archival and Preservation System and on Related Software Development.

Here’s an excerpt from the Executive Summary and Recommendations:

This report defines the requirements for a digital archival and preservation system using standard hardware and describes a set of open source software which could be used to implement it. There are two aspects of this report that distinguish it from other approaches. One is the complete or holistic approach to digital preservation. The report recognises that a functioning preservation system must consider all aspects of a digital repository: Ingest, Access, Administration, Data Management, Preservation Planning and Archival Storage, including storage media and management software. Secondly, the report argues that, for simple digital objects, the solution to digital preservation is relatively well understood, and that what is needed are affordable tools, technology and training in using those systems.

An assumption of the report is that there is no ultimate, permanent storage media, nor will there be in the foreseeable future. It is instead necessary to design systems to manage the inevitable change from system to system. The aim and emphasis in digital preservation is to build sustainable systems rather than permanent carriers. . . .

The way open source communities, providers and distributors achieve their aims provides a model on how a sustainable archival system might work, be sustained, be upgraded and be developed as required. Similarly, many cultural institutions, archives and higher education institutions are participating in the open source software communities to influence the direction of the development of those softwares to their advantage, and ultimately to the advantage of the whole sector.

A fundamental finding of this report is that a simple, sustainable system that provides strategies to manage all the identified functions for digital preservation is necessary. It also finds that for simple discrete digital objects this is nearly possible. This report recommends that UNESCO supports the aggregation and development of an open source archival system, building on, and drawing together existing open source programs.

This report also recommends that UNESCO participates through its various committees, in open source software development on behalf of the countries, communities, and cultural institutions, who would benefit from a simple, yet sustainable, digital archival and preservation system. . . .

ARL’s Library Brown-Bag Lunch Series: Issues in Scholarly Communication

The Association of Research Libraries (ARL) has released a series of discussion guides for academic librarians to use with faculty. The guides are under a Creative Commons Attribution-ShareAlike 3.0 United States license.

Here’s an excerpt from the guides’ web page:

This series of Discussion Leader’s Guides can serve as a starting point for a single discussion or for a series of conversations. Each guide offers prework and discussion questions along with resources that provide further background for the discussion leader of an hour-long session.

Using the discussion guides, library leaders can launch a program quickly without requiring special expertise on the topics. A brown-bag series could be initiated by a library director, a group of staff, or by any staff person with an interest in the scholarly communication system. The only requirements are the willingness to organize the gatherings and facilitate each meeting’s discussion.

The University of Maine and Two Public Libraries Adopt Emory’s Digitization Plan

Library Journal Academic Newswire reports that the University of Maine, the Toronto Public Library, and the Cincinnati Public Library will follow Emory University’s lead and digitize public domain works using Kirtas scanners, with print-on-demand copies made available via BookSurge. (Also see the press release: "BookSurge, an Amazon Group, and Kirtas Collaborate to Preserve and Distribute Historic Archival Books.")

Source: "University of Maine, plus Toronto and Cincinnati Public Libraries Join Emory in Scan Alternative." Library Journal Academic Newswire, 21 June 2007.

Dealing with Data: Roles, Rights, Responsibilities and Relationships

JISC has released its Dealing with Data: Roles, Rights, Responsibilities and Relationships: Consultancy Report, which was written as part of its Digital Repositories Programme’s Data Cluster Consultancy.

Here’s an excerpt from the Executive Summary:

This Report explores the roles, rights, responsibilities and relationships of institutions, data centres and other key stakeholders who work with data. It concentrates primarily on the UK scene with some reference to other relevant experience and opinion, and is framed as "a snapshot" of a relatively fast-moving field. . . .

The Report is largely based on two methodological approaches: a consultation workshop and a number of semi-structured interviews with stakeholder representatives.

It is set within the context of the burgeoning "data deluge" emanating from e-Science applications, increasing momentum behind open access policy drivers for data, and developments to define requirements for a co-ordinated e-infrastructure for the UK. The diversity and complexity of data are acknowledged, and developing typologies are referenced.

Version 68, Scholarly Electronic Publishing Bibliography

Version 68 of the Scholarly Electronic Publishing Bibliography is now available from Digital Scholarship. This selective bibliography presents over 3,040 articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet.

The Scholarly Electronic Publishing Bibliography: 2006 Annual Edition is also available from Digital Scholarship. Annual editions of the Scholarly Electronic Publishing Bibliography are PDF files designed for printing.

The bibliography has the following sections (revised sections are in italics):

1 Economic Issues
2 Electronic Books and Texts
2.1 Case Studies and History
2.2 General Works
2.3 Library Issues
3 Electronic Serials
3.1 Case Studies and History
3.2 Critiques
3.3 Electronic Distribution of Printed Journals
3.4 General Works
3.5 Library Issues
3.6 Research
4 General Works
5 Legal Issues
5.1 Intellectual Property Rights
5.2 License Agreements
6 Library Issues
6.1 Cataloging, Identifiers, Linking, and Metadata
6.2 Digital Libraries
6.3 General Works
6.4 Information Integrity and Preservation
7 New Publishing Models
8 Publisher Issues
8.1 Digital Rights Management
9 Repositories, E-Prints, and OAI
Appendix A. Related Bibliographies
Appendix B. About the Author
Appendix C. SEPB Use Statistics

Scholarly Electronic Publishing Resources includes the following sections:

Cataloging, Identifiers, Linking, and Metadata
Digital Libraries
Electronic Books and Texts
Electronic Serials
General Electronic Publishing
Images
Legal
Preservation
Publishers
Repositories, E-Prints, and OAI
SGML and Related Standards

Council of Australian University Librarians ETD Survey Report

The Council of Australian University Librarians has released Australasian Digital Theses Program: Membership Survey 2006.

Here’s an excerpt from the "Key Findings" section:

1. The average percentage of records for digital theses added to ADT is 95% when digital submission is mandatory and 17% when it is not mandatory. . . .

2. 59% of respondents will have mandatory digital submission in place in 2007.

3. With this level of mandatory submission it is predicted that 60% of all theses produced in Australia and New Zealand in 2007 will have a digital copy recorded in ADT. . . .

5. The overwhelming majority of respondents offer a mediated submission service, either only having a mediated service or offering both mediated and self-submission services. When mediated and self-submission are both available, the percentage self-submitted is polarised with some achieving over a 75% self-submission rate.

6. Over half the respondents have a repository already and most are using it to manage digital theses.

7. 87% will have a repository by the end of this year, and the rest are in the initial planning stage.

CIC’s Digitization Contract with Google

Library Journal Academic Newswire has published a must-read article ("Questions Emerge as Terms of the CIC/Google Deal Become Public") about the Committee on Institutional Cooperation’s Google Book Search Library Project contract.

The article includes quotes from Peter Brantley, Digital Library Federation Executive Director, from his "Monetizing Libraries" posting about the contract (another must-read piece).

Here’s an excerpt from Brantley’s posting:

In other words—pretty much, unless Google ceases business operations, or there is a legal ruling or agreement with publishers that expressly permits these institutions (excepting Michigan and Wisconsin which have contracts of precedence) to receive digitized copies of In-Copyright material, it will be held in escrow until such time as it becomes public domain.

That could be a long wait. . . .

In an article early this year in The New Yorker, "Google’s Moon Shot," Jeffrey Toobin discusses possible outcomes of the antagonism this project has generated between Google and publishers. Paramount among them, in his mind, is a settlement. . . .

A settlement between Google and publishers would create a barrier to entry in part because the current litigation would not be resolved through court decision; any new entrant would be faced with the unresolved legal issues and required to re-enter the settlement process on their own terms. That, beyond the costs of mass digitization itself, is likely to deter almost any other actor in the market.

Report on Chemistry Teaching/Research Data and Institutional Repositories

The JISC-funded SPECTRa project has released Project SPECTRa (Submission, Preservation and Exposure of Chemistry Teaching and Research Data): JISC Final Report, March 2007.

Here’s an excerpt from the Executive Summary:

Project SPECTRa’s principal aim was to facilitate the high-volume ingest and subsequent reuse of experimental data via institutional repositories, using the DSpace platform, by developing Open Source software tools which could easily be incorporated within chemists’ workflows. It focussed on three distinct areas of chemistry research—synthetic organic chemistry, crystallography and computational chemistry.

SPECTRa was funded by JISC’s Digital Repositories Programme as a joint project between the libraries and chemistry departments of the University of Cambridge and Imperial College London, in collaboration with the eBank UK project. . . .

Surveys of chemists at Imperial and Cambridge investigated their current use of computers and the Internet and identified specific data needs. The survey’s main conclusions were:

  • Much data is not stored electronically (e.g. lab books, paper copies of spectra)
  • A complex list of data file formats (particularly proprietary binary formats) being used
  • A significant ignorance of digital repositories
  • A requirement for restricted access to deposited experimental data

Distributable software tool development using Open Source code was undertaken to facilitate deposition into a repository, guided by interviews with key researchers. The project has provided tools which allow for the preservation aspects of data reuse. All legacy chemical file formats are converted to the appropriate Chemical Markup Language scheme to enable automatic data validation, metadata creation and long-term preservation needs. . . .

The deposition process adopted the concept of an "embargo repository" allowing unpublished or commercially sensitive material, identified through metadata, to be retained in a closed access environment until the data owner approved its release. . . .

Among the project’s findings were the following:

  • it has integrated the need for long-term management of experimental chemistry data with the maturing technology and organisational capability of digital repositories;
  • scientific data repositories are more complex to build and maintain than are those designed primarily for text-based materials;
  • the specific needs of individual scientific disciplines are best met by discipline-specific tools, though this is a resource-intensive process;
  • institutional repository managers need to understand the working practices of researchers in order to develop repository services that meet their requirements;
  • IPR issues relating to the ownership and reuse of scientific data are complex, and would benefit from authoritative guidance based on UK and EU law.

NIH Public Access Policy Mandate Needs Immediate Support

The Alliance for Taxpayer Access has issued an action alert regarding a change in the NIH Public Access Policy that would mandate deposit of articles resulting from NIH-funded research. Peter Suber has discussed this issue in relation to a call by ACRL for an NIH mandate.

Here is the alert:

The NIH Public Access Policy is currently under consideration by Congress, as part of the larger FY08 Labor/HHS, Education, and Related Agencies Appropriations Bill. The House is expected to mark up the FY08 Labor/HHS Appropriations Bill on Thursday, June 7th.

Please take action now to express your support for a shift to mandatory policy. Fax your House Representative a letter as soon as possible.

Visit http://www.house.gov for contact information. Constituents of the House Appropriations Labor/HHS Subcommittee are especially encouraged to write. (http://appropriations.house.gov/Subcommittees/sub_lhhse.shtml)

For talking points and background on the NIH Public Access Policy and recent legislative measures, please see the ATA Web site at http://www.taxpayeraccess.org/nih.html.

NIH Policy Status

The House is expected to mark up the FY08 Labor/HHS Appropriations Bill within the week. The bill will then move to the full Appropriations committee. Please stand by for an announcement about House activities from the Alliance for Taxpayer Access in the coming days.

The Senate Appropriations Committee—Labor/HHS Subcommittee is expected to review their versions of appropriations bills later this month.

Emory Will Use Kirtas Scanner to Digitize Rare Books

Emory University’s Woodruff Library will use a Kirtas robotic book scanner to digitize rare books and to create PDF files that will be made available on the Internet and sold as print-on-demand books on Amazon.

Here’s an excerpt from the press release:

"We believe that mass digitization and print-on-demand publishing is an important new model for digital scholarship that is going to revolutionize the management of academic materials," said Martin Halbert, director for digital programs and systems at Emory’s Woodruff Library. "Information will no longer be lost in the mists of time when books go out of print. This is a way of opening up the past to the future."

Emory’s Woodruff Library is one of the premier research libraries in the United States, with extensive holdings in the humanities, including many rare and special collections. To increase accessibility to these aging materials, and ensure their preservation, the university purchased a Kirtas robotic book scanner, which can digitize as many as 50 books per day, transforming the pages from each volume into an Adobe Portable Document Format (PDF). The PDF files will be uploaded to a Web site where scholars can access them. If a scholar wishes to order a bound, printed copy of a digitized book, they can go to Amazon.com and order the book on line.

Emory will receive compensation from the sale of digitized copies, although Halbert stressed that the print-on-demand feature is not intended to generate a profit, but simply help the library recoup some of its costs in making out-of-print materials available.

Google Library Project Adds Committee on Institutional Cooperation (CIC)

The Google Book Search Library Project has an important new participant—the Committee on Institutional Cooperation (CIC). The CIC members are the University of Chicago, the University of Illinois, Indiana University, the University of Iowa, the University of Michigan, Michigan State University, the University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University, and the University of Wisconsin-Madison. As many as 10 million volumes will be digitized from the collections of these major research libraries.

Here’s an excerpt from the CIC press release:

This partnership between our 12 member universities and Google is unprecedented. What makes this work so exciting is that we will literally open the pages of millions of books that have been assembled on our library shelves over more than a century. In literally seconds, we’ll be able to browse across the content of thousands of volumes, searching for words or phrases, and making links across those texts that would have taken weeks or months or years of dedicated and scrupulous analysis. It is an extraordinary effort, blending the efforts and aspirations of librarians, university administrators, and scholars from across 12 world-class research universities. And our corporate partner possesses unparalleled expertise in creating and opening the digital world to coherent and comprehensive searching.

The effort is not entirely without controversy—no great undertaking ever is. But our universities believe strongly in the power of information to change the world, and in preserving, protecting and extending access to information. We have carefully weighed and considered the intellectual property issues and believe that our effort is firmly within the guidelines of current copyright law, while providing some flexibility as those laws are tested in the new digital environment in the coming years.

Repositories as Platforms for Researchers e-Portfolios Podcast

The Australian Partnership for Sustainable Repositories (APSR) has made a podcast of Susan Gibbons’s "Repositories as Platforms for Researchers e-Portfolios" presentation at the Adaptable Repository workshop at the University of Sydney.

PowerPoint slides from the workshop’s presentations are also available.

Lawsuit Aside, McGraw-Hill Uses Google Book Search

According to an article in Network World, McGraw-Hill uses Google Book Search on its Web site in spite of the fact that it is suing Google over the product.

How can this be? McGraw-Hill participates in the Google Book Search Partner Program, which gives publishers control over access to their digitized books, but, at the same time, it objects to Google’s efforts to scan and make available copies of its books in libraries without its permission.

Source: Perez, Juan Carlos. "Google’s Book Search Available in Publisher Sites." Network World, 1 June 2007.

Happy Birthday Open Access News!

Open Access News is five today. OAN’s indefatigable primary author Peter Suber has written over 10,800 OAN postings during this period. Going further back to 2001, he has written 109 issues of the SPARC Open Access Newsletter (formerly called the Free Online Scholarship Newsletter) as well as important papers on open access.

Thanks, Peter. The open access movement owes you a huge debt of gratitude for this fine work.

E-Book Trial on ScienceDirect

Elsevier has announced that it is conducting an e-book trial on ScienceDirect with over 900 research libraries and corporations.

Here’s an excerpt from the press release:

The trial will provide participating institutes with preliminary access to 500 of the 4,000 scientific and technical books that will be launched on ScienceDirect in the third quarter of 2007. . . .

The eBooks program represents a major expansion to the reference works, handbooks and book series already available on ScienceDirect. At launch, the program will comprise high-quality selected titles published from 1995 to the present day. The books will cover a wide range of scientific disciplines, including those published under the renowned Pergamon and Academic Press imprints. Following the launch, approximately 50 newly published titles will be added to the eBooks list on ScienceDirect each month, offering researchers unparalleled integration and linking between the latest online book and journal information.

Here’s a Chance to Hire Walt Crawford

Here’s a rare opportunity to hire a leading thinker in the library profession.

Walt Crawford is looking for work. For those of you who are not librarians and may not have heard of Walt, he is one of the most influential and important figures in the library world, and he was ranked among the most cited authors for the period 1994–2004 in a March 2007 College & Research Libraries article titled "Analysis of a Decade in Library Literature: 1994–2004" (unfortunately this article is not out of the C&RL embargo period yet and is not freely available).

Here’s a reproduction of Walt’s blog posting about this matter:

A special message:

Ever thought you or one of the groups you work for or with could use a Walt Crawford? Here’s your chance.

The RLG-OCLC transition will be complete in September. I’ve received a termination notice from OCLC, effective September 30, 2007.

I’m interested in exploring new possibilities. For now I’m trying not to narrow the options too much.

The basics: A new position could start any time after October 15, 2007 (possibly earlier). January to April 2008 might be ideal as a starting date, but earlier or later is quite possible.

I’m looking for a mutually-beneficial situation, which could be part time, could be full time, could be based on sponsorship of current writing and possible expansion to new areas, could be contract or consulting. I’m open to an exclusive working relationship—but also to more piecemeal possibilities.

Writing is important to me—but so is sensemaking, at the heart of what I’ve done at work and professionally for a few decades. I find numbers interesting (particularly exposing weaknesses in statistical assertions and finding the numbers that make most sense for an organization) and understand them well. I’ve been analyzing, synthesizing, designing (sometimes programming) and communicating throughout my career. I’m interested in the whole range of issues surrounding the intersections of libraries, policy, media and technology, and have demonstrated my effectiveness as a writer and speaker in those areas.

You can get a good sense of what I’ve published here, including my 15 (to date) books and many of the 400+ articles and columns.

I would certainly consider a short-term (say two to four years) situation—but if you have something that makes sense for both of us for a longer term, I have no set retirement date. If I had to name an ideal, it would probably be roughly two-thirds time with benefits (or full time if Cites & Insights was considered part of the job).

Clear limitation: There are very few places we’d be willing to relocate, most of them in temperate parts of the Pacific Rim—that is, California, Oregon, Washington, Hawaii, or maybe Australia or New Zealand. Otherwise, for most possibilities outside of Silicon Valley (or the Tri-Valley area around Livermore), I’d be looking to telecommute—and perfectly willing to travel on a reasonable basis.

If you have acquaintances who are unlikely to see this blog, within "groups that work for/with libraries"—publishers, vendors, search-engine makers, consortia, what have you—where you think I might be a good fit, I’d be delighted if you told them about this. If you’d like to blog about it, please do, saying whatever you like. (Schadenfreude? Be my guest.)

I don’t have a proper resume. I suspect I’m more likely to be hired by someone who knows who I am or is more interested in a full vita, available here. (OK, I’ll be 62 in September and I have an international reputation that is only slightly related to my daytime job: Maybe not the ideal combination for a classic "hit ’em with the keywords" resume.)

Offers, inquiries, questions, comments should go to me at my gmail address: waltcrawford. If you’d like to meet during ALA Annual, let me know.

For those of you who care about Cites & Insights: I have every intention of continuing and, with luck, improving C&I. I have every intention of keeping it free to the reader. I’ve been thinking about a spinoff in an area that I find increasingly important and that requires more room and time than I’ve been giving it—and that spinoff might or might not be free, depending on arrangements that come to light. Naturally, finding the right position will help ensure the future of C&I.

Here’s the brief bio:

Walt Crawford is an internationally recognized writer and speaker on libraries, technology, policy and media. Crawford was for many years Senior Analyst at RLG, focusing on user interface design and actual usage patterns for end-user bibliographic search systems. Through September 30, 2007, he works on RLG-OCLC transition and integration issues.

Crawford is the creator, writer and publisher of Cites & Insights: Crawford at Large, an ejournal on the intersections of libraries, policy, technology and media published monthly since 2001. He also maintains a blog on these and other issues, Walt at Random.

Crawford’s books include Balanced Libraries: Thoughts on Continuity and Change (2007), First Have Something to Say: Writing for the Library Profession (2003), Being Analog: Creating Tomorrow’s Libraries (1999), Future Libraries: Dreams, Madness & Reality (with Michael Gorman, 1995), and eleven others going back to MARC for Library Use: Understanding the USMARC Formats (1984).

Crawford writes the “disContent” column in EContent Magazine and has written columns for American Libraries, Online and Library Hi Tech. In all, he has written more than 400 library-related articles and columns appearing in a range of library publications.

Crawford was recently cited as one of the 31 most frequently cited authors in library literature 1994-2004 (the only American writer on that list outside academic libraries). In 1995, he received the American Library Association’s LITA/Library Hi Tech Award for Excellence in Communication for Continuing Education, followed by the ALCTS/Blackwell Scholarship Award in 1997. He was president of the Library and Information Technology Association in 1992/93.

More information is available at Crawford’s home page.

The REMAP Project: Record Management and Preservation in Digital Repositories

The REMAP Project at the University of Hull has been funded by JISC to investigate how record management and digital preservation functions can be best supported in digital repositories. It utilizes the Fedora system.

Here’s an excerpt from the Project Aims page (I have added the links in this excerpt):

The REMAP project has the following aims:

  • To develop Records Management and Digital Preservation (RMDP) workflow(s) in order to understand how a digital repository can support these activities
  • To embed digital repository interaction within working practices for RMDP purposes
  • To further develop the use of a WSBPEL orchestration tool to work with external Web services, including the PRONOM Web services, to provide appropriate metadata and file information for RMDP
  • To develop and test a notification layer that can interact with the orchestration tool and allow RSS syndication to individuals alerting them to RMDP tasks
  • To develop and test an intermediate persistence layer to underpin the notification layer and interact with the WSBPEL orchestration tool to allow orchestrated workflows to take place over time
  • To test and validate the use of the enhanced WSBPEL tool with institutional staff involved in RMDP activities

What Does Out of Print Mean in a POD Era?

A contract language change by Simon & Schuster that treats all its books available by print-on-demand technology as "in print" has raised the hackles of the Authors Guild. The issue is that as long as a book is in print, the rights do not revert to the author, who could otherwise look for another publisher who would actively promote the book and boost sales.

Source: Rich, Motoko. "Publisher and Authors Parse a Term: Out of Print." The New York Times, 18 May 2007, C3.