May 2009 – DigitalKoans

DigitalKoans Break

DigitalKoans postings will resume on 6/8/09.

Blog Reports from Open Repositories 2009

Below are some blog reports from the Open Repositories 2009 conference.

Mark Leggott

H.J. (Driek) Heesakkers

Peter Sefton

Open Repositories 2009 Trip Report

Keeping Research Data Safe 2: The Identification of Long-lived Digital Datasets for the Purposes of Cost Analysis: Project Plan

Charles Beagrie has released Keeping Research Data Safe 2: The Identification of Long-lived Digital Datasets for the Purposes of Cost Analysis: Project Plan.

Here's an excerpt from the project home page:

The Keeping Research Data Safe 2 project commenced on 31 March 2009 and will complete in December 2009. The project will identify and analyse sources of long-lived data and develop longitudinal data on associated preservation costs and benefits. We believe these outcomes will be critical to developing preservation costing tools and cost benefit analyses for justifying and sustaining major investments in repositories and data curation.

Durable Digital Media: How Does a Billion Years Sound?

A recent article in Nano Letters describes an experimental nanotechnology-based storage device that could last for a billion years and store up to one terabyte per square inch.

Read more about it at "New Memory Material May Hold Data For One Billion Years."

Introducing Copyright: A Plain Language Guide to Copyright in the 21st Century

The Commonwealth of Learning has published Introducing Copyright: A Plain Language Guide to Copyright in the 21st Century as a PDF file under a Creative Commons Attribution-Noncommercial-No Derivative Works license.

Here's an excerpt from the announcement:

This book was written for those who want to learn about copyright in the 21st century. It explains copyright protection and what it means for copyright holders and copyright users. It also introduces readers to contemporary topics: digital rights management, open licences, software patents and copyright protection for works of traditional knowledge. A final chapter tries to predict how technology will change the publishing and entertainment industries that depend on copyright.

The book assumes no special knowledge and avoids technical language as much as possible.

“Deal or No Deal: What If the Google Settlement Fails?”

In "Deal or No Deal: What If the Google Settlement Fails?," Andrew Richard Albanese examines the uncertain future of the Google Book Search Settlement Agreement.

Here's an excerpt:

"This thing is going to die," one close observer of the settlement told PW [Publishers Weekly]. "Let's put it this way—with all the sketchy things in the agreement, there is no way [the parties] want people to look at this longer, rather than shorter."

DISC-UK DataShare Project: Final Report

JISC has released DISC-UK DataShare Project: Final Report.

Here's an excerpt:

The DISC-UK DataShare Project was funded from March 2007-March 2009 as part of JISC's Repositories and Preservation programme, Repositories Enhancement strand. It was led by EDINA and Edinburgh University Data Library in partnership with the University of Oxford and the University of Southampton. The project built on the existing informal collaboration of UK data librarians and data managers who formed DISC-UK (Data Information Specialists Committee–UK).

This project has brought together the distinct communities of data support staff in universities and institutional repository managers in order to bridge gaps and exploit the expertise of both to advance the current provision of repository services for accommodating datasets, and thus to explore new pathways to assist academics at our institutions who wish to share their data over the Internet. The project's overall aim was to contribute to new models, workflows and tools for academic data sharing within a complex and dynamic information environment which includes increased emphasis on stewardship of institutional knowledge assets of all types; new technologies to enhance e-Research; new research council policies and mandates; and the growth of the Open Access / Open Data movement.

With three institutions taking part plus the London School of Economics as an associate partner, a range of exemplars have emerged from the establishment of institutional data repositories and related services. Part of the variety in the exemplars is a result of the different repository platforms used by the three project partners: DSpace (Edinburgh DataShare), ePrints (e-Prints Soton) and Fedora (Oxford University Research Archive, ORA)–all open source software. LSE took another route and is using the distributed Dataverse repository network for data, linking to publications in LSE Research Online. Also, different approaches were taken in setting up the repositories. All three institutions had an existing, well-used institutional repository, but two chose to incorporate datasets within the same system as the publications, and one (Edinburgh DataShare) was a paired repository exclusively for datasets, designed to interoperate with the publications repository (Edinburgh Research Archive). The approach took a major turn midway through the project when an apparent solution to the problem of lack of voluntary deposits arose, in the form of the advent of the Data Audit Framework. Edinburgh participated as a partner in the DAF Development project which created the methodology for the framework, and also won a bid to carry out its own DAF Implementation project. Later, the other two partners conducted their own versions of the data audit framework under the auspices of the DataShare project.

A number of scoping activities were carried out by the partners with the goal of informing repository enhancement as well as broader dissemination. These included a State-of-the-Art-Review to determine what had been learned by previous repository projects in the UK that had forayed into the data arena. This resulted in a list of benefits and barriers to deposit of datasets by researchers to inform our outreach activities. A Data Sharing Continuum diagram was developed to illustrate where the projects were aiming to fit into the curation landscape, and the range of curation steps that could be taken, from simple backup to online visualization. Later on, a specialized metadata schema was explored (Data Documentation Initiative or DDI) in terms of how it might be incorporated into repository systems, though repository development in this area was not taken up. Instead, a dataset application profile was developed based on qualified Dublin Core (dcterms). This was implemented in the Edinburgh DataShare repository and adapted by Southampton for their next release. The project wished to explore wider issues with open data and web publishing, and therefore produced two briefing papers to do with data mashups–on numeric data and geospatial data. Finally, the project staff and consultant distilled what it had learned in terms of policy development for data repositories in a training guide. A number of peer reviewed posters, papers, and articles were written by DISC-UK members about various aspects of the project during the period.

Key conclusions were that 1) Data management motivation is a better bottom-up driver for researchers than data sharing but is not sufficient to create culture change, 2) Data librarians, data managers and data scientists can help bridge communication between repository managers & researchers, and 3) IRs can improve impact of sharing data over the internet.

Digital Preservation: PARSE.Insight Project Reports on First Year Achievements

In "Annual Review Year 1: Goals and Achievements," the PARSE.Insight (Permanent Access to the Records of Science in Europe) Project reports on its first-year achievements. This post includes links to a number of longer documents, including the PARSE.Insight Deliverable D2.1 Draft Roadmap.

Here's an excerpt from the PARSE.Insight Deliverable D2.1 Draft Roadmap:

The purpose of this document is to provide an overview and initial details of a number of specific components, both technical and non-technical, which would be needed to supplement existing and already planned infrastructures for science data. The infrastructure components presented here are aimed at bridging the gaps between islands of functionality, developed for particular purposes, often by other European projects, whether separated by discipline or time. Thus the infrastructure components are intended to play a general, unifying role in science data. While developed in the context of a European wide infrastructure, there would be great advantages for these types of infrastructure components to be available much more widely.

Library IT Jobs: Systems/Electronic Resources Librarian at Pratt Institute

The Pratt Institute Libraries are recruiting a Systems/Electronic Resources Librarian.

Here's an excerpt from the ad:

Under the supervision of the Director of Libraries, work in a collaborative environment within the Systems team which is responsible for the overall management and support of the Millennium integrated library system and other library applications in the Pratt Institute Libraries. This is a full-time 12 month per year tenure-track faculty position at the Assistant Professor rank.

Welsh Repository Network Final Report

JISC has released the Welsh Repository Network Final Report.

Here's an excerpt:

The aim of the Welsh Repository Network (WRN) was to put in place an essential building block for the development of an integrated network of institutional digital repositories in Wales. The project entailed a centrally managed hardware procurement programme designed to provide every HEI in Wales with dedicated and configured repository hardware. In close collaboration with the technical, organisational and operational support specifically provided for Welsh Higher Education Institutions (HEIs) within the JISC funded Repositories Support Project (RSP), also delivered from Aberystwyth University, this initiative provided a cost-effective, collaborative and decisive boost to the repository agenda in Wales and helped JISC achieve the critical mass of populated repositories and digital content that is a stated objective of the Repositories and Preservation Programme.

The project employed a three-stage approach: requirements gathering, procurement and installation, and monitoring and evaluation. Extensive site visits and regular communication with project partners were a fundamental aspect of project activity and a variety of models were used for procuring hardware including collaborative approaches, outsourcing to commercial software and establishing hosting agreements.

At its most practical level the principal deliverable of the WRN project has been the provision of repository hardware capacity in each and every HEI in Wales which, in combination with the hands-on technical support provided by the RSP, enabled all 12 HEIs to have functional institutional repositories by March 2009. More generally, the project has contributed a series of case studies and test sites that provide the wider JISC community with practical insights into the process of matching alternative organisational models, repository types and hardware configurations to different geographical and institutional settings. The main conclusion to be drawn from the WRN is that while providing funds for procuring hardware helps to push repository development up the institutional agenda, the support that goes with the funding, especially the technical support, is a far more crucial factor in generating a successful and lasting outcome.

Read more about it at the project Web site.

Digital Library Jobs: Digital Repositories Manager at University of Leeds

The Leeds University Library is recruiting a Digital Repositories Manager.

Here's an excerpt from the ad:

You will manage the University’s digital repositories—created to contain resources in a variety of formats in order to support both research and learning & teaching. You will lead the development and monitoring of repository services, and ensure that the Library offers excellent support for University staff who wish to use the repositories. You will be an expert in digital content, with a good understanding of the issues surrounding creation of digital assets, and offering online access to those assets. Ideally you will have some technical expertise in order that you can contribute to the development and/or support of the repository infrastructure. The Library is making a significant investment in the e-information environment at the University of Leeds and this post offers an exciting opportunity to shape future information provision in innovative ways.

MPAA Attorney Says Even One Personal Backup Copy of DVD is Illegal

Bart Williams, an MPAA attorney, said in a hearing about the RealNetworks v. DVD Copy Control Association case that making even a single copy of a DVD for personal use is illegal: "One copy is a violation of the DMCA."

Forty Percent of UK University Libraries to Cut Materials Budgets in 2009-10 Academic Year

The Times Higher Education reports that 40% of surveyed UK university libraries intend to cut journals and books from their materials budgets in the 2009-10 academic year, and a fifth expect to cut at least one "big deal" electronic journal package. (Thanks to Colin Steele.)

“Achieving the Full Potential of Repository Deposit Policies”

Karla Hahn has published "Achieving the Full Potential of Repository Deposit Policies" in the latest issue of Research Library Issues.

Here's an excerpt:

Editor's note: A small group of individuals with expertise on author-rights policies, the campus policy environment, National Institutes of Health (NIH) deposit processes, and digital repository services met in Washington DC on January 9, 2009, under the auspices of ARL's Public Policy and Scholarly Communication programs. The group explored opportunities, desired outcomes, and policy issues involved in developing capabilities for institutionally mediated deposit processes and content transfer between institution-based and funder-based repositories, particularly PubMed Central. Based on that discussion, the group also identified potential strategies that would lead toward creating the needed rights-management environment and repository services. This essay reflects the January 9 discussions.

Also of interest in this issue are: "Author-Rights Language in Library Content Licenses," "Digital Scholarly Communication: A Snapshot of Current Trends," and "Strategies for Supporting New Genres of Scholarship."

O’Reilly Launches Open Feedback Publishing System

O'Reilly has launched the Open Feedback Publishing System, which allows readers to comment on in-progress works.

Here's an excerpt from the announcement:

Over the last few years, traditional publishing has been moving closer to the web and learning a lot of lessons from blogs and wikis, in particular. Today we're happy to announce another small step in that direction: our first manuscript (Programming Scala) is now available for public reading and feedback as part of our Open Feedback Publishing System. The idea is simple: improve in-progress books by engaging the community in a collaborative dialog with the authors out in the open. To do this, we followed the model of the Django Book, Real World Haskell, and Mercurial: The Definitive Guide (among others) and built a system to regularly publish the whole manuscript online as HTML with a comment box under every paragraph, sidebar, figure, and table.

“An Overview of the OAI Object Reuse and Exchange Interoperability Framework” Presentation

Herbert Van de Sompel has made his recent "An Overview of the OAI Object Reuse and Exchange Interoperability Framework" presentation available on Slideshare. (Thanks to Pintiniblog.)

Implementing an Institutional Repository for Leeds Metropolitan University

Wendy Luker and Nick Sheppard have released Implementing an Institutional Repository for Leeds Metropolitan University: Final Report. The repository project was funded by JISC.

Here's an excerpt:

We are able to conclude from this project that Intrallect's intraLibrary software is extensible to a wide range of content and, in particular, adaptable to serve as an effective Open Access research repository. However, to achieve this has been a steep learning curve and the system still requires some development to be fully effective for this specific purpose. The main areas for further development are:

Continued development and refinement of the SRU search interface

Continued development work to ensure OA content is discoverable on the public web; by implementing XML site-maps and, ideally, working with Intrallect to facilitate full text indexing

Continued development work on self-archiving and/or mediated work flows—possibly utilising SWORD technology

We can also conclude that there are real issues in engaging with the academic community to promote the model of Open Access to research in its current form, at least in the short term. Procurement of full text content has followed the pattern exhibited elsewhere in the sector. A number of full text articles are available within the Repository. However, to date the bulk of contributions have been in citation format. The University Research Office is very supportive of the project, and is convinced of the potential of the Repository to raise the profile of research at the University. It is hoped that this commitment, combined with the already high profile of the Repository, will lead to higher levels of full text deposit.

U.S. Federal Government Launches Data.gov

The U.S. Federal Government has launched Data.gov.

Here's an excerpt from the home page:

The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data.

Podcast about Copyright Office Meeting on Copyright Exceptions for the Blind or Other Persons with Disabilities

Public Knowledge has released a podcast about the May 18, 2009 U.S. Copyright Office meeting on copyright exceptions for the blind or other persons with disabilities.

Mention-It Takes Open Repositories 2009 Developer Challenge Award

The Mention-It JavaScript library has won the Open Repositories 2009 Developer Challenge award. The application aggregates "'mentions' of content held within an institutional repository (or personal blog/webpage) from across the web."

Read more about it at "'Mention-It' App Takes Developer Challenge Prize at OR09."

Digital Library Jobs: Digital Projects Librarian at Truman State University

The Truman State University Pickler Memorial Library is recruiting a Digital Projects Librarian.

Here's an excerpt from the ad:

The successful candidate will build and maintain digital collections supporting the University’s mission and curriculum. Responsibilities include working with librarians and the campus community to identify opportunities for digital collection to meet information needs; identifying, evaluating, and implementing appropriate software and hardware for digital collections; identifying and implementing appropriate metadata standards; writing and maintaining policies and procedures relating to digital collections; marketing digital collections to the campus community; serving on Library and University committees; assisting at the reference desk; and other duties as assigned. This position would also have the opportunity to assist with traditional collection development.

“Enhancing the Debate on Open Access: A Joint Statement by the International Federation of Library Associations and Institutions and the International Publishers Association”

IFLA and the IPA have issued "Enhancing the Debate on Open Access: A Joint Statement by the International Federation of Library Associations and Institutions and the International Publishers Association."

Here's an excerpt:

IFLA and IPA share a common set of basic understandings and believe that the observance of the shared ground as set out below would enhance the overall debate.

IFLA and IPA value the contribution to scholarly communication that publishers and libraries have made and believe that mutual respect is important to enhance the quality of the public discourse on open access.

IFLA and IPA recognise that the concerns of academic authors must be at the heart of this debate—their scientific freedom, and their needs as researchers, teachers, authors, reviewers and users are paramount.

IFLA and IPA acknowledge that the broadest possible access to scholarly communications is an important shared objective and that potential access to all research by all researchers, irrespective of geographical location or institutional affiliation is a shared aspiration of libraries and publishers.

All assumptions surrounding open access and scholarly communications should be open to scientific scrutiny and academic debate. All stakeholders are encouraged to innovate, experiment and explore the new opportunities that technology brings.

IFLA and IPA recognise that access must be sustainable, i.e. that economic long-term viability and long-term archiving are important elements of this debate.

IFLA and IPA agree that the debate is most effective if it recognises the potential diversity of scholarly communication in different academic disciplines and different types of publications, e.g., research journals, review journals, monographs, text books, etc. IFLA and IPA support a debate that avoids general conclusions for all scholarly communication but gives a closer, differentiated focus on the potentially very different framework in various academic disciplines and types of publications.

Equally, scholarly publishers and their specific roles and functions can vary greatly. Scholarly publishing includes publishers with a variety of commercial and non-commercial affiliations and interests, outside and within the research community.

IFLA and IPA believe publishers, librarians, government and funding agencies should at this stage support innovation, experimentation and pilot schemes on access to scholarly publications. Pilot schemes should be accompanied by vigorous research and analysis that enables evaluation against measurable targets, that reflect the chief concerns of academic authors (as set out in Point 2), as the basis for an enriched, fact-oriented debate. As part of investigating the feasibility of open access, studies should also explore such matters as impact, transparency and economic models. Data should be shared openly among stakeholders or disclosed to allow open scrutiny. The results from these studies should provide better insight into the processes surrounding open access.

Library Trends Thematic Issue on the Library of Congress National Digital Information Infrastructure and Preservation Program

The latest issue of Library Trends has a series of articles on the Library of Congress National Digital Information Infrastructure and Preservation Program.

Google and University of Michigan Sign Expanded Digitization Agreement

Google and the University of Michigan have signed an expanded digitization agreement that incorporates the terms of the Google Book Search Settlement Agreement.

Here's an excerpt from the announcement:

Specifically, the agreement:

Expands the scope of Google and University of Michigan's partnership:
The University of Michigan continues its tradition of leadership in library digitization by being the first library to expand its partnership with Google under the terms of Google's settlement agreement with a broad class of authors and publishers. The principles underlying the new agreement are to ensure access to our collection, to provide a solid foundation for future research and study, and to provide the greatest public good for patrons of libraries around the US.

Broadens public access to University of Michigan's collections:
Once the settlement is approved by the court, readers and students throughout the US will enjoy the benefits of University of Michigan's collections, including free previews, the ability to buy access to University of Michigan's books online, and institutional subscriptions.

Supports shared services with other libraries:
The agreement empowers University of Michigan to broaden public access to its collection by using digital files of books that Google scans to strengthen and support initiatives like HathiTrust.

Provides greater digital access to University of Michigan's collections for students and faculty:
University of Michigan will get a digital copy of every book held in their collection, whether it's scanned from Michigan or at another library.

Broadens access to public domain books from University of Michigan's collection:
The University of Michigan will be able to share digital copies of public domain works Google has digitized from its collection with fellow academic institutions, libraries, and other organizations for non-commercial purposes. These provisions enable Michigan to share its digital library collection with students, scholars, and other library users around the world.

Subsidizes University of Michigan's Institutional Subscription:
If approved by the court, Google's agreement with authors and publishers allows it to make millions of digitized books available to colleges and universities via a subscription. Under our new agreement, Google will subsidize the cost of Michigan's subscription based on the number of books scanned from Michigan. In practice, this means that Google will subsidize the entire cost of Michigan's institutional subscription–so that Michigan's students and staff will be able to access and read almost every book Google has digitized from 29 libraries around the world, for free.

Expands access for students, faculty, and patrons with disabilities:
Google will make public domain works digitized from Michigan's print library collection accessible to users with print disabilities in the same ways as in-copyright books covered under the settlement agreement.

Safeguards the public's access to knowledge:
Michigan's agreement includes collective terms Google has committed to that can be enjoyed by any of Google's other partner libraries. Michigan is the first university to sign on to these terms, which give libraries new ways to help safeguard the public's access to these books.

Establishes a mechanism to review prices:
Our agreement gives Michigan and other participating libraries the power to review the pricing of Institutional Subscriptions to make sure that they are priced for "broad penetration," as required by the settlement agreement. That means that the reviewer will evaluate whether subscriptions are affordable enough to allow universities, libraries, and other institutions across the country to take advantage of them.

If they determine that prices are too high, University of Michigan and other participating libraries who sign these collective terms can challenge the prices through arbitration, and Google will be required to work with the Registry to adjust the pricing accordingly.

Ensures access to millions of books for generations to come:
Google has committed to make the books it has scanned publicly available for free search, consumer purchase, institutional subscriptions, and other services established by the settlement agreement. Our agreement ensures that libraries and their patrons can continue to use digital copies of the millions of books Google has scanned well into the future, even if Google goes away.

Also see the press release.

Digital Repositories Workshop: Tools and Infrastructure Presentations

Presentations from the Oxford Digital Repositories Steering Group's Digital Repositories Workshop: Tools and Infrastructure meeting are available.

Read more about it at "Report on the Digital Repositories Workshop."