DigitalPreservationEurope Publishes Report on Copyright and Privacy Issues for Cooperating Repositories

DigitalPreservationEurope has published PO3.4: Report on the Legal Framework on Repository Infrastructure Impacting on Cooperation Across Member States.

Here's an excerpt from the "Introduction."

The focus of this paper is the legal framework for the management of content of cooperating repositories. The focus will be on the regulation of copyright and protection of personal data. That copyright is important when managing data repositories is common knowledge. However, there is an increasing tendency among authors not only to deposit their published scientific work, scientific articles, dissertations or books, but also the underlying data. In addition to this ordinary publicly available sources like internet web pages contain personal data, often of a sensitive nature. Due to this emergent trend repositories will have to comply with the rules governing the use and protection of personal data, especially in the medical and social sciences.

The scenario is the following:

  • National repositories acquire material from different sources and in different formats.
  • The repositories cooperate with repositories in other countries in the preservation of data.
  • There is some degree of specialisation, some repositories specialise on preserving certain formats and other repositories on the preservation of other formats.

This paper describes the legal framework regulating the two decisive actions which have to take place if this scenario is to become a reality:

  1. The reproduction of data
  2. The transfer of data to other repositories

Other copyright issues like the rules concerning communication with the public and the protection of databases will also be touched upon.

Open-Source IRStats Released: Use Statistics for EPrints and DSpace

Eprints.org has released IRStats, an open source use statistics analysis package that analyzes both EPrints (versions 2 and 3) and DSpace (beta functionality) logs. The software is under a BSD license, and it requires Perl, awstats, MySQL, Maxmind Organisation Database, ChartDirector, and a CGI-capable Web server.

A description of IRStats features is available as well as examples of its use. For additional information on the project, see "Introduction to IRS."

DSpace 1.5 Alpha Released

The 1.5 alpha version of the popular DSpace repository software has been released.

Here's an excerpt from "DSpace 1.5 Alpha with Experimental Binary Distribution" by Richard Jones:

There are big changes in this code base, both in terms of functionality and organisation. First, we are now using Maven to manage our build process, and have carved the application into a set of core modules which can be used to assemble your desired DSpace instance. . . .

The second big and most exciting thing is that Manakin is now part of our standard distribution, and we want to see it taking over from the JSP UI over the next few major releases. . . .

In addition to this, we have an Event System which should help us start to decouple tightly integrated parts of the repository. . . . Browsing is now done with a heavily configurable system . . . . Tim Donohue's much desired Configurable Submission system is now integrated with both JSP and Manakin interfaces and is part of the release too.

Further to this we have a bunch of other functionality including: IP Authentication, better metadata and schema registry import, move items from one collection to another, metadata export, configurable multilingualism support, Google and html sitemap generator, Community and Sub-Communities as OAI Sets, and Item metadata in XHTML head <meta> elements.

RUBRIC Toolkit: Institutional Repository Solutions Released

The RUBRIC Project has released the RUBRIC Toolkit: Institutional Repository Solutions.

Here's an excerpt from the RUBRIC Toolkit: About the RUBRIC Project and the Toolkit page:

The RUBRIC Toolkit is a legacy of the RUBRIC Project, reflecting the discussions, investigation, phases, processes, issues and experiences surrounding the implementation of an Institutional Repository (IR). The sections are based on the collaborative experience of the eight Australian and New Zealand Universities involved in the project.

The content for the RUBRIC Toolkit developed organically and collaboratively in the project wiki over an extended period of time. It was then refined and developed. Project members have populated the Toolkit with useful resources and tools that can be used by other Project Managers and Institutions implementing an IR.

The RUBRIC Toolkit was released in October 2007 and will continue to be updated until the end of the RUBRIC Project in December 2007. As such the Toolkit captures the "best" of available advice, experience and outcomes available for IR development in 2007 and provides links to further reading wherever possible.

Muradora 1.0, a Fedora Front-End, Released

DRAMA (Digital Repository Authorization Middleware Architecture) has released Muradora 1.0, a Fedora front-end that provides identity control (via Shibboleth), authorization (via XACML), and other functions. DRAMA is a sub-project of RAMP (Research Activityflow and Middleware Priorities Project). A Live DVD image simplifies installation.

Here’s an excerpt from the fedora-commons-users posting:

  • "Out-of-the-box" or customized deployment options
  • Intuitive access control editor allows end-users to specify their own access control criteria without editing any XML.
  • Hierarchical enforcement of access control policies. Access control can be set at the collection level, object level or datastream level.
  • Metadata input and validation for any well-formed metadata schema using XForms (a W3C standard). New metadata schemas can be supported via XForms scripts (no Muradora code modification required).
  • Flexible and extensible architecture based on the well known Java Spring enterprise framework.
  • Multiple deployments of Muradora (each customized for their own specific purpose) can talk to the one instance of Fedora.
  • Freely available as open source software (Apache 2 license). All dependent software is also open source.

Irish Virtual Research Library and Archive Repository Launched

University College Dublin has launched the Irish Virtual Research Library and Archive Repository.

Here's an excerpt from the press release:

IVRLA is a digital archive containing a number of digitised collections from UCD’s holdings, of use and interest to Irish humanities researchers. The IVRLA has developed a sophisticated interface enabling users to browse, search, tag and cite digital objects and view or download them in a variety of file formats. This interface sits on top of an open source repository architecture that functions as the IVRLA’s base content store. An elaborate collection model has been developed ensuring all content is viewed within context and structure. This model is particularly suited for organic primary source collections and enables hierarchy and sub-division in how objects are arranged and held within collections.

Peter Murray-Rust Presentation on the Scientific E-Thesis

Peter Murray-Rust's presentation at Caltech on "The Power of the Scientific eThesis" is now available. (You may be asked to install an ActiveX control by MediaSite; you can run the presentation without it.)

Source: Smart, Laura J. "Peter Murray-Rust at Caltech." Repositories for the Rest of Us, 7 September 2007.

AONS: Scanning Repositories for Obsolete Digital Formats

The APSR AONS II project has released a beta version of the Automatic Obsolescence Notification System (AONS).

Here's an excerpt from the announcement on apsr_announcements:

Users can register with the service by providing a URL to a repository's format scan summary. The AONS service will display the summary and allow a repository manager to compare the formats of items in their repository with information from format registries such as PRONOM and Library of Congress. These registries flag any formats that are likely to become obsolete. Repository managers can then make curation decisions about any items at risk, such as upgrading their formats.

By downloading and installing an AONS locally, an institution can also take advantage of a pilot risk metrics implementation. . . .

The AONS software is the result of the AONS II project funded under APSR and developed by David Pearson, David Levy and Matthew Walker from the National Library of Australia (NLA) with an administrative user interface developed by David Berriman at ANU.

The software is able to be downloaded from Sourceforge at http://sourceforge.net/projects/aons and a mailing list is also available for support and feedback. As this is a beta release we welcome feedback to the Sourceforge mailing list to inform our testing which will continue until mid-September.

Please try out the pilot service by sending an email to cosi@apsr.edu.au to register with the service, and tell us which institution you are from. . . .
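The comparison step described in the announcement (matching a repository's format scan summary against registry flags) can be pictured with a minimal Python sketch. This is not the AONS code or its data model: the function name `find_at_risk`, the dictionary shapes, and the placeholder format names are all assumptions made for illustration.

```python
def find_at_risk(scan_summary: dict[str, int], registry_flags: dict[str, bool]) -> list[tuple[str, int]]:
    """Return (format, item_count) pairs for formats a registry flags as at risk.

    scan_summary maps a format name to the number of items held in that format.
    registry_flags maps a format name to True when the registry flags it as
    likely to become obsolete. Both shapes are hypothetical, not the AONS schema.
    """
    flagged = [
        (fmt, count)
        for fmt, count in scan_summary.items()
        if registry_flags.get(fmt, False)  # formats unknown to the registry are skipped
    ]
    # Largest holdings first, so the biggest curation tasks surface at the top.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Placeholder data: the format names and counts are invented for this example.
scan_summary = {"format-a": 120, "format-b": 45, "format-c": 3}
registry_flags = {"format-a": False, "format-b": True, "format-c": True}

at_risk = find_at_risk(scan_summary, registry_flags)
print(at_risk)
```

A repository manager would then review the returned list and decide, item by item, whether to migrate the flagged formats.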

University of Minnesota Launches the Digital Conservancy

The University of Minnesota has launched its institutional repository, the Digital Conservancy. It utilizes DSpace.

Here's a description from the University Digital Conservancy FAQ page:

The University Digital Conservancy is a program of the University of Minnesota, administered by the University Libraries. The program provides stewardship, reliable long-term open access, and broad dissemination of the digital scholarly and administrative works of University of Minnesota faculty, departments, centers and offices. Materials in the Conservancy are freely available online to the University community and to the public.

Institutional Repositories: DOA?

Of late, an air of discouragement has begun to permeate discussions about institutional repositories. Of course, this is understandable. E-print deposit rates have been disappointing, deposit mandates hard to come by, and real operational costs have been higher than some imagined.

Are institutional repositories dead on arrival?

The answer is determined by our expectations.

If we expect swift, easy, rapid progress with university administrators and faculty enthusiastically rallying behind institutional repositories, the answer is "yes." The thrill of putting up the repository software and seeing the initial inflow of e-prints is, for many, gone; the experiment has failed; and it's time to cut our losses and move on.

On the other hand, if we expect that the establishment of fully functional institutional repositories will be a complex, lengthy, and expensive venture, we are on target, and remarkable progress has been made worldwide in a short period of time.

I'm in the latter camp. I cannot say this enough: successful institutional repositories are not primarily determined by technical factors; rather, they are determined by attitudinal factors. In other words, faculty, especially key faculty such as holders of endowed chairs and journal editors, and university administrators, especially provosts and presidents, must be convinced that institutional repositories are essential infrastructure for the 21st century. For the most part, the argument rests on the scholarly communication crisis theme, with institutional repositories portrayed as part of the remedy. However, institutional prestige, institutional visibility, and improved citation impact factors are important themes as well. The successful, relentless communication of these themes to key constituencies is essential to the successful establishment of institutional repositories.

In my view, the best strategy for an institution without a repository is to start a vigorous scholarly communication outreach program first. The next best strategy is to do so in parallel with putting up an institutional repository. Next is to implement a scholarly communication program after the repository is up. The worst strategy is to put up a repository with no scholarly communication program—this is a recipe for failure.

So, chin up. It will take slow, steady effort to succeed, but it will be worth it in the end.

Institutional Repositories: Staff and Skills Requirements

SHERPA has released Institutional Repositories: Staff and Skills Requirements.

Here’s an excerpt from the document:

This document began in response to requests received by the core SHERPA team for examples of job descriptions for repository posts. Its development has been greatly assisted by contributions from the SHERPA partners and UKCORR members.

This document will be revised annually (July/August) to reflect changing needs and requirements. Input from the repository community will be sought at this time.

Fedora Commons Website Launches

The Fedora Commons Website has gone live.

Here's an excerpt from the About Fedora Commons page:

Fedora Commons is a non-profit organization providing sustainable technologies to create, manage, publish, share and preserve digital content as a basis for intellectual, organizational, scientific and cultural heritage by bringing two communities together.

Communities of practice that include scholars, artists, educators, Web innovators, publishers, scientists, librarians, archivists, publishers, records managers, museum curators or anyone who presents, accesses, or preserves digital content.

Software developers who work on the cutting edge of open source Web and enterprise content technologies to ensure that collaboratively created knowledge is available now and in the future.

Fedora Commons is the home of the unique Fedora open source software, a robust integrated repository-centered platform that enables the storage, access and management of virtually any kind of digital content.

Here's an excerpt from the press release about the Gordon and Betty Moore Foundation grant that helps fund the Fedora Commons:

Fedora Commons today announced the award of a four year, $4.9M grant from the Gordon and Betty Moore Foundation to develop the organizational and technical frameworks necessary to effect revolutionary change in how scientists, scholars, museums, libraries, and educators collaborate to produce, share, and preserve their digital intellectual creations. Fedora Commons is a new non-profit organization that will continue the mission of the Fedora Project, the successful open-source software collaboration between Cornell University and the University of Virginia. The Fedora Project evolved from the Flexible Extensible Digital Object Repository Architecture (Fedora) developed by researchers at Cornell Computing and Information Science.

With this funding, Fedora Commons will foster an open community to support the development and deployment of open source software, which facilitates open collaboration and open access to scholarly, scientific, cultural, and educational materials in digital form. The software platform developed by Fedora Commons with Gordon and Betty Moore Foundation funding will support a networked model of intellectual activity, whereby scientists, scholars, teachers, and students will use the Internet to collaboratively create new ideas, and build on, annotate, and refine the ideas of their colleagues worldwide. With its roots in the Fedora open-source repository system, developed since 2001 with support from the Andrew W. Mellon Foundation, the new software will continue to focus on the integrity and longevity of the intellectual products that underlie this new form of knowledge work. The result will be an open source software platform that both enables collaborative models of information creation and sharing, and provides sustainable repositories to secure the digital materials that constitute our intellectual, scientific, and cultural history.

Berkeley Electronic Press Acquires Digital Commons IR Software

The Berkeley Electronic Press (bepress) has acquired the Digital Commons institutional repository software from ProQuest. bepress was the original creator of the software.

Here's an excerpt from the press release:

ProQuest and The Berkeley Electronic Press ("bepress") today announced that they have reached an agreement for bepress to purchase ownership of Digital Commons, the world's leading hosted institutional repository solution. Bepress will be adding sales and marketing staff and augmenting its existing customer support and services in addition to the hosting and technology services that it has always provided Digital Commons customers.

Bepress Chairman, Aaron Edlin, said "Institutional Repositories are core to the bepress mission of furthering scholarly communication and thus bepress is excited at the opportunity to build a close relationship with Digital Commons customers. Developing successful and vibrant Institutional Repositories will be bepress's central focus."

Publisher Author Agreements

According to today's SHERPA/RoMEO statistics, 36% of the 308 included publishers are green ("can archive pre-print and post-print"), 24% are blue ("can archive post-print (i.e. final draft post-refereeing)"), 11% are yellow ("can archive pre-print (i.e. pre-refereeing)"), and 28% are white ("archiving not formally supported"). Looked at another way, 72% of the publishers (all but the 28% that are white) permit some form of self-archiving.
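As a quick check on the arithmetic, here is a small Python sketch using only the rounded percentages quoted above. The variable names are illustrative. Because the published figures are rounded to whole numbers, the three permitting categories sum to 71 rather than 72, and the four categories together sum to 99 rather than 100.

```python
# Percentages as quoted above from the SHERPA/RoMEO statistics (already rounded).
shares = {"green": 36, "blue": 24, "yellow": 11, "white": 28}

# Sum of the three categories that permit some form of self-archiving.
permitting = sum(pct for color, pct in shares.items() if color != "white")

# Complement of the "white" category, which is how the 72% figure arises.
not_white = 100 - shares["white"]

# Total of all four rounded figures.
total = sum(shares.values())

print(permitting, not_white, total)  # prints: 71 72 99
```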

These are certainly encouraging statistics, and publishers who permit any form of self-archiving should be applauded; however, leaving aside Creative Commons licenses and author agreements that have been crafted by SPARC and others to promote rights retention, publishers' recently liberalized author agreements still raise issues that librarians and scholars should be aware of.

Looking deeper, there are publisher variations in terms of where e-prints can be self-archived. Typically, this might be some combination of the author's Website, institutional repository or Website, funding agency's server, or disciplinary archive. Some agreements allow deposit on any noncommercial or open access server. Restricting deposit to open access or noncommercial servers is perfectly legitimate in my view; more specific restrictions are, well, too restrictive. The problem arises when the agreement limits the author's deposit options to ones he or she doesn't have, such as only allowing deposit in an institutional repository when the author's institution doesn't have one or only allowing posting on an author's Website when the author doesn't have one.

Another issue is publisher requirements for authors to remove e-prints on publication, to modify e-prints after publication to reflect citation and publisher contact information, to replace e-prints with published versions, or to create their own versions of postprints. Low deposit rates in institutional repositories without institutional mandates suggest that anything that involves extra effort by authors is a deterrent to deposit. The above kinds of publisher requirements are likely to have equally low rates of compliance, resulting in deposited e-prints that do not conform to author agreements. To be effective, such requirements would have to be policed by publishers or digital repositories. Otherwise, they are meaningless and are best deleted from author agreements.

A final issue is retrospective deposit. We can think of the journal literature as an inverted pyramid, with the broad top being currently published articles and the bottom being the first published journal articles. The papers published since the emergence of author agreements that permit self-archiving are a significant resource; however, much of the literature precedes such agreements. The vast majority of these articles are under standard copyright transfer agreements, with publishers holding all rights. Consequently, it is very important that publishers clarify whether their relatively new self-archiving policies can be applied retroactively. Elsevier has done so:

When Elsevier changes its policies to enable greater academic use of journal materials (such as the changes several years ago in our web-posting policies) or to clarify the rights retained by journal authors, Elsevier is prepared to extend those rights retroactively with respect to articles published in journal issues produced prior to the policy change.

Elsevier is pleased to confirm that, unless explicitly noted to the contrary, all policies apply retrospectively to previously published journal content. If, after reviewing the material noted above, you have any questions about such rights, please contact Global Rights.

Unfortunately, many publishers have not clarified this issue. Under these conditions, whether authors can deposit preprints or author-created postprints hinges on whether these works are viewed as being different works from the publisher version, and, hence, owned by the authors. Although some open access advocates believe this to be the case, to my knowledge this has never been decided in a court of law. Michael Carroll, who is a professor at the Villanova University School of Law and a member of the Board of the Creative Commons, has said in an analysis of whether authors can put preprints of articles published using standard author agreements under Creative Commons licenses:

Although technically distinct, the copyrights in the pre-print and the post-print overlap. The important point to understand is that copyright grants the owner the right to control exact duplicates and versions that are "substantially similar" to the copyrighted work. (This is under U.S. law, but most other jurisdictions similarly define the scope of copyright).

A pre-print will normally be substantially similar to the post-print. Therefore, when an author transfers the exclusive rights in the work to a publisher, the author precludes herself from making copies or distributing copies of any substantially similar versions of the work as well.

Much progress has been made in the area of author agreements, but authors must still pay careful attention to the details of agreements, which vary considerably by publisher. The SHERPA/RoMEO—Publisher Copyright Policies & Self-Archiving database is a very useful and important tool and users should actively participate in refining this database; however, authors are well advised not to stop at the summary information presented here and to go to the agreement itself (if available). It would be very helpful if a set of standard author agreements that covered the major variations could be developed and put into use by the publishing industry.

Michael Keller Appointed CLIR Senior Presidential Fellow

The Council on Library and Information Resources (CLIR) has announced the appointment of Michael Keller, Stanford’s University Librarian, as CLIR Senior Presidential Fellow. Keller is also Director of Academic Information Resources, founder and publisher of HighWire Press, and publisher of the Stanford University Press.

Here’s an excerpt from the press release:

During the two-year appointment, which begins August 1, Mr. Keller will undertake a series of studies and reports for CLIR publication. His research will include examining the recommendations of recent cyberinfrastructure reports and exploring how our communities can respond to the complex environment these reports envision, including the role and function of institutional repositories, digital archives, and digital libraries. He will also compose white papers that elucidate new and emerging research methodologies, new models of scholarly publishing, the role of supercomputer centers in the evolving concept of cyberinfrastructure, and topics specific to rethinking aspects of libraries and academic life. During his tenure as fellow, he will continue to work from Stanford.

Update on the DSpace Foundation

Michele Kimpton, Executive Director of the DSpace Foundation, gave a talk about the foundation at the DSpace UK & Ireland User Group meeting in early July.

Her PowerPoint presentation is now available.

Source: Lewis, Stuart. "Presentations from Recent DSpace UK & Ireland User Group Meeting," Unilever Centre for Molecular Informatics, Cambridge—Jim Downing, 11 July 2007.

An Ecological Approach to Repository and Service Interactions

UKOLN and JISC CETIS have released An Ecological Approach to Repository and Service Interactions, Draft Version 0.9 for comment.

Here’s an excerpt from the "Not the Executive Summary" section:

This work began with the need to express something of how and why repositories and services interact. As a community we have well understood technical models and architectures that provide mechanisms for interoperability. The actual interactions that occur, however, are not widely understood and knowledge about them is not often shared. This is in part because we tend to share in the abstract through architectures and use cases, articulating interactions or connections requires an engagement with specific details. . . .

Ecology is the study of systems that are complex, dynamic, and full of interacting entities and processes. Although the nature of these interactions and processes may be highly detailed, a higher level view of them is accessible and intuitive. We think that ecology and the ecosystems it studies may offer a useful analogy to inform the task of understanding and articulating the interactions between users, repositories, and services and the information environments in which they take place. This report outlines some concepts from ecology that may be useful and suggests some definitions for a common conversation about the use of this metaphor.

We hope that this report suggests an additional way to conceptualise and analyse interactions and provide a common vocabulary for an ecological approach. It should as a minimum provoke and support some useful discussions about networks and communities.

Curation of Scientific Data: Challenges for Institutions and Their Repositories Podcast

A podcast of Chris Rusbridge’s "Curation of Scientific Data: Challenges for Institutions and their Repositories" presentation at The Adaptable Repository conference is now available. Rusbridge is Director of the Digital Curation Centre in the UK.

The PowerPoint for the presentation is also available.