PEER Behavioural Research: Authors and Users vis-à-vis Journals and Repositories; Baseline Report

The Publishing and the Ecology of European Research (PEER) project has released PEER Behavioural Research: Authors and Users vis-à-vis Journals and Repositories; Baseline Report.

Here's an excerpt from the press release:

The PEER Behavioural Research Team from Loughborough University (Department of Information Science & LISU) has completed its behavioural baseline report, which is based on an electronic survey of authors (and authors as users) with more than 3000 European researchers and a series of focus groups covering the Medical sciences; Social sciences, humanities & arts; Life sciences; and Physical sciences & mathematics. The objectives of the Behavioural Research within PEER are to:

  • Track trends and explain patterns of author and user behaviour in the context of so called Green Open Access.
  • Understand the role repositories play for authors in the context of journal publishing.
  • Understand the role repositories play for users in context of accessing journal articles.

The baseline report outlines findings from the first phase of the research and identifies the key themes to emerge. It also identifies priorities for further analysis and future work. Some interesting points to emerge from the first phase of research that may be of interest to a number of stakeholders in the scholarly communication system include:

  • An individual's attitude towards open access repositories may change dependant on whether they are an author or a reader; readers being interested in the quality of the articles but authors also focused on the reputation of the repository itself
  • Reaching the target audience is the overwhelming motivation for scholars to disseminate their research results and this strongly influences their choice of journal and/or repository
  • Researchers in certain disciplines may lack confidence in making preprints available, and to some extent this is not only a matter of confidence in the quality of a text but also due to differences in work organisation across research cultures (e.g. strong internal peer review of manuscripts versus reliance on journals for peer review). Other factors are likely to include career stage and centrality of research to the parent discipline
  • Value-added services, such as download statistics and alert services, would contribute to the perceived usefulness of repositories and could help them gain popularity in what is an increasingly competitive information landscape
  • Readers often need to go through a variety of processes to access all the articles that they require and widespread open access may reduce the need for this time consuming practice.

Cornell Establishes Collaborative Business Model for arXiv Repository

The Cornell University Library has established a collaborative business model for the arXiv repository.

Here's an excerpt from the press release:

arXiv will remain free for readers and submitters, but the Library has established a voluntary, collaborative business model to engage institutions that benefit most from arXiv.

"Keeping an open-access resource like arXiv sustainable means not only covering its costs, but also continuing to enhance its value, and that kind of financial commitment is beyond a single institution's resources," said Oya Rieger, Associate University Librarian for Information Technologies. "If a case can be made for any repository being community-supported, arXiv has to be at the top of the list."

The 200 institutions that use arXiv most heavily account for more than 75 percent of institutional downloads. Cornell is asking these institutions for financial support in the form of annual contributions, and most of the top 25 have already committed to helping arXiv.

Institutions that have already pledged support include:

  • California Institute of Technology
  • University of California, Berkeley
  • University of Cambridge (UK)
  • CERN – European Organization for Nuclear Research (Switzerland)
  • CNRS – Centre National de la Recherche Scientifique (France)
  • Columbia University
  • DESY – Deutsches Elektronen-Synchrotron (Germany)
  • Durham University (UK)
  • ETH Zurich – Eidgenössische Technische Hochschule Zürich (Switzerland)
  • Fermilab
  • Harvard University
  • University of Illinois at Urbana-Champaign
  • Imperial College London (UK)
  • Los Alamos National Laboratory
  • Massachusetts Institute of Technology
  • Max Planck Society (Germany)
  • University of Michigan
  • University of Oxford (UK)
  • University of Pennsylvania
  • Princeton University
  • SLAC National Accelerator Laboratory
  • Texas A&M University . . .

The proposed funding model is viewed as a short-term strategy, and the Library is actively seeking input on a long-term solution. Currently, Cornell University Library supports the operating costs of arXiv, which are comparable to the costs of the university's collection budget for physics and astronomy. As one of the most influential innovations in scholarly communications since the advent of the Internet, arXiv's original dissemination model represented the first significant means to provide expedited access to scientific research well ahead of formal publication.

Paul Ginsparg Gets $882,610 Grant for arXiv Enhancement

Paul Ginsparg, professor of physics and information science at Cornell University, has been awarded an $882,610 grant by the NSF for the Tools for Open Access Cyberinfrastructure project, which will enhance the popular arXiv repository. The grant was funded under the American Recovery and Reinvestment Act of 2009.

Here's an excerpt from the grant award:

This project proposes to investigate and implement a variety of tools for enhancing the very widely used and popular Arxiv.org infrastructure, based on information filters for assisted service discovery and selection, text-mining, information genealogy, automated classification and identification of composite resources, data-mining, usage analyses, matching and ranking heuristics, support for next-generation document formats, and semantic markup.

Read more about it at "Stimulus Grant to Enhance arXiv E-Preprints for Scientists."

Armbruster and Romary Compare Four Repository Types

Chris Armbruster and Laurent Romary have self-archived "Comparing Repository Types: Challenges and Barriers for Subject-Based Repositories, Research Repositories, National Repository Systems and Institutional Repositories in Serving Scholarly Communication" in SSRN.

Here's an excerpt:

Four types of publication repository may be distinguished, namely the subject-based repository, research repository, national repository system and institutional repository.

Two important shifts in the role of repositories may be noted. With regard to content, a well-defined and high quality corpus is essential. This implies that repository services are likely to be most successful when constructed with the user and reader uppermost in mind. With regard to service, high value to specific scholarly communities is essential. This implies that repositories are likely to be most useful to scholars when they offer dedicated services supporting the production of new knowledge.

Along these lines, challenges and barriers to repository development may be identified in three key dimensions: a) identification and deposit of content; b) access and use of services; and c) preservation of content and sustainability of service. An indicative comparison of challenges and barriers in some major world regions such as Europe, North America and East Asia plus Australia is offered in conclusion.

"Worldwide Use and Impact of the NASA Astrophysics Data System Digital Library"

Michael J. Kurtz et al. have self-archived "Worldwide Use and Impact of the NASA Astrophysics Data System Digital Library" in arXiv.org.

Here's the abstract:

By combining data from the text, citation, and reference databases with data from the ADS readership logs we have been able to create Second Order Bibliometric Operators, a customizable class of collaborative filters which permits substantially improved accuracy in literature queries. Using the ADS usage logs along with membership statistics from the International Astronomical Union and data on the population and gross domestic product (GDP) we develop an accurate model for world-wide basic research where the number of scientists in a country is proportional to the GDP of that country, and the amount of basic research done by a country is proportional to the number of scientists in that country times that country's per capita GDP.

We introduce the concept of utility time to measure the impact of the ADS/URANIA and the electronic astronomical library on astronomical research. We find that in 2002 it amounted to the equivalent of 736 FTE researchers, or $250 Million, or the astronomical research done in France.

Subject headings: digital libraries; bibliometrics; sociology of science; information retrieval
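
The scaling model described in the abstract can be written out as a short Python sketch. The abstract gives only the proportionalities, so the function name and the two constants below are hypothetical placeholders, not values from the paper.

```python
def relative_research(gdp, population, k_scientists=1.0, k_research=1.0):
    """Relative basic-research output under the proportionalities in the abstract.

    k_scientists and k_research are hypothetical constants of
    proportionality; the abstract does not state them.
    """
    scientists = k_scientists * gdp       # scientists proportional to GDP
    per_capita_gdp = gdp / population     # GDP per person
    # research proportional to scientists times per capita GDP
    return k_research * scientists * per_capita_gdp


# With population held fixed, doubling GDP quadruples the modelled output,
# because the two proportionalities combine to GDP squared over population.
base = relative_research(gdp=1.0, population=1.0)
doubled = relative_research(gdp=2.0, population=1.0)
print(doubled / base)
```

Only ratios between countries are meaningful here, since the constants are unknown.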

"A Taxonomy of Articles in PubMed Central"

In "A Taxonomy of Articles in PubMed Central," Jim Till examines the open access characteristics of articles deposited in PubMed Central that were published between April 7, 2008 and August 7, 2008.

Here's an excerpt:

Summary: The total number of articles published in the 4-month interval (April 7 to August 7, 2008) and contributed to PMC was 23960. The four subtypes of articles in PMC, and their estimated proportions during this 4-month interval, are: 1) Author manuscripts that are publicly accessible (7346/23960=30.7%); 2) Articles that are embargoed (378/23960=1.6%); 3) Articles that are Libre OA (3635/23960=15.2%); 4) Other articles that are publicly accessible, via Gratis OA (12601/23960=52.5%). These proportions are probably not very different for the subset of NIH-supported articles, if it's assumed that, during this 4-month interval, about 50-60% of the articles contributed to PMC were NIH-supported.
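
The proportions in the excerpt can be rechecked with a few lines of Python. This is only a sketch: the counts and the total come from the excerpt, while the dictionary keys and variable names are my own labels.

```python
# Counts per article subtype, as quoted in the excerpt.
counts = {
    "author manuscripts (publicly accessible)": 7346,
    "embargoed": 378,
    "Libre OA": 3635,
    "Gratis OA (other publicly accessible)": 12601,
}
total = 23960  # total number of articles given in the excerpt

# The four subtypes should account for every article.
assert sum(counts.values()) == total

# Share of each subtype as a percentage of the total.
shares = {name: 100 * n / total for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The four counts do add up to the stated total. At one decimal place the last share comes out nearer 52.6% than the quoted 52.5%, a difference that looks like nothing more than rounding.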

"Positional Effects on Citation and Readership in arXiv"

Asif-ul Haque and Paul Ginsparg have self-archived "Positional Effects on Citation and Readership in arXiv" in arXiv.org.

Here's an excerpt:

arXiv.org mediates contact with the literature for entire scholarly communities, both through provision of archival access and through daily email and web announcements of new materials, potentially many screenlengths long. We confirm and extend a surprising correlation between article position in these initial announcements, ordered by submission time, and later citation impact, due primarily to intentional "self-promotion" on the part of authors. A pure "visibility" effect was also present: the subset of articles accidentally in early positions fared measurably better in the long-term citation record than those lower down. Astrophysics articles announced in position 1, for example, overall received a median number of citations 83% higher, while those there accidentally had a 44% visibility boost. For two large subcommunities of theoretical high energy physics, hep-th and hep-ph articles announced in position 1 had median numbers of citations 50% and 100% larger than for positions 5-15, and the subsets there accidentally had visibility boosts of 38% and 71%.

We also consider the positional effects on early readership. The median numbers of early full text downloads for astro-ph, hep-th, and hep-ph articles announced in position 1 were 82%, 61%, and 58% higher than for lower positions, respectively, and those there accidentally had medians visibility-boosted by 53%, 44%, and 46%. Finally, we correlate a variety of readership features with long-term citations, using machine learning methods, thereby extending previous results on the predictive power of early readership in a broader context. We conclude with some observations on impact metrics and dangers of recommender mechanisms.

Overlay Journal Infrastructure for Meteorological Sciences (OJIMS): Final Report

JISC has released the Overlay Journal Infrastructure for Meteorological Sciences (OJIMS): Final Report.

Here's an excerpt:

The Overlay Journal Infrastructure for Meteorological Sciences (OJIMS) project developed the mechanisms that could support both a new on-line Journal of Meteorological Data and an Open-Access Repository for documents related to the meteorological sciences. The project had three fundamental aims:

  • Creation of overlay journal mechanics.
  • Creation of an open access subject based repository for Meteorology and atmospheric sciences.
  • Construction and evaluation of business models for potential overlay journals. . . .

The proposal for the Journal of Meteorological Data is that it would be an on-line, peer-reviewed data journal. It would extend the scientific discipline of peer review to data, providing recognition for the work of creating data. The rigorous, but manageable, standards for metadata and documentation prescribed will facilitate re-use of the data, encourage appropriate application of the data to scientific problems and enable experiments to be repeated. A review process was proposed which encompasses three elements: a data description document, metadata and the data themselves. All three elements would be reviewed, but citation would be of the text article.

“Beyond Institutional Repositories”

Laurent Romary and Chris Armbruster have self-archived "Beyond Institutional Repositories" in SSRN.

Here's an excerpt:

The current system of so-called institutional repositories, even if it has been a sensible response at an earlier stage, may not answer the needs of the scholarly community, scientific communication and accompanied stakeholders in a sustainable way. However, having a robust repository infrastructure is essential to academic work. Yet, current institutional solutions, even when networked in a country or across Europe, have largely failed to deliver. Consequently, a new path for a more robust infrastructure and larger repositories is explored to create superior services that support the academy. A future organization of publication repositories is advocated that is based upon macroscopic academic settings providing a critical mass of interest as well as organizational coherence. Such a macro-unit may be geographical (a coherent national scheme), institutional (a large research organization or a consortium thereof) or thematic (a specific research field organizing itself in the domain of publication repositories).

The argument proceeds as follows: firstly, while institutional open access mandates have brought some content into open access, the important mandates are those of the funders and these are best supported by a single infrastructure and large repositories, which incidentally enhances the value of the collection (while a transfer to institutional repositories would diminish the value). Secondly, we compare and contrast a system based on central research publication repositories with the notion of a network of institutional repositories to illustrate that across central dimensions of any repository solution the institutional model is more cumbersome and less likely to achieve a high level of service. Next, three key functions of publication repositories are reconsidered, namely a) the fast and wide dissemination of results; b) the preservation of the record; and c) digital curation for dissemination and preservation. Fourth, repositories and their ecologies are explored with the overriding aim of enhancing content and enhancing usage. Fifth, a target scheme is sketched, including some examples. In closing, a look at the evolutionary road ahead is offered.

New Report Says Less Than 50% of Publishers Permit Self-Archiving in Disciplinary Archives

A new report from the Publishing Research Consortium, Journal Authors' Rights: Perception and Reality, says that less than 10% of publishers permit self-archiving of the publisher PDF file in any repository and less than 50% permit deposit of the submitted and the accepted article version in a disciplinary archive.

Here's an excerpt:

However, when it comes to self-archiving, although 80% or more allow self-archiving to a personal or departmental website, over 60% to an institutional repository, and over 40% to a subject repository, in most cases this is only permitted for the submitted and/or accepted version; use of the final, published version for self-archiving is very much more restricted.

ARL Report: Current Models of Digital Scholarly Communication

The Association of Research Libraries has released Current Models of Digital Scholarly Communication by Nancy L. Maron and K. Kirby Smith, plus a database of associated examples.

Here's an excerpt from the press release:

In the spring of 2008, ARL engaged Ithaka’s Strategic Services Group to conduct an investigation into the range of online resources valued by scholars, paying special attention to those projects that are pushing beyond the boundaries of traditional formats and are considered innovative by the faculty who use them. The networked digital environment has enabled the creation of many new kinds of works, and many of these resources have become essential tools for scholars conducting research, building scholarly networks, and disseminating their ideas and work, but the decentralized distribution of these new-model works has made it difficult to fully appreciate their scope and number.

Ithaka’s findings are based on a collection of resources identified by a volunteer field team of over 300 librarians at 46 academic institutions in the US and Canada. Field librarians talked with faculty members on their campuses about the digital scholarly resources they find most useful and reported the works they identified. The authors evaluated each resource gathered by the field team and conducted interviews of project leaders of 11 representative resources. Ultimately, 206 unique digital resources spanning eight formats were identified that met the study’s criteria.

The study’s innovative qualitative approach yielded a rich cross-section of today’s state of the art in digital scholarly resources. The report profiles each of the eight genres of resources, including discussion of how and why the faculty members reported using the resources for their work, how content is selected for the site, and what financial sustainability strategies the resources are employing. Each section draws from the in-depth interviews to provide illustrative anecdotes and representative examples.

Highlights from the study’s findings include:

  • While some disciplines seem to lend themselves to certain formats of digital resource more than others, examples of innovative resources can be found across the humanities, social sciences, and scientific/technical/medical subject areas.

  • Of all the resources suggested by faculty, almost every one that contained an original scholarly work operates under some form of peer review or editorial oversight.

  • Some of the resources with greatest impact are those that have been around a long while.

  • While some resources serve very large audiences, many digital publications—capable of running on relatively small budgets—are tailored to small, niche audiences.

  • Innovations relating to multimedia content and Web 2.0 functionality appear in some cases to blur the lines between resource types.

  • Projects of all sizes—especially open-access sites and publications—employ a range of support strategies in the search for financial sustainability.

"Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web"

Duncan Hull, Steve R. Pettifer, and Douglas B. Kell have published "Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web" in PLoS Computational Biology.

Here's the abstract:

Many scientists now manage the bulk of their bibliographic information electronically, thereby organizing their publications and citation material from digital libraries. However, a library has been described as 'thought in cold storage,' and unfortunately many digital libraries can be cold, impersonal, isolated, and inaccessible places. In this Review, we discuss the current chilly state of digital libraries for the computational biologist, including PubMed, IEEE Xplore, the ACM digital library, ISI Web of Knowledge, Scopus, Citeseer, arXiv, DBLP, and Google Scholar. We illustrate the current process of using these libraries with a typical workflow, and highlight problems with managing data and metadata using URIs. We then examine a range of new applications such as Zotero, Mendeley, Mekentosj Papers, MyNCBI, CiteULike, Connotea, and HubMed that exploit the Web to make these digital libraries more personal, sociable, integrated, and accessible places. We conclude with how these applications may begin to help achieve a digital defrost, and discuss some of the issues that will help or hinder this in terms of making libraries on the Web warmer places in the future, becoming resources that are considerably more useful to both humans and machines.

Author's Rights, Tout de Suite

Author's Rights, Tout de Suite, the latest Digital Scholarship publication, is designed to give journal article authors a quick introduction to key aspects of author's rights and to foster further exploration of this topic through liberal use of relevant references to online documents and links to pertinent Web sites.

It is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License, and it can be freely used for any noncommercial purpose, including derivative works, in accordance with the license.

The prior publication in the Tout de Suite series, Institutional Repositories, Tout de Suite, is also available.

A Look at the Development and Future of Scholarly Communication in High Energy Physics

Robert Aymar, Director-General of CERN, has deposited an e-print of "Scholarly Communication in High-Energy Physics: Past, Present and Future Innovations" in the CERN Document Server.

Here's an excerpt from the abstract:

Unprecedented technological advancements have radically changed the way we communicate and, at the same time, are effectively transforming science into e-Science. In turn, this transformation calls for an evolution in scholarly communication. This review describes several innovations, spanning the last decades of scholarly communication in High Energy Physics: the first repositories, their interaction with peer-reviewed journals, a proposed model for Open Access publishing and a next-generation repository for the field.

Of particular interest is his description of the INSPIRE Project, "a fully integrated HEP information platform for the future," that will have "text- and data-mining applications, citation analysis and other tools, and Web 2.0 features."

For further information about INSPIRE, see "Information Systems in HEP get INSPIREd" and the INSPIRE Wiki.

Tools Allow Users to Create Automatically Updated Lists from Research Papers in Economics Database

Research Papers in Economics (RePEc) offers two tools that allow users to create lists from its database: (1) a reading list tool (e.g., Socio-Economics of Fisheries and Aquaculture), and (2) a customized publication compilations tool (e.g., University of Connecticut Economics PhD Alumni). Reading lists are automatically updated each week; publication compilations are automatically updated each month.

Read more about it at "Using RePEc for Syllabi, Bibliographies and Publication Lists."

Microsoft Releases Beta of Article Authoring Add-in for Microsoft Office Word 2007

Microsoft has released a beta version of its Article Authoring Add-in for Microsoft Office Word 2007.

Here's an excerpt from the product's home page:

Beta 1 of Word add-in to enhance the authoring of scientific and technical articles, including support for the National Library of Medicine XML format. This Beta 1 release enables reading and writing of XML-based documents in the format used by the National Library of Medicine for archiving scientific articles.

Repository Interface for Overlaid Journal Archives: Results from an Online Questionnaire Survey

The RIOJA project has released Repository Interface for Overlaid Journal Archives: Results from an Online Questionnaire Survey.

Here's an excerpt from the "Introduction":

The Repository Interface for Overlaid Journal Archives (RIOJA) project (http://www.ucl.ac.uk/ls/rioja) is an international partnership of members of academic staff, librarians and technologists from UCL (University College London), the University of Cambridge, the University of Glasgow, Imperial College London and Cornell University. It aims to address some of the issues around the development and implementation of a new publishing model, that of the overlay journal – defined, for the purposes of the project, as a quality-assured journal whose content is deposited to and resides in one or more open access repositories. The project is funded by the Joint Information Systems Committee (JISC, http://www.jisc.ac.uk/) and runs from April 2007 to June 2008.

The RIOJA project will create an interoperability toolkit to enable the overlay of certification onto papers housed in subject repositories. The intention is that the tool will be generic, helping any repository to realise its potential to act as a more complete scholarly resource. The project will also create a demonstrator overlay journal, using the arXiv repository and OJS software, with interaction between the two facilitated by the RIOJA toolkit.

To inform and shape the project, a survey of Astrophysics and Cosmology researchers has been conducted. The findings from that survey form the basis of this report.

The project team will also undertake formal and informal discussion with publishers and with academic and managing members of editorial boards. The survey and supplementary discussions will help to ensure that the RIOJA outputs address the needs and expectations of the research community. Finally, the overall long-term sustainability of a repository-overlay journal will be assessed. The project will examine the costs of adding peer review to arXiv deposits, of implementing and maintaining the functionality which the survey shows to be most valued by researchers, and of providing long-term preservation of content, and will aim to identify and appraise possible cost-recovery business models.

67 Plagiarized Papers from Turkey Removed from arXiv

The arXiv archive has removed 67 plagiarized papers, which were written by 15 Turkish physicists. Questions about the physics expertise of two of the authors emerged during their oral dissertation defenses, and the investigation widened from there.

Source: “Turkish Professors Uncover Plagiarism in Papers Posted on Physics Server.” The Chronicle of Higher Education News Blog, 6 September 2007.

AONS: Scanning Repositories for Obsolete Digital Formats

The APSR AONS II project has released a beta version of the Automatic Obsolescence Notification System (AONS).

Here's an excerpt from the announcement on apsr_announcements:

Users can register with the service by providing a URL to a repository's format scan summary. The AONS service will display the summary and allow a repository manager to compare the formats of items in their repository with information from format registries such as PRONOM and Library of Congress. These registries flag any formats that are likely to become obsolete. Repository managers can then make curation decisions about any items at risk, such as upgrading their formats.

By downloading and installing an AONS locally, an institution can also take advantage of a pilot risk metrics implementation. . . .

The AONS software is the result of the AONS II project funded under APSR and developed by David Pearson, David Levy and Matthew Walker from the National Library of Australia (NLA) with an administrative user interface developed by David Berriman at ANU.

The software is able to be downloaded from Sourceforge at http://sourceforge.net/projects/aons and a mailing list is also available for support and feedback. As this is a beta release we welcome feedback to the Sourceforge mailing list to inform our testing which will continue until mid-September.

Please try out the pilot service by sending an email to cosi@apsr.edu.au to register with the service, and tell us which institution you are from. . . .
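
The comparison step described in the announcement (checking a repository's format scan summary against formats that a registry has flagged) might look roughly like the Python sketch below. This is not the AONS code and it does not query PRONOM or the Library of Congress; the format names, item counts, and function name are all invented for illustration.

```python
# Hypothetical format scan summary: format name -> number of repository items.
scan_summary = {"PDF 1.4": 1200, "WordPerfect 5.1": 37, "TIFF 6.0": 410}

# Hypothetical set of formats that a registry has flagged as at risk.
flagged_by_registry = {"WordPerfect 5.1", "Lotus 1-2-3"}


def items_at_risk(summary, flagged):
    """Return {format: item_count} for repository formats the registry flags,
    ordered from the largest item count to the smallest."""
    at_risk = {fmt: n for fmt, n in summary.items() if fmt in flagged}
    return dict(sorted(at_risk.items(), key=lambda kv: kv[1], reverse=True))


report = items_at_risk(scan_summary, flagged_by_registry)
print(report)
```

A repository manager could then use such a report to decide which items to migrate first.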

Publisher Author Agreements

According to today's SHERPA/RoMEO statistics, 36% of the 308 included publishers are green ("can archive pre-print and post-print"), 24% are blue ("can archive post-print (i.e. final draft post-refereeing)"), 11% are yellow ("can archive pre-print (i.e. pre-refereeing)"), and 28% are white ("archiving not formally supported"). Looked at another way, 72% of the publishers permit some form of self-archiving.
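
As a quick check of the arithmetic, here is a small Python sketch. The percentages are the ones quoted above; the variable names are my own.

```python
# Percentages as quoted from the SHERPA/RoMEO statistics.
colours = {"green": 36, "blue": 24, "yellow": 11, "white": 28}

listed_total = sum(colours.values())              # the four rounded figures
permit_by_sum = listed_total - colours["white"]   # green + blue + yellow
permit_by_complement = 100 - colours["white"]     # everything that is not white

print(listed_total, permit_by_sum, permit_by_complement)
```

The four rounded figures total 99 rather than 100, so adding green, blue, and yellow gives 71, while taking the complement of white gives the 72 cited above. The one-point gap is consistent with rounding in the published percentages.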

These are certainly encouraging statistics, and publishers who permit any form of self-archiving should be applauded; however, leaving aside Creative Commons licenses and author agreements that have been crafted by SPARC and others to promote rights retention, publishers' recently liberalized author agreements still raise issues that librarians and scholars should be aware of.

Looking deeper, there are publisher variations in terms of where e-prints can be self-archived. Typically, this might be some combination of the author's Website, institutional repository or Website, funding agency's server, or disciplinary archive. Some agreements allow deposit on any noncommercial or open access server. Restricting deposit to open access or noncommercial servers is perfectly legitimate in my view; more specific restrictions are, well, too restrictive. The problem arises when the agreement limits the author's deposit options to ones he or she doesn't have, such as only allowing deposit in an institutional repository when the author's institution doesn't have one or only allowing posting on an author's Website when the author doesn't have one.

Another issue is publisher requirements for authors to remove e-prints on publication, to modify e-prints after publication to reflect citation and publisher contact information, to replace e-prints with published versions, or to create their own versions of postprints. Low deposit rates in institutional repositories without institutional mandates suggest that anything that involves extra effort by authors is a deterrent to deposit. The above kinds of publisher requirements are likely to have equally low rates of compliance, resulting in deposited e-prints that do not conform to author agreements. To be effective, such requirements would have to be policed by publishers or digital repositories. Otherwise, they are meaningless and are best deleted from author agreements.

A final issue is retrospective deposit. We can think of the journal literature as an inverted pyramid, with the broad top being currently published articles and the bottom being the first published journal articles. The papers published since the emergence of author agreements that permit self-archiving are a significant resource; however, much of the literature precedes such agreements. The vast majority of these articles are under standard copyright transfer agreements, with publishers holding all rights. Consequently, it is very important that publishers clarify whether their relatively new self-archiving policies can be applied retroactively. Elsevier has done so:

When Elsevier changes its policies to enable greater academic use of journal materials (such as the changes several years ago in our web-posting policies) or to clarify the rights retained by journal authors, Elsevier is prepared to extend those rights retroactively with respect to articles published in journal issues produced prior to the policy change.

Elsevier is pleased to confirm that, unless explicitly noted to the contrary, all policies apply retrospectively to previously published journal content. If, after reviewing the material noted above, you have any questions about such rights, please contact Global Rights.

Unfortunately, many publishers have not clarified this issue. Under these conditions, whether authors can deposit preprints or author-created postprints hinges on whether these works are viewed as being different works from the publisher version, and, hence, owned by the authors. Although some open access advocates believe this to be the case, to my knowledge this has never been decided in a court of law. Michael Carroll, who is a professor at the Villanova University School of Law and a member of the Board of the Creative Commons, has said in an analysis of whether authors can put preprints of articles published using standard author agreements under Creative Commons licenses:

Although technically distinct, the copyrights in the pre-print and the post-print overlap. The important point to understand is that copyright grants the owner the right to control exact duplicates and versions that are "substantially similar" to the copyrighted work. (This is under U.S. law, but most other jurisdictions similarly define the scope of copyright).

A pre-print will normally be substantially similar to the post-print. Therefore, when an author transfers the exclusive rights in the work to a publisher, the author precludes herself from making copies or distributing copies of any substantially similar versions of the work as well.

Much progress has been made in the area of author agreements, but authors must still pay careful attention to the details of agreements, which vary considerably by publisher. The SHERPA/RoMEO—Publisher Copyright Policies & Self-Archiving database is a very useful and important tool and users should actively participate in refining this database; however, authors are well advised not to stop at the summary information presented here and to go to the agreement itself (if available). It would be very helpful if a set of standard author agreements that covered the major variations could be developed and put into use by the publishing industry.

Friday’s OAI5 Presentations

Presentations from Friday’s sessions of the 5th Workshop on Innovations in Scholarly Communication in Geneva are now available.

Here are a few highlights from this major conference:

  • Doctoral e-Theses; Experiences in Harvesting on a National and European Level (PowerPoint): "In the presentation we will show some lessons learned and the first results of the Demonstrator, an interoperable portal of European doctoral e-theses in five countries: Denmark, Germany, the Netherlands, Sweden and the UK."
  • Exploring Overlay Journals: The RIOJA project (PowerPoint): "This presentation introduces the RIOJA (Repository Interface to Overlaid Journal Archives) project, on which a group of cosmology researchers from the UK is working with UCL Library Services and Cornell University. The project is creating a tool to support the overlay of journals onto repositories, and will demonstrate a cosmology journal overlaid on top of arXiv."
  • Dissemination or Publication? Some Consequences from Smudging the Boundaries between Research Data and Research Papers (PDF): "Project StORe’s repository middleware will enable researchers to move seamlessly between the research data environment and its outputs, passing directly from an electronic article to the data from which it was developed, or linking instantly to all the publications that have resulted from a particular research dataset."
  • Open Archives, The Expectations of the Scientific Communities (RealVideo): "This analysis led the French CNRS to start the Hal project, a pluridisciplinary open archive strongly inspired by ArXiv, and directly connected to it. Hal actually automatically transfers data and documents to ArXiv for the relevant disciplines; similarly, it is connected to Pub Med and Pub Med Central for life sciences. Hal is customizable so that institutions can build their own portal within Hal, which then plays the role of an institutional archive (examples are INRIA, INSERM, ENS Lyon, and others)."

(You may want to download PowerPoint Viewer 2007 if you don’t have PowerPoint 2007).

OpenDOAR API

The OpenDOAR project has announced the availability of an API for accessing digital repository data in their database.

Here’s an excerpt from the press release:

OpenDOAR, as a SHERPA project, is pleased to announce the release of an API that lets developers use OpenDOAR data in their applications. It is a machine-to-machine interface that can run a wide variety of queries against the OpenDOAR Database and get back XML data. Developers can choose to receive just repository titles & URLs, all the available OpenDOAR data, or intermediate levels of detail. They can then incorporate the output into their own applications and ‘mash-ups’, or use it to control processes such as OAI-PMH harvesting. . . .

OpenDOAR is a continuing project hosted at the University of Nottingham under the SHERPA Partnership. OpenDOAR maintains and builds on a quality-assured list of the world’s Open Access Repositories. OpenDOAR acts as a bridge between repository administrators and the service providers who make use of information held in repositories to offer search and other services to researchers and scholars worldwide.

A key feature of OpenDOAR is that all of the repositories we list have been visited by project staff, tested and assessed by hand. We currently decline about a quarter of candidate sites as being broken, empty, out of scope, etc. This gives a far higher quality assurance to the listings we hold than results gathered by just automatic harvesting. OpenDOAR has now surveyed over 1,100 repositories, producing a classified Directory of over 800 freely available archives of academic information.

Fez+Fedora Repository Software Gains Traction in US

The February 2007 issue of Sustaining Repositories reports that more US institutions are using or investigating a combination of Fez and Fedora (see the quote below):

Fez programmers at the University of Queensland (UQ) have been gratified by a surge in international interest in the Fez software. Emory University Libraries are building a Fez repository for electronic theses. Indiana University Libraries are also testing Fez+Fedora to see whether to replace their existing DSpace installation. The Colorado Alliance of Research Libraries (http://www.coalliance.org/) is using Fez+Fedora for their Alliance Digital Repository. Also in the US, the National Science Digital Library is using Fez+Fedora for their Materials Science Digital Library (http://matdl.org/repository/index.php).

Open Access Repository Software Use By Country

Based on data from the OpenDOAR Charts service, here is a snapshot of the open access repository software that is in use in the top five countries that offer such repositories.

The countries are abbreviated in the table header row as follows: US = United States, DE = Germany, UK = United Kingdom, AU = Australia, and NL = Netherlands. The number in parentheses is the reported number of repositories in that country.

Read the country percentages downward in each column (they do not total to 100% across the rows).

Excluding "unknown" or "other" systems, the highest in-country percentage is marked with an asterisk.

Software/Country  US (248)  DE (109)  UK (93)   AU (50)   NL (44)
Bepress           17%       0%        2%        6%        0%
Cocoon            0%        0%        1%        0%        0%
CONTENTdm         3%        0%        2%        0%        0%
CWIS              1%        0%        0%        0%        0%
DARE              0%        0%        0%        0%        2%
Digitool          0%        0%        1%        0%        0%
DSpace            18%       4%        22%       14%       14%
eDoc              0%        2%        0%        0%        0%
ETD-db            4%        0%        0%        0%        0%
Fedora            0%        0%        0%        2%        0%
Fez               0%        0%        0%        2%        0%
GNU EPrints       19%*      8%        46%*      22%*      0%
HTML              2%        4%        4%        4%        0%
iTor              0%        0%        0%        0%        5%
Milees            0%        2%        0%        0%        0%
MyCoRe            0%        2%        0%        0%        0%
OAICat            0%        0%        0%        2%        0%
Open Repository   0%        0%        3%        0%        2%
OPUS              0%        43%*      2%        0%        0%
Other             6%        7%        2%        2%        0%
PORT              0%        0%        0%        0%        2%
Unknown           31%       28%       18%       46%       23%
Wildfire          0%        0%        0%        0%        52%*
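
For readers who want to check the table themselves, here is a minimal Python sketch that finds the highest in-country percentage in each column while skipping the "Other" and "Unknown" rows. The percentages are transcribed from the table above. The names SHARES, COUNTRIES, EXCLUDED, and top_software are my own illustrative choices, not part of any OpenDOAR tool.

```python
# Percentages transcribed from the table above.
# Column order in every row: US, DE, UK, AU, NL.
COUNTRIES = ["US", "DE", "UK", "AU", "NL"]

SHARES = {
    "Bepress": [17, 0, 2, 6, 0],
    "Cocoon": [0, 0, 1, 0, 0],
    "CONTENTdm": [3, 0, 2, 0, 0],
    "CWIS": [1, 0, 0, 0, 0],
    "DARE": [0, 0, 0, 0, 2],
    "Digitool": [0, 0, 1, 0, 0],
    "DSpace": [18, 4, 22, 14, 14],
    "eDoc": [0, 2, 0, 0, 0],
    "ETD-db": [4, 0, 0, 0, 0],
    "Fedora": [0, 0, 0, 2, 0],
    "Fez": [0, 0, 0, 2, 0],
    "GNU EPrints": [19, 8, 46, 22, 0],
    "HTML": [2, 4, 4, 4, 0],
    "iTor": [0, 0, 0, 0, 5],
    "Milees": [0, 2, 0, 0, 0],
    "MyCoRe": [0, 2, 0, 0, 0],
    "OAICat": [0, 0, 0, 2, 0],
    "Open Repository": [0, 0, 3, 0, 2],
    "OPUS": [0, 43, 2, 0, 0],
    "Other": [6, 7, 2, 2, 0],
    "PORT": [0, 0, 0, 0, 2],
    "Unknown": [31, 28, 18, 46, 23],
    "Wildfire": [0, 0, 0, 0, 52],
}

# Rows that do not name a specific system are left out of the comparison.
EXCLUDED = {"Other", "Unknown"}


def top_software(shares, countries, excluded=EXCLUDED):
    """Return {country: (software, percent)} for the largest named share."""
    result = {}
    for column, country in enumerate(countries):
        # Read downward in one column, as the table is meant to be read.
        named = {
            software: row[column]
            for software, row in shares.items()
            if software not in excluded
        }
        best = max(named, key=named.get)
        result[country] = (best, named[best])
    return result


if __name__ == "__main__":
    for country, (software, percent) in top_software(SHARES, COUNTRIES).items():
        print(f"{country}: {software} ({percent}%)")
```

Note that the sketch compares percentages only within a column; because the country totals differ, it says nothing about worldwide market share.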