Cornell University Library Repository Principles and Strategies Handbook

Erin Faulder et al. have self-archived the "Cornell University Library Repository Principles and Strategies Handbook."

Here's an excerpt:

The handbook provides support for both new and existing repository managers, comprising both recommended practices and specifically identified action steps that will allow them to track their progress and identify gaps. Each section of the handbook covers a different strategic area of repository management, standing largely on its own and linking to other sections when appropriate. Although there is no primary section order, we recommend starting with Defining Repository Scope and Service Planning. The handbook specifically addresses principles and practices pertaining to digital repositories, where a digital repository can be defined as: a system, the purpose of which is to store, present, and preserve a collection of data for which the library provides services. That is, the term refers specifically to the application as opposed to the content (collections, objects and metadata) within.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"More Than 1 Million Images Now Publicly Available at library.artstor.org!"

Artstor has released "More Than 1 Million Images Now Publicly Available at library.artstor.org!"

Here's an excerpt:

Artstor has made more than 1 million image, video, document, and audio files from public institutional collections freely available to everyone—subscribers and non-subscribers alike—at library.artstor.org. These collections are being shared by institutions who make their content available via JSTOR Forum, a tool that allows them to catalog, manage, and share digital media collections and make them discoverable to the widest possible audience.

"Data-Level Metrics Now Available through Make Data Count"

DataONE has released "Data-Level Metrics Now Available through Make Data Count."

Here's an excerpt:

One year into our Sloan-funded Make Data Count project, the Make Data Count Team comprising DataONE, California Digital Library and DataCite are proud to release Version 1 of standardized data usage and citation metrics! . . .

Since the development of our COUNTER Code of Practice for Research Data we have implemented comparable, standardized data usage and citation metrics at Dash (CDL) and DataONE, two project team repositories. . . .

The Make Data Count project team works in an agile "minimum viable product" methodology. This first release has focused on developing a standard recommendation, processing our logs against that Code of Practice [COUNTER Code of Practice for Research Data] to develop comparable data usage metrics, and display of both usage and citation metrics at the repository level.

"The Types, Frequencies, and Findability of Disciplinary Grey Literature within Prominent Subject Databases and Academic Institutional Repositories"

Wanda R. Marsolek et al. have published "The Types, Frequencies, and Findability of Disciplinary Grey Literature within Prominent Subject Databases and Academic Institutional Repositories" in the Journal of Librarianship and Scholarly Communication.

Here's an excerpt:

INTRODUCTION In many disciplines grey literature, or works that are more ephemeral in nature and are not typically published through traditional scholarly channels, are heavily used alongside traditional materials and sources. We were interested in the type and frequency of grey literature in subject databases and in North American institutional repositories (IRs) as well as what disciplines use grey literature. METHODS Over 100 subject databases utilized by academic researchers and the IRs of over 100 academic institutions were studied. Document type, search capabilities, and level of curation were noted. RESULTS Grey literature was present in the majority (68%) of the literature databases and almost all IRs (95%) contained grey literature. DISCUSSION Grey literature was present in the subject databases across all broad disciplines including arts and humanities. In these resources the most common types of grey literature were conference papers, technical reports, and theses and dissertations. The findability of the grey literature in IRs varied widely as did evidence of active collection development. CONCLUSION Recommendations include the development of consistent metadata standards for grey literature to enhance searching within individual resources as well as supporting future interoperability. An increased level of collection development of grey literature in institutional repositories would facilitate preservation and increase the findability and reach of grey literature.

"ARCHANGEL: Trusted Archives of Digital Public Documents"

John Collomosse et al. have self-archived "ARCHANGEL: Trusted Archives of Digital Public Documents."

Here's an excerpt:

We present ARCHANGEL; a de-centralised platform for ensuring the long-term integrity of digital documents stored within public archives. Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation—trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents. We describe the ARCHANGEL architecture, and report on a prototype of that architecture build over the Ethereum infrastructure. We report early evaluation and feedback of ARCHANGEL from stakeholders in the research data archives space.

"Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study"

Simone Ivan Conte et al. have published "Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study" in the International Journal of Digital Curation.

Here's an excerpt:

Research funders have introduced requirements that expect researchers to properly manage and publicly share their research data, and expect institutions to put in place services to support researchers in meeting these requirements. So far the general focus of these services and systems has been on addressing the final stages of the research data lifecycle (archive, share and re-use), rather than stages related to the active phase of the cycle (collect/create and analyse). As a result, full integration of active data management systems with data repositories is not yet the norm, making the streamlined transition of data from an active to a published and archived status an important challenge. In this paper we present the integration between an active data management system developed in-house (NOMAD) and Elsevier's Pure data repository used at our institution, with the aim of offering a simple workflow to facilitate and promote the data deposit process. The integration results in a new data management and publication workflow that helps researchers to save time, minimize human errors related to manually handling files, and further promote data deposit together with collaboration across the institution.

"New Release: arXiv Search v0.1"

Cornell University has released "New Release: arXiv Search v0.1."

Here's an excerpt:

Today we launched a reimplementation of our search system. As part of our broader strategy for arXiv-NG, we are incrementally decoupling components from the classic arXiv codebase, and replacing them with more modular services developed in Python. Our goal was to replace the aging Lucene search backend, achieve feature-parity with the classic search system, and give the search interface an opportunistic face-lift. . . . The most important win for us in this milestone is that the new backend lays the groundwork for more dramatic improvements to search, our APIs, and other components targeted for reimplementation in arXiv-NG.

"The Modern Research Data Portal: A Design Pattern for Networked, Data-Intensive Science"

Kyle Chard et al. have published "The Modern Research Data Portal: A Design Pattern for Networked, Data-Intensive Science" in PeerJ.

Here's an excerpt:

In this article, we first define the problems that research data portals address, introduce the legacy approach, and examine its limitations. We then introduce the MRDP design pattern and describe its realization via the integration of two elements: Science DMZs (Dart et al., 2013) (high-performance network enclaves that connect large-scale data servers directly to high-speed networks) and cloud-based data management and authentication services such as those provided by Globus (Chard, Tuecke & Foster, 2014). We then outline a reference implementation of the MRDP design pattern, also provided in its entirety on the companion web site, https://docs.globus.org/mrdp, that the reader can study—and, if they so desire, deploy and adapt to build their own high-performance research data portal. We also review various deployments to show how the MRDP approach has been applied in practice: examples like the National Center for Atmospheric Research's Research Data Archive, which provides for high-speed data delivery to thousands of geoscientists; the Sanger Imputation Service, which provides for online analysis of user-provided genomic data; the Globus data publication service, which provides for interactive data publication and discovery; and the DMagic data sharing system for data distribution from light sources. We conclude with a discussion of related technologies and summary.

Research Data Curation Bibliography, Version 8 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

Behaviours and Technical Recommendations of the COAR Next Generation Repositories Working Group

COAR has released Behaviours and Technical Recommendations of the COAR Next Generation Repositories Working Group.

Here's an excerpt from the announcement:

COAR's vision is to position repositories as the foundation for a distributed, globally networked infrastructure for scholarly communication, on top of which layers of value added services will be deployed, thereby transforming the system, making it more research-centric, open to and supportive of innovation, while also collectively managed by the scholarly community.

"Leading Across Boundaries: Collaborative Leadership and the Institutional Repository in Research Universities and Liberal Arts Colleges"

David M. Seaman has self-archived "Leading Across Boundaries: Collaborative Leadership and the Institutional Repository in Research Universities and Liberal Arts Colleges."

Here's an excerpt:

Two methodologies—content analysis of IR web pages and surveys of library directors and IR developers—were employed to determine if IRs revealed evidence of collaborative leadership. The study populations were those members of the Association of Research Libraries (ARL) and the Oberlin Group of liberal arts colleges that operated IR services by July 2014 (146 institutions overall). The research examined if IR format, size, age, nomenclature, or technology platform varied between ARL and Oberlin Group members. It asked if there is any difference in the perception of collaborative leadership traits, perceived IR success, or collaborative involvement with stakeholder communities between ARL and Oberlin Group members or between library directors and IR developers. The study found evidence of all six collaborative leadership traits being examined: assessing the environment for collaboration, creating clarity, building trust, sharing power, developing people, and self-reflection.

"Journal Flipping or a Public Open Access Infrastructure? What Kind of Open Access Future Do We Want?"

Tony Ross-Hellauer and Benedikt Fecher have published "Journal Flipping or a Public Open Access Infrastructure? What Kind of Open Access Future Do We Want?" in LSE Impact of Social Sciences.

Here's an excerpt:

Open access (OA) is advocated by science funders, policymakers and researchers alike. It will most likely be the default way of publishing in the not-so-distant future. Nonetheless, the dominant approach to achieve OA at the moment—journal flipping—could have adverse long-term effects for science. To try to stir debate, we here present two dichotomic scenarios for open access in 20 years' time [journal flipping vs. a public open access infrastructure].

The State of Open Data Report 2017

Figshare has released The State of Open Data Report 2017.

Here's an excerpt:

Its key finding is that open data has become more embedded in the research community—82% of survey respondents are aware of open data sets and more researchers are curating their data for sharing.

"The Next Stage of SocArXiv’s Development: Bringing Greater Transparency and Efficiency to the Peer Review Process"

Philip Cohen has published "The Next Stage of SocArXiv's Development: Bringing Greater Transparency and Efficiency to the Peer Review Process" in LSE Impact of Social Sciences.

Here's an excerpt:

Looking ahead to the next stage of its development, Philip Cohen considers how SocArXiv might challenge the peer review system to be more efficient and transparent, firstly by confronting the bias that leads many who benefit from the status quo to characterise mooted alternatives as extreme. The value and implications of openness at the various decision points in the system must be debated, as should potentially more disruptive innovations such as non-exclusive review and publication or crowdsourcing reviews.

Lots of Institutional Repositories Keep E-prints Safe

The seductive allure of a commercial mega repository is two-fold: (1) everything is conveniently in one place, and (2) a company is taking care of the dreary and expensive business of running it.

Everything seems fine: problem solved! That is, until something goes wrong, such as the repository being bought and controlled by a publisher or being threatened with lawsuits by a coterie of publishers.

Then it's important to remember: it's a company, and companies exist to make a profit.

Heh, companies are great. I wouldn't have just had that tasty cup of coffee without them. But, we should be very clear about what motivates companies and controls their behavior. And we shouldn't be shocked if they do things that aren't motivated by lofty goals.

I know: institutional repositories are hard work. The bloom is off the rose. But they exist to serve higher education, not make money, and they are part of the academic communities they serve. And they can't be bought. And their universities don't often go out of business. And there are a lot of them. And they are not likely to be attractive targets for lawsuits unless something has gone very, very wrong at the local level.

Copyright is complicated. No one is advocating that we ignore it and just shove e-prints into IRs willy-nilly. Getting faculty to understand the ins and outs of e-print copyright is no picnic, nor is monitoring for compliance. But the battle is easier to fight at the local level where one-on-one faculty-to-librarian communication is possible.

For self-archiving to flourish in the long run, institutional repositories must flourish. By and large, librarians establish, run, and support them, and they are the quiet heroes of green open access who will continue to provide a sustainable and reliable infrastructure for self-archiving.

"Has the Open Access Movement Delayed the Revolution?"

Richard Poynder has published "Has the Open Access Movement Delayed the Revolution?" in Open and Shut?.

Here's an excerpt:

As I said, publishers are also co-opting green OA. They are doing this by buying up repository platforms like SSRN and bepress, for instance, and by imposing lengthy embargoes before green OA papers can be made freely available. Again, the OA movement has assisted in this by, for instance, advocating for and supporting OA policies that accept publisher-imposed embargoes as a given, and by partnering with publishers in initiatives that turn repositories into little more than search interfaces. This has the effect of directing users away from repositories to legacy publishers’ sites (see here for instance, and here).

"Penn Libraries to End Partnership with bepress"

The University of Pennsylvania Libraries has released "Penn Libraries to End Partnership with bepress."

Here's an excerpt:

In August, bepress sold their company to Elsevier, a business with a history of aggressive confidentiality agreements, steep price increases, and opaque data mining practices. In their acquisition of bepress and other companies like SSRN and Mendeley, Elsevier demonstrates a move toward the consolidation and monopolization of products and services impacting all areas of the research lifecycle.

We are worried about the long-term impacts from these acquisitions and are concerned that such changes are not in the best interests of the library community. Therefore, we feel obligated to begin exploring alternatives.
