"Reference Rot in the Repository: A Case Study of Electronic Theses and Dissertations (ETDs) in an Academic Library"

Mia Massicotte and Kathleen Botter have published "Reference Rot in the Repository: A Case Study of Electronic Theses and Dissertations (ETDs) in an Academic Library" in Information Technology and Libraries.

Here's an excerpt:

This study examines ETDs deposited during the period 2011-2015 in an institutional repository, to determine the degree to which the documents suffer from reference rot, that is, linkrot plus content drift. The authors converted and examined 664 doctoral dissertations in total, extracting 11,437 links, finding overall that 77% of links were active, and 23% exhibited linkrot. A stratified random sample of 49 ETDs was performed which produced 990 active links, which were then checked for content drift based on mementos found in the Wayback Machine. Mementos were found for 77% of links, and approximately half of these, 492 of 990, exhibited content drift. The results serve to emphasize not only the necessity of broader awareness of this problem, but also to stimulate action on the preservation front.
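The figures in the excerpt can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `percent` and `estimated_count` are hypothetical, and the link counts derived from the rounded 77% and 23% figures are approximations, not numbers reported by the authors.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


def estimated_count(total: int, share: float) -> int:
    """Estimate a count from a total and a rounded share (approximate only)."""
    return round(total * share)


TOTAL_LINKS = 11_437          # links extracted from 664 dissertations (from the excerpt)
SAMPLED_ACTIVE_LINKS = 990    # active links in the 49-ETD sample (from the excerpt)
DRIFTED_LINKS = 492           # sampled links showing content drift (from the excerpt)

# Approximate counts implied by the rounded percentages in the excerpt.
active_estimate = estimated_count(TOTAL_LINKS, 0.77)
rotten_estimate = estimated_count(TOTAL_LINKS, 0.23)

# Share of sampled links with content drift ("approximately half").
drift_share = percent(DRIFTED_LINKS, SAMPLED_ACTIVE_LINKS)

if __name__ == "__main__":
    print(f"active (approx.): {active_estimate}")
    print(f"linkrot (approx.): {rotten_estimate}")
    print(f"content drift in sample: {drift_share}%")
```

The two estimates sum back to the 11,437 links extracted, and the sample figure works out to just under 50%, which matches the excerpt's "approximately half."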

Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

A Tour of the Research Data Management (RDM) Service Space. The Realities of Research Data Management, Part 1

OCLC Research has released A Tour of the Research Data Management (RDM) Service Space. The Realities of Research Data Management, Part 1.

Here's an excerpt from the announcement:

The Realities of Research Data Management is a four-part series that explores how research universities are addressing the challenge of managing research data throughout the research lifecycle.

In this introductory report, we provide some brief background on the emergence of RDM as a focus for research support services within higher education, and present a simple framework describing three major components of the RDM service space:

"Discovering Scholarly Orphans Using ORCID"

Martin Klein and Herbert Van de Sompel have self-archived "Discovering Scholarly Orphans Using ORCID."

Here's an excerpt:

Archival efforts such as (C)LOCKSS and Portico are in place to ensure the longevity of traditional scholarly resources like journal articles. At the same time, researchers are depositing a broad variety of other scholarly artifacts into emerging online portals that are designed to support web-based scholarship. These web-native scholarly objects are largely neglected by current archival practices and hence they become scholarly orphans. We therefore argue for a novel paradigm that is tailored towards archiving these scholarly orphans. We are investigating the feasibility of using Open Researcher and Contributor ID (ORCID) as a supporting infrastructure for the process of discovery of web identities and scholarly orphans for active researchers. We analyze ORCID in terms of coverage of researchers, subjects, and location and assess the richness of its profiles in terms of web identities and scholarly artifacts. We find that ORCID currently lacks in all considered aspects and hence can only be considered in conjunction with other discovery sources. However, ORCID is growing fast so there is potential that it could achieve a satisfactory level of coverage and richness in the near future.

The ABC Method: A Risk Management Approach to the Preservation of Cultural Heritage

The Canadian Conservation Institute has released The ABC Method: A Risk Management Approach to the Preservation of Cultural Heritage.

Here's an excerpt:

This manual offers a comprehensive understanding of risk management applied to the preservation of heritage assets, whether collections, buildings or sites. It provides a step-by-step procedure and a variety of tools to guide the heritage professional in applying the ABC method to their own context. The method can be applied to a range of situations, from analysis of a single risk to a comprehensive risk assessment of the entire heritage asset.

See also: A Guide to Risk Management of Cultural Heritage.

"Leveraging Exceptions and Limitations for Digital Curation and Online Collections: The U.S. Case"

Patricia Aufderheide has published "Leveraging Exceptions and Limitations for Digital Curation and Online Collections: The U.S. Case" in Libellarium: Journal for the Research of Writing, Books, and Cultural Heritage Institutions.

Here's an excerpt:

Librarians wanting to use digital affordances for their patron’s and public benefit have increasingly found themselves frustrated by copyright law designed for a pre-digital era. In the U.S., this frustration has driven the nation’s most prestigious library group, the Association of Research Libraries, to explore the utility of the major exception to copyright monopoly rights, fair use, in order to accomplish basic curation and collection goals in a digital era. The ARL's efforts to clarify how libraries can employ fair use has resulted in sometimes-dramatic changes in how work is done, and has permitted innovation at some universities. Its approach demonstrates the power of consensus in a professional field to permit innovation within the law.

"The Landscape of Research Data Repositories in 2015: A re3data Analysis"

Maxi Kindling et al. have published "The Landscape of Research Data Repositories in 2015: A re3data Analysis" in D-Lib Magazine.

Here's an excerpt:

This article provides a comprehensive descriptive and statistical analysis of metadata information on 1,381 research data repositories worldwide and across all research disciplines. The analyzed metadata is derived from the re3data database, enabling search and browse functionalities for the global registry of research data repositories. The analysis focuses mainly on institutions that operate research data repositories, types and subjects of research data repositories (RDR), access conditions as well as services provided by the research data repositories. RDR differ in terms of the service levels they offer, languages they support or standards they comply with. These statements are commonly acknowledged by saying the RDR landscape is heterogeneous. As expected, we found a heterogeneous RDR landscape that is mostly influenced by the repositories' disciplinary background for which they offer services.

Web Archiving in the United States: A 2016 Survey

The National Digital Stewardship Alliance has released Web Archiving in the United States: A 2016 Survey.

Here's an excerpt from the announcement:

From January 20 to February 16, 2016, a team representing multiple NDSA member institutions and interest groups conducted a survey of organizations in the United States actively involved in, or planning to start, programs to archive content from the Web. This effort built upon a similar survey undertaken by NDSA in late 2011 and published online in June 2012 and a second survey in late 2013 published online in September 2014.

"Data Curation Network: How Do We Compare? A Snapshot of Six Academic Library Institutions’ Data Repository and Curation Services"

Lisa R. Johnston et al. have published "Data Curation Network: How Do We Compare? A Snapshot of Six Academic Library Institutions’ Data Repository and Curation Services" in the Journal of eScience Librarianship.

Here's an excerpt:

Methods: Each institutional lead provided a written summary of their services based on a previously developed structure, followed by group discussion and refinement of descriptions. Service areas assessed include the repository services for data, technologies used, policies, and staffing in place.

Conclusions: Through this process we aim to better define the current levels of support offered by our institutions as a first step toward meeting our project's overarching goal to develop a shared staffing model for data curation across multiple institutions.

"A Pilot Competency Matrix for Data Management Skills: A Step toward the Development of Systematic Data Information Literacy Programs"

Megan R. Sapp Nelson has published "A Pilot Competency Matrix for Data Management Skills: A Step toward the Development of Systematic Data Information Literacy Programs" in the Journal of eScience Librarianship.

Here's an excerpt:

This article describes a significant innovation upon existing competencies by identifying a scaffolding (built upon existing competencies) that moves students progressively from undergraduate training through post graduate coursework and research to post-doctoral work and into the early years of data stewardship. The scaffolding ties together existing research that has been completed in research data management skills and data information literacy with research into the outcomes that are desirable for individuals to present in data management at each of the levels of education. Competencies are aligned according to application (personal, team, research enterprise) in such a way that the skills attained at the undergraduate level give students moving on to graduate work greater familiarity with data management and therefore greater likelihood of success at the graduate and then post graduate and data steward levels.

"Building a Research Data Management Service at the University of California, Berkeley"

Jamie Wittenberg and Mary Elings have self-archived "Building a Research Data Management Service at the University of California, Berkeley."

Here's an excerpt:

University of California, Berkeley's Library and the central Research Information Technologies unit have collaborated to develop a research data management program that leverages each organization’s expertise and resources to create a unified service. The service offers a range of workshops, consultation, and an online resource. Because of this collaboration, service areas that are often fully embedded in IT, like backup and secure storage, as well as services in the Library domain, like resource discovery and instruction, are integrated into a single research data management program.

"Content-Based Video Retrieval in Historical Collections of the German Broadcasting Archive"

Markus Mühling et al. have self-archived "Content-Based Video Retrieval in Historical Collections of the German Broadcasting Archive."

Here's an excerpt:

The German Broadcasting Archive (DRA) maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of the video material stimulates a large scientific interest in the video content. In this paper, we present an automatic video analysis and retrieval system for searching in historical collections of GDR television recordings. It consists of video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition and similarity search. The performance of the system is evaluated from a technical and an archival perspective on 2,500 hours of GDR television recordings.

"Research Data Services in European Academic Research Libraries"

Carol Tenopir et al. have published "Research Data Services in European Academic Research Libraries" in LIBER Quarterly.

Here's an excerpt:

Research data is an essential part of the scholarly record, and management of research data is increasingly seen as an important role for academic libraries. This article presents the results of a survey of directors of the Association of European Research Libraries (LIBER) academic member libraries to discover what types of research data services (RDS) are being offered by European academic research libraries and what services are planned for the future. Overall, the survey found that library directors strongly agree on the importance of RDS. As was found in earlier studies of academic libraries in North America, more European libraries are currently offering or are planning to offer consultative or reference RDS than technical or hands-on RDS. The majority of libraries provide support for training in skills related to RDS for their staff members. Almost all libraries collaborate with other organizations inside their institutions or with outside institutions in order to offer or develop policy related to RDS. We discuss the implications of the current state of RDS in European academic research libraries, and offer directions for future research.

"Archiving Software Surrogates on the Web for Future Reference"

Helge Holzmann, Wolfram Sperber, and Mila Runnwerth have self-archived "Archiving Software Surrogates on the Web for Future Reference."

Here's an excerpt:

Software has long been established as an essential aspect of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications is still challenging for various reasons. A crucial factor is that software dynamics with temporal versions or states are difficult to capture over time. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about a half of the webpages of software are already archived with almost all of them including some kind of documentation.

"ArchiveSpark: Efficient Web Archive Access, Extraction and Derivation"

Helge Holzmann, Vinay Goel, and Avishek Anand have self-archived "ArchiveSpark: Efficient Web Archive Access, Extraction and Derivation."

Here's an excerpt:

Web archives are a valuable resource for researchers of various disciplines. However, to use them as a scholarly source, researchers require a tool that provides efficient access to Web archive data for extraction and derivation of smaller datasets. Besides efficient access we identify five other objectives based on practical researcher needs such as ease of use, extensibility and reusability.

Towards these objectives we propose ArchiveSpark, a framework for efficient, distributed Web archive processing that builds a research corpus by working on existing and standardized data formats commonly held by Web archiving institutions. Performance optimizations in ArchiveSpark, facilitated by the use of a widely available metadata index, result in significant speed-ups of data processing. Our benchmarks show that ArchiveSpark is faster than alternative approaches without depending on any additional data stores while improving usability by seamlessly integrating queries and derivations with external tools.

"Librarians’ Perspectives on the Factors Influencing Research Data Management Programs"

College & Research Libraries has released an e-print of "Librarians' Perspectives on the Factors Influencing Research Data Management Programs."

Here's an excerpt:

This qualitative research study examines librarians' research data management (RDM) experiences, specifically the factors that influence their ability to support researchers' needs. Findings from interviews with 36 academic library professionals in the United States identify 5 factors of influence: 1) technical resources, 2) human resources, 3) researchers' perceptions about the library, 4) leadership support, and 5) communication, coordination, and collaboration. Findings show different aspects of these factors facilitate or constrain RDM activity. The implications of these factors on librarians' continued work in RDM are considered.

"ArchiveWeb: Collaboratively Extending and Exploring Web Archive Collections"

Zeon Trevor Fernando et al. have self-archived "ArchiveWeb: Collaboratively Extending and Exploring Web Archive Collections."

Here's an excerpt:

Curated web archive collections contain focused digital contents which are collected by archiving organizations to provide a representative sample covering specific topics and events to preserve them for future exploration and analysis. In this paper, we discuss how to best support collaborative construction and exploration of these collections through the ArchiveWeb system. . . . This paper describes the functionalities of our current prototype for searching, constructing, exploring and discussing web archive collections, as well as feedback on this prototype from seven archiving organizations, and our plans for improving the next release of the system.

W3C: Data on the Web Best Practices

W3C has released Data on the Web Best Practices.

Here's an excerpt from the announcement:

W3C is delighted to publish its Data on the Web Best Practices as a Recommendation. The document offers 35 Best Practices for sharing data, openly or not, in a way that maximizes the potential of the Web as a data platform rather than simply as a way to send data from A to B. The Best Practices are prescriptive in their intended outcomes but not in how those outcomes are achieved. They cover everything from the basics (provide metadata!) through nuance (provide structural metadata), to topics like licensing, provenance and basic information on providing APIs through to more advanced topics like data archiving, data enrichment and republishing data.

"Bridging Technologies to Efficiently Arrange and Describe Digital Archives: The Bentley Historical Library’s ArchivesSpace-Archivematica-DSpace Workflow Integration Project"

Max Eckard, Dallas Pillen and Mike Shallcross have published "Bridging Technologies to Efficiently Arrange and Describe Digital Archives: The Bentley Historical Library's ArchivesSpace-Archivematica-DSpace Workflow Integration Project" in the Code4Lib Journal.

Here's an excerpt:

In recent years, ArchivesSpace and Archivematica have emerged as two of the most exciting open source platforms for working with digital archives. The former manages accessions and collections and provides a framework for entering descriptive, administrative, rights, and other metadata. The latter ingests digital content and prepares information packages for long-term preservation and access. In October 2016, the Bentley Historical Library wrapped up a two-year, $355,000 grant from the Andrew W. Mellon Foundation to partner with the University of Michigan Library on the integration of these two systems in an end-to-end workflow that will include the automated deposit of content into a DSpace repository. This article provides context of the project and offers an in-depth exploration of the project’s key development tasks, all of which were provided by Artefactual Systems, the developers of Archivematica (code available at https://github.com/artefactual-labs/appraisal-tab).

PLOS: Response to NIH RFI—Strategies for NIH Data Management, Sharing, and Citation

PLOS has released Response to NIH RFI—Strategies for NIH Data Management, Sharing, and Citation.

Here's an excerpt:

We write to express the views of the Public Library of Science, a fully Open Access Publisher of seven Research Journals, in response to your RFI on Data Sharing, Management, and Citation. Open access to Research Articles is just the first step in what we consider should be the end state for all publicly funded research, and we support broader efforts towards open science. We are developing our own policies to help establish a new norm in which upon publication of a journal article, if not before, all of the underlying data (where ethically appropriate) is openly available to access and reuse without restriction according to the FAIR principles for data management to make data Findable, Accessible, Interoperable and Re-usable.

"Preserving Transactional Data"

Sara Day Thomson has published "Preserving Transactional Data" in The International Journal of Digital Curation.

Here's an excerpt:

This paper discusses requirements for preserving transactional data and the accompanying challenges facing the companies and institutions who aim to re-use these data for analysis or research. It presents a range of use cases—examples of transactional data—in order to describe the characteristics and difficulties of these 'big' data for long-term access. Based on the overarching trends discerned in these use cases, the paper will define the challenges facing the preservation of these data early in the curation lifecycle. It will point to potential solutions within current legal and ethical frameworks, but will focus on positioning the problem of re-using these data from a preservation perspective.

Version 7 of the Research Data Curation Bibliography Released

Digital Scholarship has released Version 7 of the Research Data Curation Bibliography. This selective bibliography includes over 620 English-language articles, books, and technical reports that are useful in understanding the curation of digital research data in academic and other research institutions.

The Research Data Curation Bibliography covers topics such as research data creation, acquisition, metadata, provenance, repositories, management, policies, support services, funding agency requirements, peer review, publication, citation, sharing, reuse, and preservation.

Most sources have been published from January 2009 through December 2016; however, a limited number of earlier key sources are also included. The bibliography includes links to freely available versions of included works. If such versions are unavailable, links to the publishers' descriptions are provided.

Abstracts are included in this bibliography if a work is under a Creative Commons Attribution License (BY and national/international variations), a Creative Commons public domain dedication (CC0), or a Creative Commons Public Domain Mark and this is clearly indicated in the work.

The Research Data Curation Bibliography is under a Creative Commons Attribution 4.0 International License.

"Piracy, Public Access, and Preservation: An Exploration of Sustainable Accessibility in a Public Torrent Index"

John Martin has self-archived "Piracy, Public Access, and Preservation: An Exploration of Sustainable Accessibility in a Public Torrent Index."

Here's an excerpt:

Using a snapshot of torrents on the site, this study considers the potential for torrent networks to preserve and provide access to cultural materials in the form of digital media content. Metadata from 2.1 million torrents were categorized by media type and the robustness of given torrents was assessed. Trends over time, such as number of uploads and volume, were also investigated. This study found that relatively few torrents exhibit long-term survivability, even though the overall volume in the index shows continuous increase.

"Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content"

Shawn M. Jones et al. have published "Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content" in PLOS ONE.

Here's an excerpt:

Increasingly, scholarly articles contain URI references to "web at large" resources including project web sites, scholarly wikis, ontologies, online debates, presentations, blogs, and videos. Authors reference such resources to provide essential context for the research they report on. A reader who visits a web at large resource by following a URI reference in an article, some time after its publication, is led to believe that the resource's content is representative of what the author originally referenced. However, due to the dynamic nature of the web, that may very well not be the case. We reuse a dataset from a previous study in which several authors of this paper were involved, and investigate to what extent the textual content of web at large resources referenced in a vast collection of Science, Technology, and Medicine (STM) articles published between 1997 and 2012 has remained stable since the publication of the referencing article. We do so in a two-step approach that relies on various well-established similarity measures to compare textual content. In a first step, we use 19 web archives to find snapshots of referenced web at large resources that have textual content that is representative of the state of the resource around the time of publication of the referencing paper. We find that representative snapshots exist for about 30% of all URI references. In a second step, we compare the textual content of representative snapshots with that of their live web counterparts. We find that for over 75% of references the content has drifted away from what it was when referenced. These results raise significant concerns regarding the long term integrity of the web-based scholarly record and call for the deployment of techniques to combat these problems.
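The second step described in the excerpt, comparing an archived snapshot's text with its live web counterpart, can be sketched with a simple similarity measure. The example below assumes Jaccard similarity over three-word shingles and a cutoff of 0.9; both are illustrative choices, not the measures or thresholds confirmed by the paper, and the function names are hypothetical.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Split text into lowercase words and return the set of k-word shingles."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than k words becomes a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def has_drifted(snapshot_text: str, live_text: str, threshold: float = 0.9) -> bool:
    """Flag content drift when similarity falls below an assumed threshold."""
    return jaccard(snapshot_text, live_text) < threshold


if __name__ == "__main__":
    archived = "The project site lists three datasets and a user guide"
    live = "The project site lists five datasets and a user forum"
    print(round(jaccard(archived, live), 2), has_drifted(archived, live))
```

In practice the two texts would come from a web archive snapshot and a fresh fetch of the same URI, with markup stripped before comparison. A study of this kind would likely combine several measures rather than rely on one.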

iPRES 2016: 13th International Conference on Digital Preservation Proceedings

The iPRES 2016: 13th International Conference on Digital Preservation Proceedings is available as a 169-page PDF.

Here's an excerpt:

In keeping with previous years, the iPRES 2016 programme is organised into research and practice streams. This format ensures visibility and promotion of both academic research work and the projects and initiatives of institutions involved in digital preservation practices. Furthermore, workshops and tutorials provide opportunities for participants to share information, knowledge and best practices, and explore opportunities for collaboration on new approaches.
