"'Yeah, I Guess That's Data': Data Practices and Conceptions among Humanities Faculty"

portal: Libraries and the Academy has released an e-print of "'Yeah, I Guess That's Data': Data Practices and Conceptions among Humanities Faculty" by Jennifer L. Thoegersen.

Here's an excerpt:

Libraries are attempting to identify their role in providing data management services. However, humanities faculty’s conceptions of data and their data management practices are not well-known. This qualitative study explores the data management practices of humanities faculty at a four-year university and examines their perceptions of the term data.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Conceptualizing Data Curation Activities within Two Academic Libraries"

Sophia Lafferty-Hess et al. have self-archived "Conceptualizing Data Curation Activities within Two Academic Libraries."

Here's an excerpt:

At the 2017 Triangle Research Libraries Network Institute, staff from the University of North Carolina at Chapel Hill and Duke University used the 47 data curation activities identified by the Data Curation Network project to create conceptual groupings of data curation activities. The results of this "thought-exercise" are discussed in this white paper. The purpose of this exercise was to provide more specificity around data curation within our individual contexts as a method to consistently discuss our current service models, identify gaps we would like to fill, and determine what is currently out of scope. We hope to foster an open and productive discussion throughout the larger academic library community about how we prioritize data curation activities as we face growing demand and limited resources.

"Data-Level Metrics Now Available through Make Data Count"

DataONE has released "Data-Level Metrics Now Available through Make Data Count."

Here's an excerpt:

One year into our Sloan funded Make Data Count project, the Make Data Count Team comprising DataONE, California Digital Library and DataCite are proud to release Version 1 of standardized data usage and citation metrics! . . .

Since the development of our COUNTER Code of Practice for Research Data we have implemented comparable, standardized data usage and citation metrics at Dash (CDL) and DataONE, two project team repositories. . . .

The Make Data Count project team works in an agile "minimum viable product" methodology. This first release has focused on developing a standard recommendation, processing our logs against that Code of Practice [COUNTER Code of Practice for Research Data] to develop comparable data usage metrics, and display of both usage and citation metrics at the repository level.

"The Changing Influence of Journal Data Sharing Policies on Local RDM Practices"

Dylanne Dearborn et al. have published "The Changing Influence of Journal Data Sharing Policies on Local RDM Practices" in the International Journal of Digital Curation.

Here's an excerpt:

The purpose of this study was to examine changes in research data deposit policies of highly ranked journals in the physical and applied sciences between 2014 and 2016, as well as to develop an approach to examining the institutional impact of deposit requirements. Policies from the top ten journals (ranked by impact factor from the Journal Citation Reports) were examined in 2014 and again in 2016 in order to determine if data deposits were required or recommended, and which methods of deposit were listed as options. For all 2016 journals with a required data deposit policy, publication information (2009-2015) for the University of Toronto was pulled from Scopus and departmental affiliation was determined for each article. The results showed that the number of high-impact journals in the physical and applied sciences requiring data deposit is growing. In 2014, 71.2% of journals had no policy, 14.7% had a recommended policy, and 13.9% had a required policy (n=836). In contrast, in 2016, there were 58.5% with no policy, 19.4% with a recommended policy, and 22.0% with a required policy (n=880). It was also evident that U of T chemistry researchers are by far the most heavily affected by these journal data deposit requirements, having published 543 publications, representing 32.7% of all publications in the titles requiring data deposit in 2016. The Python scripts used to retrieve institutional publications based on a list of ISSNs have been released on GitHub so that other institutions can conduct similar research.

"A Data-Driven Approach to Appraisal and Selection at a Domain Data Repository"

Amy M Pienta et al. have published "A Data-Driven Approach to Appraisal and Selection at a Domain Data Repository" in the International Journal of Digital Curation.

Here's an excerpt:

Social scientists are producing an ever-expanding volume of data, leading to questions about appraisal and selection of content given finite resources to process data for reuse. We analyze users’ search activity in an established social science data repository to better understand demand for data and more effectively guide collection development. By applying a data-driven approach, we aim to ensure curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. We analyze data from a domain repository for the social sciences that includes over 500,000 annual searches in 2014 and 2015 to better understand trends in user search behavior. Using a newly created search-to-study ratio technique, we identified gaps in the domain data repository’s holdings and leveraged this analysis to inform our collection and curation practices and policies. The evaluative technique we propose in this paper will serve as a baseline for future studies looking at trends in user demand over time at the domain data repository being studied with broader implications for other data repositories.
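The excerpt does not define the search-to-study ratio, so the following is only a rough sketch under one assumed reading: for each topic, divide the number of user searches by the number of studies the repository holds on that topic. The function name, the topic labels, and the treatment of topics with no holdings are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def search_to_study_ratio(search_topics, study_topics):
    """Return {topic: searches / studies} under an ASSUMED definition.

    search_topics: iterable of topic labels, one per logged search.
    study_topics:  iterable of topic labels, one per study in the holdings.
    Topics that were searched for but have no studies get float("inf"),
    which flags them as possible gaps in the collection.
    """
    searches = Counter(search_topics)
    studies = Counter(study_topics)
    ratios = {}
    for topic, n_searches in searches.items():
        n_studies = studies.get(topic, 0)
        ratios[topic] = n_searches / n_studies if n_studies else float("inf")
    return ratios


# Hypothetical log entries and holdings, for illustration only.
example = search_to_study_ratio(
    ["health", "health", "health", "housing", "voting"],
    ["health", "voting", "voting"],
)
```

Under this reading, a high ratio would suggest demand that outstrips holdings, while a low ratio would suggest a well-covered topic.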

"Modelling the Research Data Lifecycle"

Stacy T. Kowalczyk has published "Modelling the Research Data Lifecycle" in the International Journal of Digital Curation.

Here's an excerpt:

This paper develops and tests a lifecycle model for the preservation of research data by investigating the research practices of scientists. This research is based on a mixed-method approach. An initial study was conducted using case study analytical techniques; insights from these case studies were combined with grounded theory in order to develop a novel model of the Digital Research Data Lifecycle. A broad-based quantitative survey was then constructed to test and extend the components of the model. The major contribution of these research initiatives is the creation of the Digital Research Data Lifecycle, a data lifecycle that provides a generalized model of the research process to better describe and explain both the antecedents and barriers to preservation. The antecedents and barriers to preservation are data management, contextual metadata, file formats, and preservation technologies. The availability of data management support and preservation technologies, the ability to create and manage contextual metadata, and the choices of file formats all significantly affect the preservability of research data.

Research Data Preservation in Canada: A White Paper

The Portage Network has released Research Data Preservation in Canada: A White Paper.

Here’s an excerpt from the announcement:

The Preservation Expert Group (PEG) was created to advise Portage on developing research data management (RDM) infrastructure and best practices for preserving research data and metadata in Canada. The members of PEG have written this White Paper as a foundation document to describe the current digital preservation landscape, highlighting some of the digital preservation work already being undertaken in Canada, and to identify challenges that need to be addressed by Portage and other stakeholders to develop and improve RDM capacity and infrastructure across the country.

"Frictionless Data: Making Research Data Quality Visible"

Dan Fowler, Jo Barratt, and Paul Walsh have published "Frictionless Data: Making Research Data Quality Visible" in the International Journal of Digital Curation.

Here's an excerpt:

There is significant friction in the acquisition, sharing, and reuse of research data. It is estimated that eighty percent of data analysis is invested in the cleaning and mapping of data (Dasu and Johnson, 2003). This friction hampers researchers not well versed in data preparation techniques from reusing an ever-increasing amount of data available within research data repositories. Frictionless Data is an ongoing project at Open Knowledge International focused on removing this friction. We are doing this by developing a set of tools, specifications, and best practices for describing, publishing, and validating data. The heart of this project is the "Data Package", a containerization format for data based on existing practices for publishing open source software. This paper will report on current progress toward that goal.
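To make the idea of describing and validating data more concrete, here is a minimal sketch of a Data Package style descriptor written as a Python dictionary, plus a toy check that a CSV header matches the declared fields. The package name, file name, and field names are hypothetical, and `header_matches` is an illustrative helper built on the standard library, not part of the Frictionless Data tooling.

```python
import csv
import io
import json

# A minimal descriptor: one tabular resource with a declared schema.
# All names and paths here are made up for illustration.
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "name": "measurements",
            "path": "measurements.csv",
            "schema": {
                "fields": [
                    {"name": "site", "type": "string"},
                    {"name": "reading", "type": "number"},
                ]
            },
        }
    ],
}


def header_matches(csv_text, resource):
    """Return True if the CSV header row equals the declared field names."""
    expected = [field["name"] for field in resource["schema"]["fields"]]
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return header == expected


# Such a descriptor would normally be saved next to the data as JSON.
descriptor_json = json.dumps(descriptor, indent=2)
```

A real validator would also check each value against its declared type; this sketch stops at the header to stay short.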

"Library Carpentry: Software Skills Training for Library Professionals"

Jez Cope and James Baker have published "Library Carpentry: Software Skills Training for Library Professionals" in the International Journal of Digital Curation.

Here's an excerpt:

Much time and energy is now being devoted to developing the skills of researchers in the related areas of data analysis and data management. However, less attention is currently paid to developing the data skills of librarians themselves: these skills are often brought in by recruitment in niche areas rather than considered as a wider development need for the library workforce, and are not widely recognised as important to the professional career development of librarians. We believe that building computational and data science capacity within academic libraries will have direct benefits for both librarians and the users we serve.

Library Carpentry is a global effort to provide training to librarians in technical areas that have traditionally been seen as the preserve of researchers, IT support and systems librarians. Established non-profit volunteer organisations, such as Software Carpentry and Data Carpentry, offer introductory research software skills training with a focus on the needs and requirements of research scientists. Library Carpentry is a comparable introductory software skills training programme with a focus on the needs and requirements of library and information professionals. This paper describes how the material was developed and delivered, and reports on challenges faced, lessons learned and future plans.

"Implementing a Research Data Policy at Leiden University"

Fieke Schoots et al. have published "Implementing a Research Data Policy at Leiden University" in the International Journal of Digital Curation.

Here's an excerpt:

In this paper, we discuss the various stages of the institution-wide project that led to the adoption of the data management policy at Leiden University in 2016. We illustrate this process by highlighting how we have involved all stakeholders. Each organisational unit was represented in the project teams. Results were discussed in a sounding board with both academic and support staff. Senior researchers acted as pioneers and raised awareness and commitment among their peers. By way of example, we present pilot projects from two faculties. We then describe the comprehensive implementation programme that will create facilities and services that must allow implementing the policy as well as monitoring and evaluating it. Finally, we will present lessons learnt and steps ahead. The engagement of all stakeholders, as well as explicit commitment from the Executive Board, has been an important key factor for the success of the project and will continue to be an important condition for the steps ahead.

"How Valid is your Validation? A Closer Look Behind the Curtain of JHOVE"

Michelle Lindlar and Yvonne Tunnat have published "How Valid is your Validation? A Closer Look Behind the Curtain of JHOVE" in the International Journal of Digital Curation.

Here's an excerpt:

Validation is a key task of any preservation workflow and often JHOVE is the first tool of choice for characterizing and validating common file formats. Due to the tool’s maturity and high adoption, decisions if a file is indeed fit for long-term availability are often made based on JHOVE output. But can we trust a tool simply based on its wide adoption and maturity by age? How does JHOVE determine the validity and well-formedness of a file? Does a module really support all versions of a file format family? How much of the file formats’ standards do we need to know and understand in order to interpret the output correctly? Are there options to verify JHOVE-based decisions within preservation workflows? While the software has been a long-standing favourite within the digital curation domain for many years, a recent look at JHOVE as a vital decision supporting tool is currently missing. This paper presents a practice report which aims to close this gap.

"Support Your Data: A Research Data Management Guide for Researchers"

John A Borghi et al. have published "Support Your Data: A Research Data Management Guide for Researchers" in Research Ideas and Outcomes.

Here's an excerpt:

Researchers are faced with rapidly evolving expectations about how they should manage and share their data, code, and other research materials. To help them meet these expectations and generally manage and share their data more effectively, we are developing a suite of tools which we are currently referring to as "Support Your Data". These tools, which include a rubric designed to enable researchers to self-assess their current data management practices and a series of short guides which provide actionable information about how to advance practices as necessary or desired, are intended to be easily customizable to meet the needs of researchers working in a variety of institutional and disciplinary contexts.

"Data Sharing in PLOS ONE: An Analysis of Data Availability Statements"

Lisa M. Federer et al. have published "Data Sharing in PLOS ONE: An Analysis of Data Availability Statements" in PLOS ONE.

Here's an excerpt:

A number of publishers and funders, including PLOS, have recently adopted policies requiring researchers to share the data underlying their results and publications. Such policies help increase the reproducibility of the published literature, as well as make a larger body of data available for reuse and re-analysis. In this study, we evaluate the extent to which authors have complied with this policy by analyzing Data Availability Statements from 47,593 papers published in PLOS ONE between March 2014 (when the policy went into effect) and May 2016. Our analysis shows that compliance with the policy has increased, with a significant decline over time in papers that did not include a Data Availability Statement. However, only about 20% of statements indicate that data are deposited in a repository, which the PLOS policy states is the preferred method. More commonly, authors state that their data are in the paper itself or in the supplemental information, though it is unclear whether these data meet the level of sharing required in the PLOS policy. These findings suggest that additional review of Data Availability Statements or more stringent policies may be needed to increase data sharing.

"Operationalizing the Replication Standard: A Case Study of the Data Curation and Verification Workflow for Scholarly Journals"

Thu-Mai Christian et al. have self-archived "Operationalizing the Replication Standard: A Case Study of the Data Curation and Verification Workflow for Scholarly Journals."

Here's an excerpt:

In response to widespread concerns about the integrity of research published in scholarly journals, several initiatives have emerged that are promoting research transparency through access to data underlying published scientific findings. Journal editors, in particular, have made a commitment to research transparency by issuing data policies that require authors to submit their data, code, and documentation to data repositories to allow for public access to the data. In the case of the American Journal of Political Science (AJPS) Data Replication Policy, the data also must undergo an independent verification process in which materials are reviewed for quality as a condition of final manuscript publication and acceptance. Aware of the specialized expertise of the data archives, AJPS called upon the Odum Institute Data Archive to provide a data review service that performs data curation and verification of replication datasets. This article presents a case study of the collaboration between AJPS and the Odum Institute Data Archive to develop a workflow that bridges manuscript publication and data review processes. The case study describes the challenges and the successes of the workflow integration, and offers lessons learned that may be applied by other data archives that are considering expanding their services to include data curation and verification services to support reproducible research.

"Curating Humanities Research Data: Managing Workflows for Adjusting a Repository Framework"

Hagen Peukert has published "Curating Humanities Research Data: Managing Workflows for Adjusting a Repository Framework" in the International Journal of Digital Curation.

Here's an excerpt:

Handling heterogeneous data, subject to minimal costs, can be perceived as a classic management problem. The approach at hand applies established managerial theorizing to the field of data curation. It is argued, however, that data curation cannot merely be treated as a standard case of applying management theory in a traditional sense. Rather, the practice of curating humanities research data, the specifications and adjustments of the model suggested here reveal an intertwined process, in which knowledge of both strategic management and solid information technology have to be considered. Thus, suggestions on the strategic positioning of research data, which can be used as an analytical tool to understand the proposed workflow mechanisms, and the definition of workflow modules, which can be flexibly used in designing new standard workflows to configure research data repositories, are put forward.

"How Important is Data Curation? Gaps and Opportunities for Academic Libraries"

Lisa R Johnston et al. have published "How Important is Data Curation? Gaps and Opportunities for Academic Libraries" in the Journal of Librarianship and Scholarly Communication.

Here's an excerpt:

INTRODUCTION Data curation may be an emerging service for academic libraries, but researchers actively "curate" their data in a number of ways—even if terminology may not always align. Building on past user-needs assessments performed via survey and focus groups, the authors sought direct input from researchers on the importance and utilization of specific data curation activities. METHODS Between October 21, 2016, and November 18, 2016, the study team held focus groups with 91 participants at six different academic institutions to determine which data curation activities were most important to researchers, which activities were currently underway for their data, and how satisfied they were with the results. RESULTS Researchers are actively engaged in a variety of data curation activities, and while they considered most data curation activities to be highly important, a majority of the sample reported dissatisfaction with the current state of data curation at their institution. DISCUSSION Our findings demonstrate specific gaps and opportunities for academic libraries to focus their data curation services to more effectively meet researcher needs. CONCLUSION Research libraries stand to benefit their users by emphasizing, investing in, and/or heavily promoting the highly valued services that may not currently be in use by many researchers.

2017 Fixity Survey Report: An NDSA Report

The National Digital Stewardship Alliance has released the 2017 Fixity Survey Report: An NDSA Report.

Here's an excerpt:

Fixity checking, or the practice of algorithmically reviewing digital content to ensure that it has not changed over time, is a complex but essential aspect in digital preservation management. To date, there have been no broadly established best practices surrounding fixity checking, perhaps largely due to the wide variety of digital preservation systems and solutions employed by cultural heritage organizations. In an attempt to understand the common practices that exist for fixity checking, as well as the challenges institutions face when implementing a fixity check routine, the National Digital Stewardship Alliance (NDSA) Fixity Working Group developed and published a survey on fixity practices in fall of 2017. A total of 164 survey responses were recorded, of which 89 completed surveys were used in results analysis.
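As a concrete illustration of what a fixity check can look like, here is a minimal sketch that recomputes SHA-256 checksums and compares them with previously recorded values. The manifest layout (a dictionary mapping file paths to expected digests) and the function names are assumptions made for this example; the report itself does not prescribe an algorithm or a format.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Compare current checksums against a manifest of recorded ones.

    manifest: dict mapping file path -> expected SHA-256 hex digest
              (a hypothetical layout chosen for this sketch).
    Returns a dict mapping each path to "ok", "changed", or "missing".
    """
    report = {}
    for path, expected in manifest.items():
        if not Path(path).is_file():
            report[path] = "missing"
        elif sha256_of(path) == expected:
            report[path] = "ok"
        else:
            report[path] = "changed"
    return report
```

In practice the manifest would be recorded at ingest and stored separately from the content, and the check would be rerun on a schedule.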

"ARCHANGEL: Trusted Archives of Digital Public Documents"

John Collomosse et al. have self-archived "ARCHANGEL: Trusted Archives of Digital Public Documents."

Here's an excerpt:

We present ARCHANGEL; a de-centralised platform for ensuring the long-term integrity of digital documents stored within public archives. Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation—trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents. We describe the ARCHANGEL architecture, and report on a prototype of that architecture build over the Ethereum infrastructure. We report early evaluation and feedback of ARCHANGEL from stakeholders in the research data archives space.

"Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study"

Simone Ivan Conte et al. have published "Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study" in the International Journal of Digital Curation.

Here's an excerpt:

Research funders have introduced requirements that expect researchers to properly manage and publicly share their research data, and expect institutions to put in place services to support researchers in meeting these requirements. So far the general focus of these services and systems has been on addressing the final stages of the research data lifecycle (archive, share and re-use), rather than stages related to the active phase of the cycle (collect/create and analyse). As a result, full integration of active data management systems with data repositories is not yet the norm, making the streamlined transition of data from an active to a published and archived status an important challenge. In this paper we present the integration between an active data management system developed in-house (NOMAD) and Elsevier's Pure data repository used at our institution, with the aim of offering a simple workflow to facilitate and promote the data deposit process. The integration results in a new data management and publication workflow that helps researchers to save time, minimize human errors related to manually handling files, and further promote data deposit together with collaboration across the institution.

"Are the FAIR Data Principles Fair?"

Alastair Dunning, Madeleine de Smaele, and Jasmin Böhmer have published "Are the FAIR Data Principles Fair?" in the International Journal of Digital Curation.

Here's an excerpt:

This practice paper describes an ongoing research project to test the effectiveness and relevance of the FAIR Data Principles. Simultaneously, it will analyse how easy it is for data archives to adhere to the principles. The research took place from November 2016 to January 2017, and will be underpinned with feedback from the repositories.

The FAIR Data Principles feature 15 facets corresponding to the four letters of FAIR—Findable, Accessible, Interoperable, Reusable. These principles have already gained traction within the research world. The European Commission has recently expanded its demand for research to produce open data. The relevant guidelines are explicitly written in the context of the FAIR Data Principles. Given an increasing number of researchers will have exposure to the guidelines, understanding their viability and suggesting where there may be room for modification and adjustment is of vital importance.

This practice paper is connected to a dataset (Dunning et al., 2017) containing the original overview of the sample group statistics and graphs, in an Excel spreadsheet. Over the course of two months, the web-interfaces, help-pages and metadata-records of over 40 data repositories have been examined, to score the individual data repository against the FAIR principles and facets. The traffic-light rating system enables colour-coding according to compliance and vagueness. The statistical analysis provides overall, categorised, on the principles focussing, and on the facet focussing results.

The analysis includes the statistical and descriptive evaluation, followed by elaborations on Elements of the FAIR Data Principles, the subject specific or repository specific differences, and subsequently what repositories can do to improve their information architecture.
