"Operationalizing the Replication Standard: A Case Study of the Data Curation and Verification Workflow for Scholarly Journals"

Thu-Mai Christian et al. have self-archived "Operationalizing the Replication Standard: A Case Study of the Data Curation and Verification Workflow for Scholarly Journals."

Here's an excerpt:

In response to widespread concerns about the integrity of research published in scholarly journals, several initiatives have emerged that are promoting research transparency through access to data underlying published scientific findings. Journal editors, in particular, have made a commitment to research transparency by issuing data policies that require authors to submit their data, code, and documentation to data repositories to allow for public access to the data. In the case of the American Journal of Political Science (AJPS) Data Replication Policy, the data also must undergo an independent verification process in which materials are reviewed for quality as a condition of final manuscript publication and acceptance. Aware of the specialized expertise of the data archives, AJPS called upon the Odum Institute Data Archive to provide a data review service that performs data curation and verification of replication datasets. This article presents a case study of the collaboration between AJPS and the Odum Institute Data Archive to develop a workflow that bridges manuscript publication and data review processes. The case study describes the challenges and the successes of the workflow integration, and offers lessons learned that may be applied by other data archives that are considering expanding their services to include data curation and verification services to support reproducible research.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Curating Humanities Research Data: Managing Workflows for Adjusting a Repository Framework"

Hagen Peukert has published "Curating Humanities Research Data: Managing Workflows for Adjusting a Repository Framework" in the International Journal of Digital Curation.

Here's an excerpt:

Handling heterogeneous data, subject to minimal costs, can be perceived as a classic management problem. The approach at hand applies established managerial theorizing to the field of data curation. It is argued, however, that data curation cannot merely be treated as a standard case of applying management theory in a traditional sense. Rather, the practice of curating humanities research data, the specifications and adjustments of the model suggested here reveal an intertwined process, in which knowledge of both strategic management and solid information technology have to be considered. Thus, suggestions on the strategic positioning of research data, which can be used as an analytical tool to understand the proposed workflow mechanisms, and the definition of workflow modules, which can be flexibly used in designing new standard workflows to configure research data repositories, are put forward.

"How Important is Data Curation? Gaps and Opportunities for Academic Libraries"

Lisa R Johnston et al. have published "How Important is Data Curation? Gaps and Opportunities for Academic Libraries" in the Journal of Librarianship and Scholarly Communication.

Here's an excerpt:

INTRODUCTION Data curation may be an emerging service for academic libraries, but researchers actively "curate" their data in a number of ways—even if terminology may not always align. Building on past user-needs assessments performed via survey and focus groups, the authors sought direct input from researchers on the importance and utilization of specific data curation activities. METHODS Between October 21, 2016, and November 18, 2016, the study team held focus groups with 91 participants at six different academic institutions to determine which data curation activities were most important to researchers, which activities were currently underway for their data, and how satisfied they were with the results. RESULTS Researchers are actively engaged in a variety of data curation activities, and while they considered most data curation activities to be highly important, a majority of the sample reported dissatisfaction with the current state of data curation at their institution. DISCUSSION Our findings demonstrate specific gaps and opportunities for academic libraries to focus their data curation services to more effectively meet researcher needs. CONCLUSION Research libraries stand to benefit their users by emphasizing, investing in, and/or heavily promoting the highly valued services that may not currently be in use by many researchers.

2017 Fixity Survey Report: An NDSA Report

The National Digital Stewardship Alliance has released the 2017 Fixity Survey Report: An NDSA Report.

Here's an excerpt:

Fixity checking, or the practice of algorithmically reviewing digital content to ensure that it has not changed over time, is a complex but essential aspect in digital preservation management. To date, there have been no broadly established best practices surrounding fixity checking, perhaps largely due to the wide variety of digital preservation systems and solutions employed by cultural heritage organizations. In an attempt to understand the common practices that exist for fixity checking, as well as the challenges institutions face when implementing a fixity check routine, the National Digital Stewardship Alliance (NDSA) Fixity Working Group developed and published a survey on fixity practices in fall of 2017. A total of 164 survey responses were recorded, of which 89 completed surveys were used in results analysis.
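
For readers unfamiliar with the term, a fixity check can be illustrated with a minimal sketch. It is not drawn from the NDSA report, and it assumes SHA-256 as the checksum algorithm and a plain dictionary as the stored manifest; the function names (`sha256_of`, `build_manifest`, `check_fixity`) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root, manifest):
    """Compare the files under root against a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Present in both, but the checksum differs.
        "changed": sorted(
            n for n in manifest if n in current and current[n] != manifest[n]
        ),
        # Recorded in the manifest, but no longer on disk.
        "missing": sorted(n for n in manifest if n not in current),
        # On disk, but never recorded in the manifest.
        "unexpected": sorted(n for n in current if n not in manifest),
    }
```

In this sketch the manifest would be built once at ingest and stored separately from the content; running `check_fixity` on a schedule then reports any file whose bytes have changed, disappeared, or appeared since.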

"ARCHANGEL: Trusted Archives of Digital Public Documents"

John Collomosse et al. have self-archived "ARCHANGEL: Trusted Archives of Digital Public Documents."

Here's an excerpt:

We present ARCHANGEL; a de-centralised platform for ensuring the long-term integrity of digital documents stored within public archives. Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation—trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents. We describe the ARCHANGEL architecture, and report on a prototype of that architecture built over the Ethereum infrastructure. We report early evaluation and feedback of ARCHANGEL from stakeholders in the research data archives space.

"Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study"

Simone Ivan Conte et al. have published "Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study" in the International Journal of Digital Curation.

Here's an excerpt:

Research funders have introduced requirements that expect researchers to properly manage and publicly share their research data, and expect institutions to put in place services to support researchers in meeting these requirements. So far the general focus of these services and systems has been on addressing the final stages of the research data lifecycle (archive, share and re-use), rather than stages related to the active phase of the cycle (collect/create and analyse). As a result, full integration of active data management systems with data repositories is not yet the norm, making the streamlined transition of data from an active to a published and archived status an important challenge. In this paper we present the integration between an active data management system developed in-house (NOMAD) and Elsevier's Pure data repository used at our institution, with the aim of offering a simple workflow to facilitate and promote the data deposit process. The integration results in a new data management and publication workflow that helps researchers to save time, minimize human errors related to manually handling files, and further promote data deposit together with collaboration across the institution.

"Are the FAIR Data Principles Fair?"

Alastair Dunning, Madeleine de Smaele, and Jasmin Böhmer have published "Are the FAIR Data Principles Fair?" in the International Journal of Digital Curation.

Here's an excerpt:

This practice paper describes an ongoing research project to test the effectiveness and relevance of the FAIR Data Principles. Simultaneously, it will analyse how easy it is for data archives to adhere to the principles. The research took place from November 2016 to January 2017, and will be underpinned with feedback from the repositories.

The FAIR Data Principles feature 15 facets corresponding to the four letters of FAIR—Findable, Accessible, Interoperable, Reusable. These principles have already gained traction within the research world. The European Commission has recently expanded its demand for research to produce open data. The relevant guidelines are explicitly written in the context of the FAIR Data Principles. Given an increasing number of researchers will have exposure to the guidelines, understanding their viability and suggesting where there may be room for modification and adjustment is of vital importance.

This practice paper is connected to a dataset (Dunning et al., 2017) containing the original overview of the sample group statistics and graphs, in an Excel spreadsheet. Over the course of two months, the web-interfaces, help-pages and metadata-records of over 40 data repositories have been examined, to score the individual data repository against the FAIR principles and facets. The traffic-light rating system enables colour-coding according to compliance and vagueness. The statistical analysis provides overall, categorised, on the principles focussing, and on the facet focussing results.

The analysis includes the statistical and descriptive evaluation, followed by elaborations on Elements of the FAIR Data Principles, the subject specific or repository specific differences, and subsequently what repositories can do to improve their information architecture.

Research Data Curation Bibliography, Version 8 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

Implementation Roadmap for the European Open Science Cloud

The European Commission has released Implementation Roadmap for the European Open Science Cloud.

Here's an excerpt from the announcement:

Overall, the document presents the results and available evidence from an extensive and conclusive consultation process that started with the publication of the Communication: European Cloud initiative (COM(2016)178) in April 2016.

The consultation upheld the intervention logic presented in the Communication, to create a fit for purpose pan-European federation of research data infrastructures, with a view to moving from the current fragmentation to a situation where data is easy to store, find, share and re-use.

On the basis of the consultation, the implementation Roadmap gives an overview of six action lines for the implementation of the EOSC:

a) architecture, b) data, c) services, d) access & interfaces, e) rules and f) governance.

"Data Librarianship: A Path and an Ethic. A Conversation between Thomas Padilla and Vicky Steeves"

Thomas Padilla and Vicky Steeves have published "Data Librarianship: A Path and an Ethic. A Conversation between Thomas Padilla and Vicky Steeves" in dh+lib.

I think a lot about the corporate capture of the scholarly record, and how my work in data management and reproducibility can either contribute to or disrupt that. With the rise of reproducibility as a buzzword, there are plenty of commercial entities ready to profit from so-called 'reproducibility platforms'. This represents yet another corporate capture of scholarship. I try to disrupt this by advocating for community-run, open source software for reproducibility, such as ReproZip (which I work on), o2r, and Binder. The same goes for data management platforms. We're seeing a lot of new data services springing up from major publishers and this is also something I am actively trying to combat.

"Data Policies of Highly-Ranked Social Science Journals"

Mercè Crosas et al. have self-archived "Data Policies of Highly-Ranked Social Science Journals."

Here's an excerpt:

By encouraging and requiring that authors share their data in order to publish articles, scholarly journals have become an important actor in the movement to improve the openness of data and the reproducibility of research. But how many social science journals encourage or mandate that authors share the data supporting their research findings? How does the share of journal data policies vary by discipline? What influences these journals’ decisions to adopt such policies and instructions? And what do those policies and instructions look like?

We discuss the results of our analysis of the instructions and policies of 291 highly-ranked journals publishing social science research, where we studied the contents of journal data policies and instructions across 14 variables, such as when and how authors are asked to share their data, and what role journal ranking and age play in the existence and quality of data policies and instructions. We also compare our results to the results of other studies that have analyzed the policies of social science journals, although differences in the journals chosen and how each study defines what constitutes a data policy limit this comparison. We conclude that a little more than half of the journals in our study have data policies. A greater share of the economics journals have data policies and mandate sharing, followed by political science/international relations and psychology journals.

Finally, we use our findings to make several recommendations: Policies should include the terms "data," "dataset" or more specific terms that make it clear what to make available; policies should include the benefits of data sharing; journals, publishers, and associations need to collaborate more to clarify data policies; and policies should explicitly ask for qualitative data.

"What Do Data Librarians Think of the MLIS? Professionals’ Perceptions of Knowledge Transfer, Trends, and Challenges"

Camille V.L. Thomas and Richard J. Urban have published "What Do Data Librarians Think of the MLIS? Professionals' Perceptions of Knowledge Transfer, Trends, and Challenges" in College & Research Libraries.

Here's an excerpt:

There are existing studies on data curation programs in library science education and studies on data services in libraries. However, there is not much insight into how educational programs have prepared data professionals for practice. This study asked 105 practicing professionals how well they thought their education prepared them for professional experience. It also asked supervisors about their perceptions of how well employees performed. After analyzing the results, the investigators of this study found that changing the educational model may lead to improvements in future library data services.

"Archiving Large-Scale Legacy Multimedia Research Data: A Case Study"

Claudia Yogeswaran and Kearsy Cormier have published "Archiving Large-Scale Legacy Multimedia Research Data: A Case Study" in the International Journal of Digital Curation.

Here's an excerpt:

In this paper we provide a case study of the creation of the DCAL Research Data Archive at University College London. In doing so, we assess the various challenges associated with archiving large-scale legacy multimedia research data, given the lack of literature on archiving such datasets. We address issues such as the anonymisation of video research data, the ethical challenges of managing legacy data and historic consent, ownership considerations, the handling of large-size multimedia data, as well as the complexity of multi-project data from a number of researchers and legacy data from eleven years of research.

Global Access to Research Software: The Forgotten Pillar of Open Science Implementation

The Global Young Academy has released Global Access to Research Software: The Forgotten Pillar of Open Science Implementation.

Here's an excerpt:

The Global Young Academy (GYA), in collaboration with the Oxford-based organisation INASP, carried out a pilot survey to assess the quantity and quality of access to proprietary and open source software among researchers from all disciplines. . . . Emphasis was placed on gathering data from researchers based in Bangladesh, Ghana and Nigeria, whose access to and use of research software had not yet been extensively documented.

"Data Availability, Reusability, and Analytic Reproducibility: Evaluating the Impact of a Mandatory Open Data Policy at the Journal Cognition"

Tom Hardwicke et al. have self-archived "Data Availability, Reusability, and Analytic Reproducibility: Evaluating the Impact of a Mandatory Open Data Policy at the Journal Cognition."

Here's an excerpt:

Access to research data is a critical feature of an efficient, progressive, and ultimately self-correcting scientific ecosystem. But the extent to which in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data ("analytic reproducibility"). To investigate, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), and data that were in-principle reusable (23/104, 22% pre-policy to 85/136, 62%, post-policy). However, for 35 articles with in-principle reusable data, the analytic reproducibility of target outcomes related to key findings was poor: 11 (31%) cases were reproducible without author assistance, 11 (31%) cases were reproducible only with author assistance, and 13 (37%) cases were not fully reproducible despite author assistance. Importantly, original conclusions did not appear to be seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification, and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.

"The State of Assessing Data Stewardship Maturity —An Overview"

Ge Peng has published "The State of Assessing Data Stewardship Maturity —An Overview" in Data Science Journal.

Here's an excerpt:

Data stewardship encompasses all activities that preserve and improve the information content, accessibility, and usability of data and metadata. Recent regulations, mandates, policies, and guidelines set forth by the U.S. government, federal and other funding agencies, scientific societies and scholarly publishers, have levied stewardship requirements on digital scientific data. This elevated level of requirements has increased the need for a formal approach to stewardship activities that supports compliance verification and reporting. Meeting or verifying compliance with stewardship requirements requires assessing the current state, identifying gaps, and, if necessary, defining a roadmap for improvement. This, however, touches on standards and best practices in multiple knowledge domains. Therefore, data stewardship practitioners, especially those at data repositories or data service centers or associated with data stewardship programs, can benefit from knowledge of existing maturity assessment models. This article provides an overview of the current state of assessing stewardship maturity for federally funded digital scientific data. A brief description of existing maturity assessment models and related application(s) is provided. This helps stewardship practitioners to readily obtain basic information about these models. It allows them to evaluate each model’s suitability for their unique verification and improvement needs.

Practical Challenges For Researchers in Data Sharing

Springer Nature has released Practical Challenges For Researchers in Data Sharing.

Here's an excerpt:

This survey aims to understand researcher activity around sharing data at a particular point in the research lifecycle—when they are preparing their work for publication. In this it builds on previously published studies that explore data sharing more generally during the research process. It explores attitudes briefly, but focuses on actions and challenges in sharing data. Responses from over 7,700 researchers enabled us to draw new insights across subject fields and, to a lesser extent, across geographies.

"Data We Trust—But—What Data?"

Jennifer Golbeck has published "Data We Trust—But—What Data?" in Reference & User Services Quarterly.

Here's an excerpt:

In the last year, we have not seen a massive removal of government data. We have seen targeted suppression and a general lack of concern for having government data sources reflect objective truth. Fortunately, many organizations are monitoring, archiving, and analyzing changes to official data.

"Identifying Potential Solutions to Increase Discoverability and Reuse of Analog Datasets in Various Campus Locations"

Shannon L. Farrell and Julia Ann Kelly have published "Identifying Potential Solutions to Increase Discoverability and Reuse of Analog Datasets in Various Campus Locations" in Issues in Science and Technology Librarianship.

Here's an excerpt:

Describing, preserving, and providing access to data is now the purview of many science librarians, although the emphasis has been on data in electronic format. Data in paper or analog format might be found in many places around our campuses. At the University of Minnesota we conducted a preliminary investigation of analog data through discussions with faculty, staff, and the University Archives. We identified data in numerous locations, including the University Archives, personal collections, departmental holdings, museums, and off-campus research stations. We discovered data in many formats and carried out a few initial projects including creating a detailed inventory of one research center's analog data and digitizing and depositing one individual's dissertation data in our institutional repository. We also examined University Archives and discovered substantial amounts of analog data along with problems such as incomplete description or context. Overall we have identified several challenges and directions that we could take to make analog data more findable and available for reuse, but there is no clear single path forward.

"The Data Engagement Opportunities Scaffold: Development and Implementation"

Abigail Goben and Megan R. Sapp Nelson have published "The Data Engagement Opportunities Scaffold: Development and Implementation" in the Journal of eScience Librarianship.

Here's an excerpt:

While interest in research data management (RDM) services have grown, clarifying the path between traditional library responsibilities and RDM remains a challenge. While the literature has provided ideas about services and student-/researcher-focused data information literacy (DIL) competencies, nothing has yet brought these skill sets together to provide a pathway for librarians engaging in RDM. The Data Engagement Opportunities scaffold was developed to provide a strategic trajectory relating information science skills, the DIL competencies, the stages of the data life cycle, three levels of RDM engagement activities, and potential measurable outcomes. This scaffold provides direction for librarians looking to identify their current abilities and explore new opportunities.

"Relaunch: Open Data Goldbook for Data Managers and Data Holders"

The European Data Portal has released "Relaunch: Open Data Goldbook for Data Managers and Data Holders."

Here's an excerpt:

How to build an Open Data strategy? How to implement an Open Data initiative? What is needed to put in place an Open Data lifecycle? How to ensure and monitor Open Data success? The European Data Portal has updated its Open Data Goldbook for Data Managers and Data Holders to answer all of these questions.

Go to the report.

"The Open Data Charter’s Measurement Guide Is Now Open for Consultation!"

Danny Lämmerhirt et al. have published "The Open Data Charter's Measurement Guide Is Now Open for Consultation!" in the Open Knowledge International Blog.

Here's an excerpt:

The Measurement and Accountability Working Group (MAWG) is launching the public consultation phase for the draft Open Data Charter Measurement Guide! . . . .

The Guide explains how the Open Data Charter principles can be measured. It provides a comprehensive overview of existing open data measurement tools and their indicators, which assess the state of open government data at a national level. Many of the indicators analysed are relevant for local and regional governments, too.

See also: Open Data Charter Measurement Guide.

"Defining the Role of Libraries in the Open Science Landscape: A Reflection on Current European Practice"

Paul Ayris and Tiberius Ignat have self-archived "Defining the Role of Libraries in the Open Science Landscape: A Reflection on Current European Practice."

Here's an excerpt:

This collaborative paper looks at how libraries can engage with and offer leadership in the Open Science movement. It is based on case studies and the results of an EU-funded research project on Research Data Management taken from European research-led universities and their libraries. It begins by analysing three recent trends in Science, and then links component parts of the research process to aspects of Open Science. The paper then looks in detail at four areas and identifies roles for libraries: Open Access and Open Access publishing, Research Data Management, E-Infrastructures (especially the European Open Science Cloud), and Citizen Science. The paper ends in suggesting a model for how libraries, by using a 4-step test, can assess their engagement with Open Science. This 4-step test is based on lessons drawn from the case studies.

"The Modern Research Data Portal: A Design Pattern for Networked, Data-Intensive Science"

Kyle Chard et al. have published "The Modern Research Data Portal: A Design Pattern for Networked, Data-Intensive Science" in PeerJ.

Here's an excerpt:

In this article, we first define the problems that research data portals address, introduce the legacy approach, and examine its limitations. We then introduce the MRDP design pattern and describe its realization via the integration of two elements: Science DMZs (Dart et al., 2013) (high-performance network enclaves that connect large-scale data servers directly to high-speed networks) and cloud-based data management and authentication services such as those provided by Globus (Chard, Tuecke & Foster, 2014). We then outline a reference implementation of the MRDP design pattern, also provided in its entirety on the companion web site, https://docs.globus.org/mrdp, that the reader can study—and, if they so desire, deploy and adapt to build their own high-performance research data portal. We also review various deployments to show how the MRDP approach has been applied in practice: examples like the National Center for Atmospheric Research's Research Data Archive, which provides for high-speed data delivery to thousands of geoscientists; the Sanger Imputation Service, which provides for online analysis of user-provided genomic data; the Globus data publication service, which provides for interactive data publication and discovery; and the DMagic data sharing system for data distribution from light sources. We conclude with a discussion of related technologies and summary.