"Building Open Access to Research (OAR) Data Infrastructure at NIST"

Gretchen Greene, Raymond Plante, and Robert Hanisch have published "Building Open Access to Research (OAR) Data Infrastructure at NIST" in Data Science Journal.

Here's an excerpt:

As a National Metrology Institute (NMI), the USA National Institute of Standards and Technology (NIST) scientists, engineers and technology experts conduct research across a full spectrum of physical science domains. NIST is a non-regulatory agency within the U.S. Department of Commerce with a mission to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life. NIST research results in the production and distribution of standard reference materials, calibration services, and datasets. These are generated from a wide range of complex laboratory instrumentation, expert analyses, and calibration processes. In response to a government open data policy, and in collaboration with the broader research community, NIST has developed a federated Open Access to Research (OAR) scientific data infrastructure aligned with FAIR (Findable, Accessible, Interoperable, Reusable) data principles. Through the OAR initiatives, NIST's Material Measurement Laboratory Office of Data and Informatics (ODI) recently released a new scientific data discovery portal and public data repository. These science-oriented applications provide dissemination and public access for data from across the broad spectrum of NIST research disciplines, including chemistry, biology, materials science (such as crystallography, nanomaterials, etc.), physics, disaster resilience, cyberinfrastructure, communications, forensics, and others. NIST’s public data consist of carefully curated Standard Reference Data, legacy high valued data, and new research data publications. The repository is thus evolving both in content and features as the nature of research progresses. Implementation of the OAR infrastructure is key to NIST’s role in sharing high integrity reproducible research for measurement science in a rapidly changing world.

Research Data Curation Bibliography, Version 10 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Developing a Research Data Policy Framework for All Journals and Publishers"

Iain Hrynaszkiewicz et al. have self-archived "Developing a Research Data Policy Framework for All Journals and Publishers."

Here's an excerpt:

More journals and publishers—and funding agencies and institutions—are introducing research data policies. But as the prevalence of policies increases, there is potential to confuse researchers and support staff with numerous or conflicting policy requirements. We define and describe 14 features of journal research data policies and arrange these into a set of six standard policy types or tiers, which can be adopted by journals and publishers to promote data sharing in a way that encourages good practice and is appropriate for their audience's perceived needs. Policy features include coverage of topics such as data citation, data repositories, data availability statements, data standards and formats, and peer review of research data. These policy features and types have been created by reviewing the policies of multiple scholarly publishers, which collectively publish more than 10,000 journals, and through discussions and consensus building with multiple stakeholders in research data policy via the Data Policy Standardisation and Implementation Interest Group of the Research Data Alliance. Implementation guidelines for the standard research data policies for journals and publishers are also provided, along with template policy texts which can be implemented by journals in their Information for Authors and publishing workflows. We conclude with a call for collaboration across the scholarly publishing and wider research community to drive further implementation and adoption of consistent research data policies.

"The Lives and After Lives of Data"

Christine L. Borgman has published "The Lives and After Lives of Data" in the Harvard Data Science Review.

Here's an excerpt:

The most elusive term in data science is 'data.' While often treated as objects to be computed upon, data is a theory-laden concept with a long history. Data exist within knowledge infrastructures that govern how they are created, managed, and interpreted. By comparing models of data life cycles, implicit assumptions about data become apparent. In linear models, data pass through stages from beginning to end of life, which suggest that data can be recreated as needed. Cyclical models, in which data flow in a virtuous circle of uses and reuses, are better suited for irreplaceable observational data that may retain value indefinitely. In astronomy, for example, observations from one generation of telescopes may become calibration and modeling data for the next generation, whether digital sky surveys or glass plates. The value and reusability of data can be enhanced through investments in knowledge infrastructures, especially digital curation and preservation. Determining what data to keep, why, how, and for how long, is the challenge of our day.

"The Citation Advantage of Linking Publications to Research Data"

Giovanni Colavizza et al. have self-archived "The Citation Advantage of Linking Publications to Research Data."

Here's an excerpt:

We consider 531,889 journal articles published by PLOS and BMC which are part of the PubMed Open Access collection, categorize their data availability statements according to their content and analyze the citation advantage of different statement categories via regression. We find that, following mandated publisher policies, data availability statements have become common by now, yet statements containing a link to a repository are still just a fraction of the total. We also find that articles with these statements, in particular, can have up to 25.36% higher citation impact on average: an encouraging result for all publishers and authors who make the effort of sharing their data.

Research Data Curation Bibliography, Version 10 PDF Released

Digital Scholarship has released a PDF of the Research Data Curation Bibliography, Version 10.

Created from the HTML file, this unpaginated PDF with basic formatting makes it easier to print the lengthy bibliography.

"The Landscape of Rights and Licensing Initiatives for Data Sharing"

Sam Grabus and Jane Greenberg have published "The Landscape of Rights and Licensing Initiatives for Data Sharing" in Data Science Journal.

Here's an excerpt:

Over the last twenty years, a wide variety of resources have been developed to address the rights and licensing problems inherent with contemporary data sharing practices. The landscape of developments in this area is increasingly confusing and difficult to navigate, due to the complexity of intellectual property and ethics issues associated with sharing sensitive data. This paper seeks to address this challenge, examining the landscape and presenting a Version 1.0 directory of resources. A multi-method study was pursued, with an environmental scan examining 20 resources, resulting in three high-level categories: standards, tools, and community initiatives; and a content analysis revealing the subcategories of rights, licensing, metadata & ontologies. A timeline confirms a shift in licensing standardization priorities from open data to more nuanced and technologically robust solutions, over time, to accommodate for more sensitive data types. This paper reports on the research undertaking, and comments on the potential for using license-specific metadata supplements and developing data-centric rights and licensing ontologies.

"A Model for Initiating Research Data Management Services at Academic Libraries"

Kevin B. Read et al. have published "A Model for Initiating Research Data Management Services at Academic Libraries" in the Journal of the Medical Library Association.

Here's an excerpt:

Background: Librarians developed a pilot program to provide training, resources, strategies, and support for medical libraries seeking to establish research data management (RDM) services. Participants were required to complete eight educational modules to provide the necessary background in RDM. Each participating institution was then required to use two of the following three elements: (1) a template and strategies for data interviews, (2) a teaching tool kit to teach an introductory RDM class, or (3) strategies for hosting a data class series.

Case Presentation: Six libraries participated in the pilot, with between two and eight librarians participating from each institution. Librarians from each institution completed the online training modules. Each institution conducted between six and fifteen data interviews, which helped build connections with researchers, and taught between one and five introductory RDM classes. All classes received very positive evaluations from attendees. Two libraries conducted a data series, with one bringing in instructors from outside the library.

Conclusion: The pilot program proved successful in helping participating librarians learn about and engage with their research communities, jump-start their teaching of RDM, and develop institutional partnerships around RDM services. The practical, hands-on approach of this pilot proved to be successful in helping libraries with different environments establish RDM services. The success of this pilot provides a proven path forward for libraries that are developing data services at their own institutions.

FAIRness of Repositories & Their Data: A Report from LIBER’s Research Data Management Working Group

LIBER has released FAIRness of Repositories & Their Data: A Report from LIBER's Research Data Management Working Group.

Here's an excerpt from the announcement:

The report, which can be downloaded from Zenodo, summarises the answers given by managers, librarians and technical staff with regards to:

  1. The FAIRness of repositories and their data;
  2. Misconceptions related to the principles’ definition and implementation;
  3. The complexity of the implementation and the importance of the FAIR principles for the repository community.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Establishing, Developing, and Sustaining a Community of Data Champions"

James L. Savage and Lauren Cadwallader have published "Establishing, Developing, and Sustaining a Community of Data Champions" in Data Science Journal.

Here's an excerpt:

Supporting good practice in Research Data Management (RDM) is challenging for higher education institutions, in part because of the diversity of research practices and data types across disciplines. While centralised research data support units now exist in many universities, these typically possess neither the discipline-specific expertise nor the resources to offer appropriate targeted training and support within every academic unit. One solution to this problem is to identify suitable individuals with discipline-specific expertise that are already embedded within each unit, and empower these individuals to advocate for good RDM and to deliver support locally. This article focuses on an ongoing example of this approach: the Data Champion Programme at the University of Cambridge, UK. We describe how the Data Champion programme was established; the programme's reach, impact, strengths and weaknesses after two years of operation; and our anticipated challenges and planned strategies for maintaining the programme over the medium- and long-term.

"Virtuous and Vicious Circles in the Data Life-Cycle"

Elizabeth Yakel et al. have published "Virtuous and Vicious Circles in the Data Life-Cycle" in Information Research.

Here's an excerpt:

We address the following research questions:

  • How do different aspects of data production positively and negatively impact other phases in the life-cycle?
  • How do data selection decisions during sharing positively and negatively impact other phases in the life-cycle?
  • How can the work of data curators intervene to reinforce positive actions or mitigate negative actions?

"A Link Is Not Enough—Reproducibility of Data"

Mateusz Pawlik et al. have published "A Link Is Not Enough—Reproducibility of Data" in Datenbank-Spektrum.

Here's an excerpt:

Although many works in the database community use open data in their experimental evaluation, repeating the empirical results of previous works remains a challenge. This holds true even if the source code or binaries of the tested algorithms are available. In this paper, we argue that providing access to the raw, original datasets is not enough. Real-world datasets are rarely processed without modification. Instead, the data is adapted to the needs of the experimental evaluation in the data preparation process. We showcase that the details of the data preparation process matter and subtle differences during data conversion can have a large impact on the outcome of runtime results. We introduce a data reproducibility model, identify three levels of data reproducibility, report about our own experience, and exemplify our best practices.

"The Administrative Load of Sharing Sensitive Data—Challenges and Solutions?"

Kirsty Merrett et al. have published "The Administrative Load of Sharing Sensitive Data—Challenges and Solutions?" in the International Journal of Digital Curation.

Here's an excerpt:

Sharing data openly has become a straightforward process at the University of Bristol. The University's top funders mandate or recommend data sharing as a condition of funding, and many publishers require access to research data to enable results of published articles to be verified. The University has provided a dedicated data repository to support this since 2015, and demand for open publication has risen steadily since its inception. However, an increasing number of requests for sharing data relate to data that has ethical, legal or commercial sensitivities and so cannot be published openly.

Rather than discuss the wide-ranging ethical implications of data sharing, this practice paper will focus on the secure sharing of sensitive data that has ethical approval and, where required, has the necessary consent in place, from the perspective of an institution that has already decided to undertake the work inherent in sharing sensitive data. The specific purpose is to detail the workflow and administrative tasks integral in this and to highlight the types of challenges encountered.

"From Passive to Active, from Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?"

Maria J. Cruz et al. have published "From Passive to Active, from Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?" in the International Journal of Digital Curation.

Here's an excerpt:

Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) has provided a fully operational, cross-institutional, long-term archive since 2010, storing data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form)—a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.

Certification for Trustworthy Digital Repositories: "CoreTrustSeal: From Academic Collaboration to Sustainable Services"

Hervé L'Hours et al. have published "CoreTrustSeal: From Academic Collaboration to Sustainable Services" in IASSIST Quarterly.

Here's an excerpt:

National and international digital repositories must design and deliver sustainable services as a foundation for a range of scientific and data management infrastructures while reducing costs and avoiding duplication of effort. The CoreTrustSeal, launched in 2017, defines requirements and offers core level certification for Trustworthy Digital Repositories (TDR) holding data for long-term preservation. This paper traces the journey of the CoreTrustSeal through the Data Seal of Approval (DSA), ICSU World Data System (WDS), Research Data Alliance (RDA) working groups, and community engagement, towards becoming a sustainable service supporting global data infrastructure. We outline the design and delivery of the service, current activities, the benefits of certification to a range of communities, and future plans and challenges. As well as providing a historical narrative and current and future perspectives the CoreTrustSeal experience offers lessons for those developing standards and best practices, or seeking to develop cooperative and community-driven efforts which bridge data curation across academic disciplines and the governmental and private sectors.
