"Rethinking Transparency and Rigor from a Qualitative Open Science Perspective"


To further complicate matters, many qualitative researchers would posit that while secondary data are a combination of the researcher’s perceptions and observations, even primary data, such as interview transcripts, are filtered to some extent through the researcher. This is because, in qualitative research, the researcher is an instrument of both data collection and analysis . . . .

The researcher-as-instrument tradition also complicates discussions around reproducibility (i.e., the ability for another researcher to look at someone’s data and reproduce the analyses), one of the key components of rigor as it is currently discussed in the open science movement (NIH, n.d.). Quantitative researchers’ focus on reproducibility is often contrary to the tenets of qualitative research, particularly in methodologies aiming to uncover new ways of knowing, such as constructivist and grounded theory approaches. If one understands the researcher as a data collection instrument and a filter through which data is processed, strict quantitative-focused reproducibility becomes less likely—not through misconduct or error, but because ultimately, people conduct research, and people are not likely to have exactly the same perspectives. Guidelines that reinforce reproducibility without addressing this tension are not going to be useful for all researchers.

https://bit.ly/3MEbtnk

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"A Pilot Study to Locate Historic Scientific Data in a University Archive"


Historic data in analog (or print) format is a valuable resource that is utilized by scientists in many fields. This type of data may be found in various locations on university campuses including offices, labs, storage facilities, and archives. This study investigates whether biological data held in one institutional university archives could be identified, described, and thus made potentially useful for contemporary life scientists. Scientific data was located and approximately half of it was deemed to be of some value to current researchers and about 20% included enough information for the study to be repeated. Locating individual data sets in the collections at the University Archives at the University of Minnesota proved challenging. This preliminary work points to possible ways to move forward to make raw data in university archives collections more discoverable and likely to be reused. It raises questions that can help inform future work in this area.

https://bit.ly/41JBMNb


"Initial Insight Into Three Modes of Data Sharing: Prevalence of Primary Reuse, Data Integration and Dataset Release in Research Articles"


While data sharing has received research interest in recent times, its real status remains unclear, owing to its ambiguous concept. To understand the current status of data sharing, this study examined primary reuse, data integration, and dataset release as the actual practices of data sharing. A total of 963 articles, chosen from those published in 2018 and registered in the Web of Science global citation database, were manually checked. Existing data were reused in the mode of data integration (13.3%) as frequently as they were for the mode of primary reuse (12.1%). Dataset release was the least common mode (9.0%). The results show the variation in data sharing and indicate the need for standardization of data description in articles based on thorough registration and expansion in public data archives to close the loop that results in the virtuous cycle of research data.

https://doi.org/10.1002/leap.1546


"‘We Share All Data with Each Other’: Data-Sharing in Peer-to-Peer Relationships"


The analysis identifies three social forms of data-sharing in peer-to-peer relationships: (a) closed communal sharing, which is based on a feeling of belonging together; (b) closed associative sharing, in which the participants act on the basis of an agreement; and (c) open associative sharing, which is oriented to “institutional imperatives” (Merton) and to formal regulations. The study shows that far more data-sharing is occurring in scientific practice than seems to be apparent from a concept of open data alone.

https://doi.org/10.1007/s11024-023-09487-y


FADGI: Technical Guidelines for Digitizing Cultural Heritage Materials, Third Edition


The Technical Guidelines for Digitizing Cultural Heritage Materials: Third Edition (linked below) were developed by the Still Image Working Group in 2022-2023. This document is an update of the 2016 Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files. The latest revision of the guidelines expands on earlier works and incorporates new material reflecting the advances in imaging science and cultural heritage imaging best practice. The Guidelines include shared best practices for still image materials (e.g., textual content, maps, and photographic prints and negatives) followed by agencies participating in the Federal Agencies Digital Guidelines Initiative (FADGI). These guidelines are intended to be used in conjunction with digital image conformance evaluation targets and software. Together, these guidelines and appropriate testing and monitoring systems provide the foundation for a FADGI-conforming digitization program.

https://bit.ly/3Bn6hOm


Open Science: A Practical Guide for Early-Career Researchers


Beginning researchers are an important link in the transition to Open Science, so this guide is aimed at PhD candidates, Research Master Students, and early-career researchers from all disciplines at Dutch universities and research institutes. [This guide will be very useful to non-Dutch researchers.] It is designed to accompany researchers in every step of their research, from the phase of preparing your research project and discovering relevant resources (chapter 2) to the phase of data collection and analysis (chapter 3), writing and publishing articles, data, and other research output (chapter 4), and outreach and assessment (chapter 5). Every chapter provides you with the best tools and practices to implement immediately.

https://doi.org/10.5281/zenodo.7716152


Digital Scholarship Has Released Digital Curation Certificate and Master’s Degree Programs

Digital Scholarship has released Digital Curation Certificate and Master’s Degree Programs. This document describes digital curation certificate and master’s degree programs in North America, identifying those that are online. It does not cover individualized certificate programs, such as those at Indiana University Bloomington or the University of Illinois Urbana-Champaign. Nor does it cover digital curation specializations within MLS and other master’s degree programs in iSchools. It is available as a website and a website PDF with live links.


"How and Why Do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles"


Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers’ motivations for referencing data. Like references to academic literature, data references perform various rhetorical functions, such as paying homage, signaling disagreement, or drawing comparisons. This paper studies how and why researchers reference social science data in their academic writing. We develop a typology to model relationships between the entities that anchor data references, along with their features (access, actions, locations, styles, types) and functions (critique, describe, illustrate, interact, legitimize). We illustrate the use of the typology by coding multidisciplinary research articles (n = 30) referencing social science data archived at the Inter-university Consortium for Political and Social Research (ICPSR). We show how our typology captures researchers’ interactions with data and purposes for referencing data. Our typology provides a systematic way to document and analyze researchers’ narratives about data use, extending our ability to give credit to data that support research.

https://doi.org/10.5334/dsj-2023-010


Good, Better, Best: Practices in Archiving & Preserving Open Access Monographs


Good, Better, Best: Practices in Archiving & Preserving Open Access Monographs brings together the project’s growing knowledge and understanding around this community of practice, as well as reports on the Work Package’s research and development over the course of the project.

Following an introductory chapter giving a brief background landscape summary alongside employed methodologies, Chapter 2, "A basic guidebook for the small and scholar-led press," considers good, better, and best practices around file formats, metadata, content packaging, existing routes to digital publication archives, archiving and preservation workflows, and challenges surrounding copyright, reuse, and licensing. Additional chapters detail the repository workflow experimentations, both manual and automated, as well as successful proof-of-concept archiving in two online repositories: one, an institutional repository, and the other, the Internet Archive. Along with a chapter (Chapter 6) that explores the current understanding around implications for archiving and preserving complex and experimental monographs, two further chapters (7 and 8) look at future work: the expansion and development of the Thoth Archiving Network and the new Open Book Futures project, beginning May 2023. Appendices include signposting to toolkits, guides, and resources, as well as a brief glossary that provides links to more comprehensive archiving and preservation glossaries already in existence. We hope this will be a useful resource for the small and scholar-led press community and beyond.

https://doi.org/10.5281/zenodo.7876047


"Data Sharing in the Context of Community-Engaged Research Partnerships"


Over the past 20 years, the National Institutes of Health (NIH) has implemented several policies designed to improve sharing of research data, such as the NIH public access policy for publications, NIH genomic data sharing policy, and National Cancer Institute (NCI) Cancer Moonshot public access and data sharing policy. . . . Important questions that we must consider as data sharing is expanded are to whom do benefits of data sharing accrue and to whom do benefits not accrue? In an era of growing efforts to engage diverse communities in research, we must consider the impact of data sharing for all research participants and the communities that they represent.

We examine the issue of data sharing through a community-engaged research lens, informed by a long-standing partnership between community-engaged researchers and a key community health organization (Kruse et al., 2022). We contend that without effective community engagement and rich contextual knowledge, biases resulting from data sharing can remain unchecked. We provide several recommendations that would allow better community engagement related to data sharing to ensure both community and researcher understanding of the issues involved and move toward shared benefits. By identifying good models for evaluating the impact of data sharing on communities that contribute data, and then using those models systematically, we will advance the consideration of the community perspective and increase the likelihood of benefits for all.

https://doi.org/10.1016/j.socscimed.2023.115895


"Estimating Social Bias in Data Sharing Behaviours: An Open Science Experiment"


Open data sharing is critical for scientific progress. Yet, many authors refrain from sharing scientific data, even when they have promised to do so. Through a preregistered, randomized audit experiment (N = 1,634), we tested possible ethnic, gender and status-related bias in scientists’ data-sharing willingness. 814 (54%) authors of papers where data were indicated to be ‘available upon request’ responded to our data requests, and 226 (14%) either shared or indicated willingness to share all or some data. While our preregistered hypotheses regarding bias in data-sharing willingness were not confirmed, we observed systematically lower response rates for data requests made by putatively Chinese treatments compared to putatively Anglo-Saxon treatments. Further analysis indicated a theoretically plausible heterogeneity in the causal effect of ethnicity on data-sharing. In interaction analyses, we found indications of lower responsiveness and data-sharing willingness towards male but not female data requestors with Chinese names. These disparities, which likely arise from stereotypic beliefs about male Chinese requestors’ trustworthiness and deservingness, impede scientific progress by preventing the free circulation of knowledge.

https://doi.org/10.1038/s41597-023-02129-8


Paywall: "We Need a Plan D"


Researchers, institutions and funders should collaborate to develop an overarching strategy for data preservation — a plan D. There will doubtless be calls for a ‘PubMed Central for data’. But what we really need is a federated system of repositories with functionality tailored to the information that they archive. This will require domain experts to agree standards for different types of data from different fields: what should be archived and when, which format, where, and for how long.

https://doi.org/10.1038/s41592-023-01817-y


"Continuity and Discontinuity in Web Archives"


Web archival materials are not direct traces of the web; they are direct traces of crawlers. By design, the structure of web archives limits our collective capacity to explore the memory of the Web. These structural issues induce temporal discontinuities in the archives such as inconsistency, redundancy and blindness. In this paper, we address the question of re-injecting continuity within large corpora of web archives. We thus introduce the notions of persistences (series of time-stable snapshots of archived web pages) and continuity spaces (networks of time-consistent persistences). We demonstrate how, on the basis of a quality score, persistences can be used to select subsets of web archives within which in-depth historical analysis can be conducted at scale. We next propose to make use of a new visualization approach called the web cernes to graphically reconstruct the multi-level evolution of an archived web site. We finally apply our framework to study the archives of the firsttuesday movement: a constellation of networking web sites that acted in the interest of the economic growth of the web in the early 2000s.

https://hal.science/hal-04057507


"Science Journals Integrate Dryad to Simplify Data Deposition and Strengthen Scientific Reproducibility"


The Science family journals have announced a partnership with the nonprofit data repository Dryad that simplifies the process by which authors deposit data underlying new work — a critical step to facilitating data’s routine reuse. The partnership is yet another step taken by the Science journals to ensure data the scientific community requires to verify, replicate and reanalyze new research is openly available. . . .

Because the partnership with Dryad integrates Dryad’s platform with the Science family journals’ submission process, authors will have the option to deposit data at Dryad directly from the submission site of any Science family journal. As authors submit research to the journals, they will be prompted about data availability and welcome to deposit their data to any suitable disciplinary repository. But, if data do not yet have a home, authors will have the opportunity to upload their data to Dryad. . . .

To ensure that this service is widely available, the Science journals will cover costs of Dryad data publication for accepted papers.

http://bit.ly/43wtVoD


"Guest Post — Why Interoperability Matters for Open Research — And More Than Ever"


The question remains, why have we not achieved more in delivering connectivity across the research system? While funding for this kind of underpinning infrastructure is notable in its absence (or where it is available it is often too temporary in nature), the other major challenge is in securing adoption among the service providers (funders, publishers, and institutions among the key players) that would maximize the use and potential of building those connections. It is notoriously hard for organisations to tweak or adapt existing workflows and legacy systems and to demonstrate the benefits (and hence prioritise the work) at an individual organisation level that may seem obvious at a system level.

https://cutt.ly/K7hxFQz


"Know(ing) Infrastructure: The Wayback Machine as Object and Instrument of Digital Research"


From documenting human rights abuses to studying online advertising, web archives are increasingly positioned as critical resources for a broad range of scholarly Internet research agendas. In this article, we reflect on the motivations and methodological challenges of investigating the world’s largest web archive, the Internet Archive’s Wayback Machine (IAWM). Using a mixed methods approach, we report on a pilot project centred around documenting the inner workings of ‘Save Page Now’ (SPN) — an Internet Archive tool that allows users to initiate the creation and storage of ‘snapshots’ of web resources. By improving our understanding of SPN and its role in shaping the IAWM, this work examines how the public tool is being used to ‘save the Web’ and highlights the challenges of operationalising a study of the dynamic sociotechnical processes supporting this knowledge infrastructure. Inspired by existing Science and Technology Studies (STS) approaches, the paper charts our development of methodological interventions to support an interdisciplinary investigation of SPN, including: ethnographic methods, ‘experimental blackbox tactics’, data tracing, modelling and documentary research. We discuss the opportunities and limitations of our methodology when interfacing with issues associated with temporality, scale and visibility, as well as critically engage with our own positionality in the research process (in terms of expertise and access). We conclude with reflections on the implications of digital STS approaches for ‘knowing infrastructure’, where the use of these infrastructures is unavoidably intertwined with our ability to study the situated and material arrangements of their creation.

https://doi.org/10.1177/13548565231164759


"Interoperable Infrastructure for Software and Data Publishing"


Achieving scalable, high-quality, interoperable data and software publishing is possible. There are already builders, some represented by the authorship of this article, that are on the right path, building tools that effectively meet the needs of researchers in an open and pluggable way. One example is InvenioRDM, a flexible and turn-key next-generation research data management repository built by CERN and more than 25 multi-disciplinary partners world-wide; InvenioRDM leverages community standards and supports FAIR practices out of the box. Another example of agnostic, pluggable tooling, in this case for software submission, is the set of submission workflow tools currently developed in the HERMES project. These allow researchers to automate the publication of software artifacts together with rich metadata, to create software publications following the FAIR Principles for Research Software.

http://bit.ly/42Lc5Oe


"Springer Nature Makes Data Sharing Easier with Single Data Policy across All Journals and Books"


Springer Nature has taken a further step forwards in its commitment to open science by requiring mandatory data availability statements (DAS) across its journals portfolio, and introducing its first unified data policy across the books portfolio.

Despite researchers’ support for open data sharing, less than 40% of authors actively make their data available. Researchers tell us this can be down to practical challenges, including a lack of clarity about what is required. Increasingly, governments, funders and research institutes are adopting data sharing requirements in their policies. Encouraging data sharing across all publishing formats recognises this growing need for clearer, more accessible, actionable and measurable data policies. As a longstanding supporter of Open Research, Springer Nature is introducing DAS as standard for its journal portfolio to promote greater transparency and reproducibility. Adopting a unified policy for books for the first time is a further exciting step towards encouraging open research practices across all publications and driving forward open science for all.

http://bit.ly/3FNihv9


Study on the Readiness of Research Data and Literature Repositories to Facilitate Compliance With the Open Science Horizon Europe MGA Requirement

In this study we analysed 220 repositories and, via a structured methodology, we identified 165 trusted repositories and tested their readiness to facilitate the compliance with the HE MGA Open Science requirements.

We show that it is not straightforward to assess whether a given repository is suitable to facilitate compliance with the HE MGA requirements. This is mainly due to varying interpretations of definitions and requirements, whether information on repository specifications is publicly available, and the high level of technical expertise needed to assess all requirements.

We highlight that repository registries, such as FAIRsharing, re3data or the CoreTrustSeal (CTS) website, are not sufficient on their own to assess the readiness of repositories to facilitate compliance with the HE MGA requirements, as the definition of what constitutes a trusted repository is subtle and varied and needs to be carefully interpreted and applied to repositories. This is also the case for related concepts such as community endorsement or for policy requirements in terms of preservation, curation and security of the repository contents.

https://doi.org/10.5281/zenodo.7728016


Paywall: "Characterizing Data Practices in Research Papers Across Four Disciplines"


In this paper, we focus on the five most common types of RDP — collecting data, processing data, analyzing data, representing data, and publishing or citing data. First, we compared the distributions of the five types of RDP across disciplines and observed noticeable differences between disciplines. In addition, we examined the characteristics of each type of RDP under different disciplinary contexts, by developing discipline-specific RDP vocabulary employing the tf-idf approach. Based on the common terms as well as the discipline-specific ones, we found that the five types of RDP can be distinctly conceptualized, while each type of RDP varies by discipline in terms of its action, object, and instrument.

https://doi.org/10.1007/978-3-031-28035-1_26


Paywall: "Trustworthy Digital Repository Certification: A Longitudinal Study"


To understand the impact of certification on repositories’ infrastructure, processes, and services, we analyzed a sample of publicly available TDR audit reports (n = 175) from the Data Seal of Approval (DSA) and Core Trust Seal (CTS) certification programs. This first longitudinal study of TDR certification over a ten-year period (from 2010 to 2020) found that many repositories either maintain a relatively high standard of trustworthiness in terms of their compliance with guidelines in DSA or CTS standards or improve their trustworthiness by raising their compliance levels with these guidelines each time they get recertified.

https://doi.org/10.1007/978-3-031-28032-0_42
