"Data Sharing in the Context of Community-Engaged Research Partnerships"


Over the past 20 years, the National Institutes of Health (NIH) has implemented several policies designed to improve sharing of research data, such as the NIH public access policy for publications, NIH genomic data sharing policy, and National Cancer Institute (NCI) Cancer Moonshot public access and data sharing policy. . . . Important questions that we must consider as data sharing is expanded are to whom do benefits of data sharing accrue and to whom do benefits not accrue? In an era of growing efforts to engage diverse communities in research, we must consider the impact of data sharing for all research participants and the communities that they represent.

We examine the issue of data sharing through a community-engaged research lens, informed by a long-standing partnership between community-engaged researchers and a key community health organization (Kruse et al., 2022). We contend that without effective community engagement and rich contextual knowledge, biases resulting from data sharing can remain unchecked. We provide several recommendations that would allow better community engagement related to data sharing to ensure both community and researcher understanding of the issues involved and move toward shared benefits. By identifying good models for evaluating the impact of data sharing on communities that contribute data, and then using those models systematically, we will advance the consideration of the community perspective and increase the likelihood of benefits for all.

https://doi.org/10.1016/j.socscimed.2023.115895

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Estimating Social Bias in Data Sharing Behaviours: An Open Science Experiment"


Open data sharing is critical for scientific progress. Yet, many authors refrain from sharing scientific data, even when they have promised to do so. Through a preregistered, randomized audit experiment (N = 1,634), we tested possible ethnic, gender and status-related bias in scientists’ data-sharing willingness. 814 (54%) authors of papers where data were indicated to be ‘available upon request’ responded to our data requests, and 226 (14%) either shared or indicated willingness to share all or some data. While our preregistered hypotheses regarding bias in data-sharing willingness were not confirmed, we observed systematically lower response rates for data requests made by putatively Chinese treatments compared to putatively Anglo-Saxon treatments. Further analysis indicated a theoretically plausible heterogeneity in the causal effect of ethnicity on data-sharing. In interaction analyses, we found indications of lower responsiveness and data-sharing willingness towards male but not female data requestors with Chinese names. These disparities, which likely arise from stereotypic beliefs about male Chinese requestors’ trustworthiness and deservingness, impede scientific progress by preventing the free circulation of knowledge.

https://doi.org/10.1038/s41597-023-02129-8

Paywall: "We Need a Plan D"


Researchers, institutions and funders should collaborate to develop an overarching strategy for data preservation — a plan D. There will doubtless be calls for a ‘PubMed Central for data’. But what we really need is a federated system of repositories with functionality tailored to the information that they archive. This will require domain experts to agree standards for different types of data from different fields: what should be archived and when, which format, where, and for how long.

https://doi.org/10.1038/s41592-023-01817-y

"Continuity and Discontinuity in Web Archives"


Web archival materials are not direct traces of the web, they are direct traces of crawlers. By design, the structure of web archives limits our collective capacity to explore the memory of the Web. These structural issues induce temporal discontinuities in the archives such as inconsistency, redundancy and blindness. In this paper, we address the question of re-injecting continuity within large corpora of web archives. We thus introduce the notions of persistences (series of time-stable snapshots of archived web pages) and continuity spaces (networks of time-consistent persistences). We demonstrate how, on the basis of a quality score, persistences can be used to select subsets of web archives within which in-depth historical analysis can be conducted at scale. We next propose to make use of a new visualization approach called the web cernes to graphically reconstruct the multi-level evolution of an archived web site. We finally apply our framework to study the archives of the firsttuesday movement: a constellation of networking web sites that acted in the interest of the economical growth of the web in the early 2000’s.

https://hal.science/hal-04057507

"Science Journals Integrate Dryad to Simplify Data Deposition and Strengthen Scientific Reproducibility"


The Science family journals have announced a partnership with the nonprofit data repository Dryad that simplifies the process by which authors deposit data underlying new work — a critical step to facilitating data’s routine reuse. The partnership is yet another step taken by the Science journals to ensure data the scientific community requires to verify, replicate and reanalyze new research is openly available. . . .

Because the partnership with Dryad integrates Dryad’s platform with the Science family journals’ submission process, authors will have the option to deposit data at Dryad directly from the submission site of any Science family journal. As authors submit research to the journals, they will be prompted about data availability and welcome to deposit their data to any suitable disciplinary repository. But, if data do not yet have a home, authors will have the opportunity to upload their data to Dryad. . . .

To ensure that this service is widely available, the Science journals will cover costs of Dryad data publication for accepted papers.

http://bit.ly/43wtVoD

"Guest Post — Why Interoperability Matters for Open Research — And More Than Ever"


The question remains, why have we not achieved more in delivering connectivity across the research system? While funding for this kind of underpinning infrastructure is notable in its absence (or where it is available it is often too temporary in nature), the other major challenge is in securing adoption among the service providers (funders, publishers, and institutions among the key players) that would maximize the use and potential of building those connections. It is notoriously hard for organisations to tweak or adapt existing workflows and legacy systems and to demonstrate the benefits (and hence prioritise the work) at an individual organisation level that may seem obvious at a system level.

https://cutt.ly/K7hxFQz

"Know(ing) Infrastructure: The Wayback Machine as Object and Instrument of Digital Research"


From documenting human rights abuses to studying online advertising, web archives are increasingly positioned as critical resources for a broad range of scholarly Internet research agendas. In this article, we reflect on the motivations and methodological challenges of investigating the world’s largest web archive, the Internet Archive’s Wayback Machine (IAWM). Using a mixed methods approach, we report on a pilot project centred around documenting the inner workings of ‘Save Page Now’ (SPN) — an Internet Archive tool that allows users to initiate the creation and storage of ‘snapshots’ of web resources. By improving our understanding of SPN and its role in shaping the IAWM, this work examines how the public tool is being used to ‘save the Web’ and highlights the challenges of operationalising a study of the dynamic sociotechnical processes supporting this knowledge infrastructure. Inspired by existing Science and Technology Studies (STS) approaches, the paper charts our development of methodological interventions to support an interdisciplinary investigation of SPN, including: ethnographic methods, ‘experimental blackbox tactics’, data tracing, modelling and documentary research. We discuss the opportunities and limitations of our methodology when interfacing with issues associated with temporality, scale and visibility, as well as critically engage with our own positionality in the research process (in terms of expertise and access). We conclude with reflections on the implications of digital STS approaches for ‘knowing infrastructure’, where the use of these infrastructures is unavoidably intertwined with our ability to study the situated and material arrangements of their creation.

https://doi.org/10.1177/13548565231164759

"Interoperable Infrastructure for Software and Data Publishing"


Achieving scalable, high-quality, interoperable data and software publishing is possible. There are already builders, some represented by the authorship of this article, that are on the right path, building tools that effectively meet the needs of researchers in an open and pluggable way. One example is InvenioRDM, a flexible and turn-key next-generation research data management repository built by CERN and more than 25 multi-disciplinary partners world-wide; InvenioRDM leverages community standards and supports FAIR practices out of the box. Another example of agnostic, pluggable tooling, in this case for software submission, is the set of submission workflow tools currently developed in the HERMES project. These allow researchers to automate the publication of software artifacts together with rich metadata, to create software publications following the FAIR Principles for Research Software.

http://bit.ly/42Lc5Oe

"Springer Nature Makes Data Sharing Easier with Single Data Policy across All Journals and Books"


Springer Nature has taken a further step forwards in its commitment to open science by requiring mandatory data availability statements (DAS) across its journals portfolio, and introducing its first unified data policy across the books portfolio.

Despite researchers’ support for open data sharing, less than 40% of authors actively make their data available. Researchers tell us this can be down to practical challenges, including a lack of clarity about what is required. Increasingly, governments, funders and research institutes are adopting data sharing requirements in their policies. Encouraging data sharing across all publishing formats recognises this growing need for clearer, more accessible, actionable and measurable data policies. As a longstanding supporter of Open Research, Springer Nature is introducing DAS as standard for its journal portfolio to promote greater transparency and reproducibility. Adopting a unified policy for books for the first time is a further exciting step towards encouraging open research practices across all publications and driving forward open science for all.

http://bit.ly/3FNihv9

Study on the Readiness of Research Data and Literature Repositories to Facilitate Compliance With the Open Science Horizon Europe MGA Requirement

In this study we analysed 220 repositories and, via a structured methodology, we identified 165 trusted repositories and tested their readiness to facilitate compliance with the Horizon Europe (HE) Model Grant Agreement (MGA) Open Science requirements.

We show that it is not straightforward to assess whether a given repository is suitable to facilitate compliance with the HE MGA requirements. This is mainly due to varying interpretations of definitions and requirements, whether information on repository specifications is publicly available, and the high level of technical expertise needed to assess all requirements.

We highlight that repository registries, such as FAIRsharing, re3data or the CoreTrustSeal (CTS) website, are not sufficient on their own to assess the readiness of repositories to facilitate compliance with the HE MGA requirements, as the definition of what constitutes a trusted repository is subtle and varied and needs to be carefully interpreted and applied to repositories. This is also the case for related concepts such as community endorsement or for policy requirements in terms of preservation, curation and security of the repository contents.

https://doi.org/10.5281/zenodo.7728016

Paywall: "Characterizing Data Practices in Research Papers Across Four Disciplines"


In this paper, we focus on the five most common types of research data practices (RDP): collecting data, processing data, analyzing data, representing data, and publishing or citing data. First, we compared the distributions of the five types of RDP across disciplines and observed noticeable differences between disciplines. In addition, we examined the characteristics of each type of RDP under different disciplinary contexts, by developing discipline-specific RDP vocabulary employing the tf-idf approach. Based on the common terms as well as the discipline-specific ones, we found that the five types of RDP can be distinctly conceptualized, while each type of RDP varies by disciplines in terms of their action, object, and instrument.

https://doi.org/10.1007/978-3-031-28035-1_26
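
The tf-idf step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy corpus, the function names `tfidf_by_discipline` and `top_terms`, and the choice to treat each discipline as a single document are all assumptions made for illustration. The idea is that terms shared by every discipline score zero, while terms concentrated in one discipline rise to the top of that discipline's vocabulary.

```python
import math
from collections import Counter


def tfidf_by_discipline(corpus):
    """Score terms per discipline with plain tf-idf.

    corpus: dict mapping a discipline name to a list of tokens.
    Each discipline is treated as one document (an assumption of this sketch).
    """
    n_docs = len(corpus)
    # Document frequency: in how many disciplines does each term occur?
    df = Counter()
    for tokens in corpus.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in corpus.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            # tf = relative frequency; idf = log(N / df), no smoothing
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, k=2):
    """Return the k highest-scoring terms per discipline (ties broken alphabetically)."""
    return {
        name: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for name, s in scores.items()
    }


# Hypothetical toy corpus: invented sentences, not data from the paper.
toy_corpus = {
    "biology": "collect samples sequence genome analyze data".split(),
    "sociology": "collect survey responses interview analyze data".split(),
    "physics": "collect detector signal simulate analyze data".split(),
}

scores = tfidf_by_discipline(toy_corpus)
vocab = top_terms(scores, k=2)
```

In this toy setup, words such as "collect" and "data" appear in every discipline and therefore receive a score of zero, whereas words such as "genome" or "survey" surface as discipline-specific vocabulary. A real study would more likely use a library implementation with smoothing and normalization.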

Paywall: "Trustworthy Digital Repository Certification: A Longitudinal Study"


To understand the impact of certification on repositories’ infrastructure, processes, and services, we analyzed a sample of publicly available trustworthy digital repository (TDR) audit reports (n = 175) from the Data Seal of Approval (DSA) and CoreTrustSeal (CTS) certification programs. This first longitudinal study of TDR certification over a ten-year period (from 2010 to 2020) found that many repositories either maintain a relatively high standard of trustworthiness in terms of their compliance with guidelines in DSA or CTS standards or improve their trustworthiness by raising their compliance levels with these guidelines each time they get recertified.

https://doi.org/10.1007/978-3-031-28032-0_42

Paywall: "Participatory Web Archiving: Multifaceted Challenges"


There has been increasing interest in participatory web archiving in recent years. Indeed, it is widely regarded as a necessary step in the development of web archives. . . . Through a critical literature review, this paper addresses the need to analyse participatory web archiving practices, the mechanisms and power relations within them through political theories of power and participation.

https://doi.org/10.1007/978-3-031-28035-1_7

"Advancing Software Citation Implementation (Software Citation Workshop 2022)"


Software is foundationally important to scientific and social progress; however, traditional acknowledgment of the use of others’ work has not adapted in step with the rapid development and use of software in research. This report outlines a series of collaborative discussions that brought together an international group of stakeholders and experts representing many communities, forms of labor, and expertise. Participants addressed specific challenges about software citation that have so far gone unresolved. The discussions took place in summer 2022 both online and in-person and involved a total of 51 participants. The activities described in this paper were intended to identify and prioritize specific software citation problems, develop (potential) interventions, and lay out a series of mutually supporting approaches to address them. The outcomes of this report will be useful for the GLAM (Galleries, Libraries, Archives, Museums) community, repository managers and curators, research software developers, and publishers.

https://arxiv.org/abs/2302.07500v1

"Evolution of Research Data Management in Academic Libraries: A Review of the Literature"


The study is qualitative in nature and based on an extensive literature review. The analysis of the reviewed literature reveals that the idea of RDM has emerged as a new addition to library research support services. The more recent literature clearly established the pivotal role of libraries and librarians in developing and managing RDM services. However, data sharing practices and the development of RDM services in libraries are more prevalent in developed countries, while these trends are still lacking among researchers and libraries in developing countries.

https://doi.org/10.1177/02666669231157405

Are the Humanities Ready for Data Sharing?


To get a sense of trends in data sharing within the humanities, we conducted semi-structured interviews with key personnel at several humanities projects with strong data components. The interviews focused on identifying where and how they planned to share their research data, how they imagined it might be used by others, and their perspective on barriers and opportunities to data sharing in the humanities. The research agendas, skills, and perspectives of the people we spoke with are not representative of most humanities-oriented research. However, the interviews provide important insight into the thinking of humanists who are already working across the cultural divide around data that separate the humanities from most other academic disciplines. We use them here as a springboard for consideration of what humanities data is, how to access and preserve it, and how it fits into the larger goals of creating an open research culture.

https://doi.org/10.18665/sr.318526

"Ten Lessons for Data Sharing with a Data Commons"


A data commons is a cloud-based data platform with a governance structure that allows a community to manage, analyze and share its data. Data commons provide a research community with the ability to manage and analyze large datasets using the elastic scalability provided by cloud computing and to share data securely and compliantly, and, in this way, accelerate the pace of research. Over the past decade, a number of data commons have been developed and we discuss some of the lessons learned from this effort.

https://doi.org/10.1038/s41597-023-02029-x

Paywall: "Open Data and the 2023 NIH Data Management and Sharing Policy"


As the largest public funder of biomedical research in the world, the National Institutes of Health’s (NIH) new Data Management and Sharing (DMS) Policy is a large step toward shifting the culture of medical research toward a broader sharing of scientific data. . . . This article will serve as a primer on open data, data sharing, the NIH’s DMS Policy and its implications, and how librarians can support researchers in this landscape.

https://doi.org/10.1080/02763869.2023.2168103

79.3 Exabytes Capacity Sold in 2022: "Magnetic Tape Storage Is Seeing Cloud Go Back to the Future for Its Archival Data Needs"


Even then [in 1981], says Goodwin, people were saying tape was not long for this world. Those critics appear to have been silenced by recent sales figures, which show year-on-year shipments of hard disk drives (HDDs) sink by 34% in 2022, while consignments of magnetic tape drives rose by 14% — a total of 79.3 exabytes, or roughly equivalent to the entirety of data created on the internet every 32 days.

bit.ly/3ky5Trv

"Research Data Management Needs Assessment for Social Sciences Graduate Students: A Mixed Methods Study"


The complexity and privacy issues inherent in social science research data make research data management (RDM) an essential skill for future researchers. Data management training has not fully addressed the needs of graduate students in the social sciences. To address this gap, this study used a mixed methods design to investigate the RDM awareness, preparation, confidence, and challenges of social science graduate students. A survey measuring RDM preparedness and training needs was completed by 98 graduate students in a school of education at a research university in the southern United States. Then, interviews exploring data awareness, knowledge of RDM, and challenges related to RDM were conducted with 10 randomly selected graduate students. All participants had low confidence in using RDM, but United States citizens had higher confidence than international graduate students. Most participants were not aware of on-campus RDM services, and were not familiar with data repositories or data sharing. Training needs identified for social science graduate students included support with data documentation and organization when collaborating, using naming procedures to track versions, data analysis using open access software, and data preservation and security. These findings are significant in highlighting the topics to cover in RDM training for social science graduate students. Additionally, RDM confidence and preparation differ between populations, so being aware of the backgrounds of students taking the training will be essential for designing student-centered instruction.

https://doi.org/10.1371/journal.pone.0282152

"How and Why Do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles"


Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers’ motivations for referencing data. Like references to academic literature, data references perform various rhetorical functions, such as paying homage, signaling disagreement, or drawing comparisons. This paper studies how and why researchers reference social science data in their academic writing. We develop a typology to model relationships between the entities that anchor data references, along with their features (access, actions, locations, styles, types) and functions (critique, describe, illustrate, interact, legitimize). We illustrate the use of the typology by coding multidisciplinary research articles (n=30) referencing social science data archived at the Inter-university Consortium for Political and Social Research (ICPSR). We show how our typology captures researchers’ interactions with data and purposes for referencing data. Our typology provides a systematic way to document and analyze researchers’ narratives about data use, extending our ability to give credit to data that support research.

https://arxiv.org/abs/2302.08477

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |