“Research Data Lifecycle (RDLC): An Investigation into the Disciplinary Focus, Use Cases, Creator Backgrounds, Stages and Shapes of RDLC Models”


In this paper, we report the results of a study examining 78 Research and Data Lifecycle (RDLC) models located in a review of the literature. Through synthesis-analysis and the nominal group technique, we investigated the RDLC models from the point of view of their disciplinary focus, use cases, model creators, as well as the specific stages and shapes. Our study revealed that the majority of the disciplinary focus for the models was generic, science, or multi-disciplinary. Models originating in the social sciences and humanities are less common. The use cases varied in a wide spectrum, with a total of 34 different scenarios. The creators and authors of the RDLC models came from more than 20 countries with the majority of the models created as a result of collaboration within or across different organizations. Our stage and shape analysis also outlined key characteristics of the RDLC models by showing the commonalities and variations of named stages and varying structures of the models. As one of the first empirical investigations examining the deep substance of the RDLC models, our study provides significant insights into the context and setting where the models were developed, as well as the details with regard to the stages and shapes, and thereby identified gaps that may impact the use and value of the models. As such, our study establishes a foundation for further studies on the practical utilization of the RDLC models in research data management practice and education.

https://doi.org/10.2218/ijdc.v19i1.860

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Research Data Management and Crowdsourcing Personal Histories”


Drawing on experiences of the University of Oxford’s Sustainable Digital Scholarship (SDS) service and the World War Two crowdsourcing project ‘Their Finest Hour’, this paper explores how institutional digital repositories (such as the SDS platform) can be successfully leveraged to publish and sustainably host crowdsourced (‘warm-data’) collections beyond their funding period.

The paper examines the challenges in applying FAIR (Findable, Accessible, Interoperable, Reusable) principles to a collection containing first-hand testimonies and digitised objects of significant sentimental value, addressing both practical and ethical considerations, including the management of copyright, handling of sensitive material, use of AI tools and adherence to good research data management practices, with limited resources.

Reflecting on the importance of a caring approach to data stewardship, the paper examines how the ethos of the Their Finest Hour project, and its commitment to honouring contributors and their families, led organically to an alignment with CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles, originally developed for Indigenous data governance. It also explores the potential for the wider application of CARE principles for crowdsourced collections such as the Their Finest Hour Online Archive, while acknowledging and respecting the origins of this framework.

Lastly, it offers some practical ‘lessons learned’ to help GLAM and Higher Education professionals working with crowdsourced collections and personal histories to navigate some of the research data management challenges that they may encounter, while also highlighting the importance of understanding FAIR and CARE principles and how they can be applied to these types of data collections.

https://doi.org/10.5334/johd.265

“Copyright and Licencing for Cultural Heritage Collections as Data”


Cultural Heritage (CH) institutions have been exploring innovative ways to publish digital collections to facilitate reuse, through initiatives like Collections as data and the International GLAM Labs Community. When making a digital collection available for computational use, it is crucial to have reusable and machine-readable open licences and copyright terms. While existing studies address copyright for digital collections, this study focuses specifically on the unique requirements of collections as data. This research highlights both the legal and technical aspects of copyright concerning collections as data. It discusses permissible uses of copyrighted collections, emphasising the need for interoperable, machine-readable licences and open licences. By reviewing current literature and examples, this study presents best practices and examples to help CH institutions better navigate copyright and licencing issues, ultimately enhancing their ability to convert their content into collections as data for computational research.

https://doi.org/10.5334/johd.263

"Understanding How to Identify and Manage Personal Identifying Information (PII) to Further Data Interoperability"


Respect for research participant rights is a key aspect for consideration when creating and utilizing interoperable data. From that perspective, requirements for sharing research data often call for the data to be de-identified, i.e., the removal of all personal identifying information (PII) prior to data sharing, to ensure that the participant’s data privacy rights are not infringed upon. However, what constitutes PII is often a point of confusion amongst researchers who are not familiar with privacy laws and regulations. This paper hopes to provide some clarity around what makes research data identifiable by presenting it under a different perspective from what most researchers are familiar with. It also provides a framework to help researchers determine where PII could exist within their data that they can use to help with privacy impact evaluations. The goal is to empower researchers to share their data with greater confidence that the privacy rights of their research subjects have been sufficiently protected, enabling access to greater amounts of data for research use.

https://tinyurl.com/2p95xtd2

"Staking Out the Stakeholders: Using NIST’s Research Data Framework within a Public University System"


Purpose: This article first introduces and contextualizes the National Institute of Standards and Technology (NIST) Research Data Framework (RDaF) and then explores its application in a local context.

Setting/Participants: The State University of New York (SUNY) System, both at a system-wide level and at two individual SUNY campuses, developed an approach to applying RDaF to improve research data management (RDM) practices.

Brief Description: As institutions work to establish sound, coordinated services and infrastructure that meet local needs, they look to strategic guidance and established best practices for doing so responsibly and successfully. Modeled after their Cybersecurity and Privacy Frameworks, NIST began developing RDaF in 2019 to address pressing research data community needs. The RDaF provides a comprehensive, structured approach to be used by diverse stakeholders to better understand the benefits, risks, and costs of research data management (RDM).

Results/Outcome: NIST continues to work with other organizations on RDaF’s utility in different contexts, and SUNY’s application offers both a use case and lessons learned that may offer other institutions a practical, grounded approach for leveraging the power of RDaF to improve their RDM strategy.

Conclusions: RDaF’s comprehensive guidance offers a robust, flexible framework for building thorough RDM strategy, whatever an organization’s institutional readiness.

https://tinyurl.com/55v3k7ux

"Persistent Identifiers for Instruments and Facilities: Current State, Challenges, and Opportunities"


Objective: Persistent Identifiers (PIDs) are central to the vision of open science described in the FAIR Principles. However, the use of PIDs for scientific instruments and facilities is decentralized and fragmented. This project aims to develop community-based standards, guidelines, and best practices for how and why PIDs can be assigned to facilities and instruments.

Methods: We hosted several online and in-person focus groups and discussions, culminating in a two-day in-person workshop featuring stakeholders from a variety of organizations and disciplines, such as instrument and facilities operators, PID infrastructure providers, researchers who use instruments and facilities, journal publishers, university administrators, federal funding agencies, and information and data professionals.

Results: Our first-year efforts resulted in four main areas of interest: developing a better understanding of the current PID ecosystem; clarifying how and when PIDs could be assigned to scientific instruments and facilities; challenges and barriers involved with assigning PIDs; incentives for researchers, facility managers, and other stakeholders to encourage the use of PIDs.

Conclusions: The potential for PIDs to facilitate the discovery, connection, and attribution of research instruments and facilities indicates an obvious value in their use. The lack of standards of how and when they are created, assigned, updated, and used is a major barrier to their widespread use. Data and information professionals can work to create relationships with stakeholders, provide relevant education and outreach activities, and integrate PIDs for instruments and facilities into their data curation and publication workflows.

https://tinyurl.com/3b8r6xrx

"In Sharing We Trust. Taking Advantage of a Diverse Consortium to Build a Transparent Data Service in Catalonia"


The Consorci de Serveis Universitaris de Catalunya (CSUC) is a consortium that serves 13 universities and 33 research centers in Catalonia and neighboring communities. In 2017 the Consortium created an Open Science department to collaborate with universities and research centers on facilitating the adoption of Open Science requirements. Even though CSUC also offers services to researchers directly (for example, its supercomputing resources), this report will focus on CSUC’s work with its member institutions to create and offer data management services. We will explain how CSUC has led the creation of a robust shared governance system, and how it takes advantage of the diversity of its members to create useful, high quality, and transparent services for all researchers in the Catalan research system. Through sharing each other’s experiences, values and priorities, the result is better than separate ad-hoc solutions. The process also creates a community of practitioners that develop expertise together with the help of professional development opportunities organized by CSUC, like recurrent self-learning labs focused on data curation tools, techniques and processes.

https://tinyurl.com/r2msbnsv

"Researchers and Research Data: Improving and Incentivising Sharing and Archiving"


There has been a lot of discussion within the scientific community around the issues of reproducibility in research, with questions being raised about the integrity of research due to failure to reproduce or confirm the findings of some of the studies. Researchers need to adhere to the FAIR (findable, accessible, interoperable, and reusable) principles to contribute to collaborative and open science, but these open data principles can also support reproducibility and issues around ensuring data integrity. This article uses observations and metrics from data sharing and research integrity related activities, undertaken by a Research Integrity and Data Specialist at the Francis Crick Institute, to discuss potential reasons behind a slow uptake of FAIR data practices. We then suggest solutions undertaken at the Francis Crick Institute which can be followed by institutes and universities to improve the integrity of research from a data perspective. One major solution discussed is the implementation of a data archive system at the Francis Crick Institute to ensure the integrity of data long term, comply with our funders’ data management requirements, and to safeguard our researchers against any potential research integrity allegations in the future.

https://tinyurl.com/wkhw548z

"Trends and Changes in Academic Libraries’ Data Management Functions: A Topic Modeling Analysis of Job Advertisements"


This study aims to (i) track trends in academic library data management positions, (ii) identify key themes in job advertisements related to data management, and (iii) examine how these themes have evolved. Using text mining techniques, this study applied Latent Dirichlet Allocation (LDA) and TF-IDF vectorization to systematically analyze 803 job advertisements related to data management posted on the IFLA LIBJOBS platform from 1996 to 2023. The findings reveal that the development of these positions has undergone three phases: exploration, growth, and adjustment. Four core themes in data management functions emerged: “Cataloging and Metadata Management,” “Data Services and Support,” “Research Data Management,” and “Systems Management and Maintenance.” Over time, these themes have evolved from distinct roles to a more balanced distribution.

https://doi.org/10.1016/j.acalib.2025.103017

"Paris Declaration Calls for Data-Driven Forensics to Spearhead the Fight Against Fake Science"


Supporters of research integrity have signed a new declaration calling for data-driven forensics – known as Forensic Scientometrics (FoSci) – to lead the charge in detecting, exposing and even preventing fake science. . . .

The event involved researchers, experts, and professionals from around the world who are committed to upholding research integrity, many well-known sleuths among them. Attendees signed the declaration over the following weekend. . . .

The FoSci Paris Declaration has made the following key commitments:

  • Advocate for transformation
  • Open a dialogue with policymakers to design de-incentivizing strategies to tackle the mass production of problematic papers
  • Advocate for reform of institutions involved in scientific research based on the group’s findings
  • Develop expertise and share knowledge
  • Facilitate training for researchers and professionals exploring these questions
  • Share and provide research and data in the FoSci community
  • Establish a regular cycle of professional meetings
  • Improve the tools and methods of forensic scientometrics
  • Improve the group’s ability to communicate its findings
  • Inform editorial boards, publishers, research institutions, governments and all relevant involved parties about the group’s work
  • Participate in building software and tools to enable the reproducibility of their forensics findings
  • Establish points of contact between FoSci members and concerned organizations

https://tinyurl.com/mrywc3ch

"A Primer for Applying and Interpreting Licenses for Research Data and Code"


This primer gives data curators an overview of the licenses that are commonly applied to datasets and code, familiarizes them with common requirements in institutional data policies, and makes recommendations for working with researchers who need to apply a license to their research outputs or understand a license applied to data or code they would like to reuse. While copyright issues are highly case-dependent, the introduction to the data copyright landscape and the general principles provided here can help data curators empower researchers to understand the copyright context of their own data.

https://tinyurl.com/34738m4s

Paywall: "Evolution of the “Long-Tail” Concept for Scientific Data"


This paper examines the changing landscape of discussions about long-tail data over time. . . . The review also bridges discussions on data curation in Library & Information Science (LIS) and domain-specific contexts, contributing to a more comprehensive understanding of the long-tail concept’s utility for effective data management outcomes. The review aims to provide a more comprehensive understanding of this concept, its terminological diversity in the literature, and its utility for guiding data management, overall informing current and future information science research and practice.

https://doi.org/10.1002/asi.24967

"Daily Life in the Open Biologist’s Second Job, as a Data Curator"


Background

Data reusability is the driving force of the research data life cycle. However, implementing strategies to generate reusable data from the data creation to the sharing stages is still a significant challenge. Even when datasets supporting a study are publicly shared, the outputs are often incomplete and/or not reusable. The FAIR (Findable, Accessible, Interoperable, Reusable) principles were published as a general guidance to promote data reusability in research, but the practical implementation of FAIR principles in research groups is still falling behind. In biology, the lack of standard practices for a large diversity of data types, data storage and preservation issues, and the lack of familiarity among researchers are some of the main impeding factors to achieve FAIR data. Past literature describes biological curation from the perspective of data resources that aggregate data, often from publications.

Methods

Our team works alongside data-generating, experimental researchers so our perspective aligns with publication authors rather than aggregators. We detail the processes for organizing datasets for publication, showcasing practical examples from data curation to data sharing. We also recommend strategies, tools and web resources to maximize data reusability, while maintaining research productivity.

Conclusion

We propose a simple approach to address research data management challenges for experimentalists, designed to promote FAIR data sharing. This strategy not only simplifies data management, but also enhances data visibility, recognition and impact, ultimately benefiting the entire scientific community.

https://doi.org/10.12688/wellcomeopenres.22899.1

"Scaling Up Digital Preservation Workflows With Homegrown Tools and Automation"


At NC State University Libraries, the Special Collections Research Center leverages an integrated system of locally developed applications and open-source technologies to facilitate the long-term preservation of digitized and born-digital archival assets. These applications automate many previously manual tasks, such as creating access derivatives from preservation scans and ingest into preservation storage. They have allowed us to scale up the number of digitized assets we create and publish online; born-digital assets we acquire from storage media, appraise, and package; and total assets in local and distributed preservation storage. The origin of these applications lies in scripted workflows put into use more than a decade ago, and the applications were built in close collaboration with developers in the Digital Library Initiatives department between 2011 and 2023. This paper presents a strategy for managing digital curation and preservation workflows that does not solely depend on standalone and third-party applications. It describes our iterative approach to deploying these tools, the functionalities of each application, and sustainability considerations of managing in-house applications and using Academic Preservation Trust for offsite preservation.

https://tinyurl.com/4mjpzth2

"An Analysis of the Effects of Sharing Research Data, Code, and Preprints on Citations"


In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze circa 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations. We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% (±.7) on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% (±.8) on average. However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

https://doi.org/10.1371/journal.pone.0311493

University of Illinois Urbana-Champaign: "Research Data Service Extended Review"


Building upon the first 5-year review in 2019, this report presents an extended review of the Research Data Service at the University of Illinois Urbana-Champaign from 2018 to 2023, assessing its current efforts and future directions.

https://hdl.handle.net/2142/124781

"Metrics to Increase Data Usage Understanding and Transparency"


Data metrics are essential to assess the impact of data repositories’ holdings and to understand the research practices of the community that they serve. These metrics are useful for reporting to funders, to inform community engagement strategies, and to direct and sustain repository services. In turn, communicating these metrics to the user community conveys transparency and elicits their trust in data sharing. However, because data metrics are time-sensitive and context-dependent, tracking, interpreting, and communicating them is challenging. In this work we introduce data usage analyses including benchmarking and grouping, developed to better assess the impact of the DesignSafe Data Depot, a natural hazards data repository. Make Data Count compliant metrics are analysed in relation to research methods, sub-disciplines, natural hazard types, and time, to learn what data is being used, what influences data usage, and to establish realistic usage expectations. Results are interpreted in relation to the research and publication practices of the community and to natural hazard events. In addition, we introduce strategies to clearly communicate dataset metrics to users.

https://doi.org/10.2218/ijdc.v18i1.929

"Institutionally Based Research Data Services: Current Developments and Future Direction"


The Summit for Academic Institutional Readiness in Data Sharing (STAIRS) was a multi-phased project that brought together a diverse group of representatives from academic institutions across the United States who support research data sharing efforts. Building off preliminary assessment work and a virtual learning series, this was a unique chance to discuss the opportunities and challenges in supporting researchers’ data sharing needs within and across institutions. This report captures the details of the project, including the preliminary assessment work as well as the summit. Following a description of the broad themes and overarching takeaways from this multi-phased effort, we conclude with next steps and future directions for the academic data services community.

https://tinyurl.com/3v8b5xc3

"Research Data Policy: A Library and Information Science Publishers’ Perspective"


Apart from the common features identified in the literature, the authors found numerous distinct research data policy features of publishers, such as deposition of data sets, division of research data policy types, and sharing of research code. Furthermore, institutional publishers with research data policies have more rigid features for the execution of research data policy features since their beneficiaries are uniform, in contrast to the varied nature of journals’ and publishers’ authors.

https://doi.org/10.1007/s11135-024-01994-8

"Data Curation Maturity Model"


The Data Curation Maturity Model (DCMM) enhances the effectiveness of data curation activities with its innovative, mathematically quantifiable matrix approach, specifically tailored to meet the dynamic needs of data curation. By integrating focus area maturity models with specific curation requirements, the DCMM provides a structured progression framework that allows organizations to visualize and measure their advancements in data management. This model not only elevates data quality and reproducibility across various research domains but also establishes a new standard for strategic, targeted improvement in data curation practices.

https://tinyurl.com/yb8am8ba

"Supporting Data Discovery: Comparing Perspectives of Support Specialists and Researchers"


Purpose: Much of the research in data discovery is centered on the users’ viewpoint, frequently overlooking the perspective of those who develop and maintain the discovery infrastructure. Our goal is to conduct a comparative study on research data discovery, examining both support specialists’ and researchers’ views by merging new analysis with prior research insights.

Methods: This work summarizes the studies the authors have conducted over the last seven years investigating the data discovery practices of support specialists from different disciplines. Although support specialists were not the main target of some of these studies, data about their perspectives was collected. Our corpus comprises in-depth interviews with 6 social science support specialists, interviews with 19 researchers and 3 support specialists from multiple disciplines, a global survey with 1630 researchers and 47 support specialists, and a use case analysis of 25 support specialists. In the analysis section, we juxtapose the fresh insights on support specialists’ views with the already documented perspectives of researchers for a holistic understanding. The latter is primarily discussed in the literature review, with references made in the analysis section to draw comparisons.

Results: We found that support specialists’ views on data discovery are not entirely different from those of the researchers. There are, however, some differences that we have identified, most notably the interconnection of data discovery with general web search, literature search, and social networks. . . .

We conclude by proposing recommendations for different types of support work to better support researchers’ data discovery practices.

https://doi.org/10.5334/dsj-2024-048

"IOP Publishing Study Reveals Varied Adoption and Barriers in Open Data Sharing Among Physical Research Communities"


Environmental scientists are the most open with their research data, yet legal constraints related to third-party ownership often limit their ability to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles. Physicists are also willing to share data but have concerns about the accessibility and understanding of the formats used. Engineering and materials scientists face the most significant barriers to sharing FAIR data due to concerns over confidentiality and sensitivity.

https://tinyurl.com/2s3jjzft

"The Paradox of Competition: How Funding Models Could Undermine the Uptake of Data Sharing Practices"


Although beneficial to scientific development, data sharing is still uncommon in many research areas. Various organisations, including funding agencies that endorse open science, aim to increase its uptake. However, estimating the large-scale implications of different policy interventions on data sharing by funding agencies, especially in the context of intense competition among academics, is difficult empirically. Here, we built an agent-based model to simulate the effect of different funding schemes (i.e., highly competitive large grants vs. distributive small grants), and varying intensity of incentives for data sharing on the uptake of data sharing by academic teams strategically adapting to the context. Our results show that more competitive funding schemes may lead to higher rates of data sharing in the short term, but lower rates in the long term, because the uncertainty associated with competitive funding negatively affects the cost/benefit ratio of data sharing. At the same time, more distributive grants do not allow academic teams to cover the costs and time required for data sharing, limiting uptake. Our findings suggest that without support services and infrastructure to minimise the costs of data sharing and other ancillary conditions (e.g., university policy support, reputational rewards and benefits of data sharing for academic teams), it is unlikely that funding agencies alone can play a leading role for the uptake of data sharing. Therefore, any attempt to reform reward and recognition systems towards open science principles should carefully consider the potential impact of their proposed policies and their long-term side effects.

https://doi.org/10.31222/osf.io/gb4v2
