“Software Reuse in the Generative AI Era: From Cargo Cult Towards Systematic Practices”


In many ways, generative reuse can be viewed as a new form of cargo cult programming: the inclusion of code or program structures that originate from external sources without adequate understanding of their relevance or side effects [8, 19, 20, 27]. Just as in cargo cult programming, developers are (re)using code that they do not necessarily understand. In the classic cargo cult scheme, developers blindly adopt a piece of code or a development approach simply because others have used it before, essentially placing their trust in artifacts already known to work in other contexts. In contrast, in AI-assisted generative reuse, developers place their trust in code generated by an external “oracle” whose inner workings are usually completely unknown to the user.

https://doi.org/10.1145/3755881.3755981

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“From Policy to Practice: Progress Towards Data- And Code-Sharing in Ecology and Evolution”


Data and code are essential for ensuring the credibility of scientific results and facilitating reproducibility, areas in which journal sharing policies play a crucial role. However, in ecology and evolution, we still do not know how widespread data- and code-sharing policies are, how accessible they are, and whether journals support data and code peer review. Here, we first assessed the clarity, strictness and timing of data- and code-sharing policies across 275 journals in ecology and evolution. Second, we assessed initial compliance with journal policies using submissions from two journals: Proceedings of the Royal Society B (Mar 2023–Feb 2024: n = 2340) and Ecology Letters (Jun 2021–Nov 2023: n = 571). Our results indicate the need for improvement: across 275 journals, 22.5% encouraged and 38.2% mandated data-sharing, while 26.6% encouraged and 26.9% mandated code-sharing. Journals that mandated data- or code-sharing typically required it for peer review (59.0% and 77.0%, respectively), a share that decreased when journals only encouraged sharing (40.3% and 24.7%, respectively). Our evaluation of policy compliance confirmed the important role of journals in increasing data- and code-sharing but also indicated the need for meaningful changes to enhance reproducibility. We provide seven recommendations to help improve data- and code-sharing, and policy compliance.

https://doi.org/10.1098/rspb.2025.1394

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Scientific Open-Source Software Is Less Likely to Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software”


Scientific software is essential to scientific innovation, and in many ways it is distinct from other types of software. A perception of scientific software as abandoned (or unmaintained), buggy, and hard to use can hinder scientific progress; yet, in contrast to other types of software, its longevity is poorly understood. Existing data curation efforts are fragmented by science domain and/or are small in scale and lack key attributes. We use large language models to classify public software repositories in World of Code into distinct scientific domains and layers of the software stack, curating a large and diverse collection of over 18,000 scientific software projects. Using this data, we estimate survival models to understand how the domain, infrastructural layer, and other attributes of scientific software affect its longevity. We further obtain a matched sample of non-scientific software repositories and investigate the differences. We find that infrastructural layers, downstream dependencies, mentions of publications, and participants from government are associated with a longer lifespan, while newer projects with participants from academia have a shorter lifespan. Against common expectations, scientific projects have a longer lifetime than matched non-scientific open-source software projects. We expect our curated, attribute-rich collection to support future research on scientific software and provide insights that may help extend the longevity of both scientific and other projects.

https://dl.acm.org/doi/10.1145/3729369

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“FAIR for Research Software (FAIR4RS) – Can Funders Keep Up With Open Science Developments?”


Encouraging signs of momentum abound. International initiatives such as the Research Software Alliance’s Funders Forum and the ADORE.software declaration are helping to rally research funders to endorse FAIR4RS. Even so, challenges persist, including:

  • Competing priorities. Funders juggle open access mandates, data-sharing policies and infrastructure investments. Research software can slip down the priority list.
  • Capacity gaps. Research funders, and to some extent the researchers they fund, often lack in-house expertise on implementing FAIR4RS practices.

https://tinyurl.com/26sk7zwe

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Understanding and Advancing Research Software Grant Funding Models”


Research software funding currently operates across a disconnected landscape of public and private grant-making organizations, leading to inefficiencies for software projects and the broader research community. The lack of coordination forces projects to pursue multiple, often overlapping opportunities, and forces funders to independently evaluate projects and proposals, resulting in duplicated effort and suboptimal resource distribution. By examining existing collaboration models, including centralized and distributed approaches, we highlight how joint decision-making mechanisms could improve sustainability for reusable software resources. An international set of examples illustrates how cross-organization cooperation for research software funding can be structured. Such collaborations can optimize grant disbursement and align priorities. Increased collaboration could allow funders to better address the ongoing maintenance and evolution of research software, lowering barriers that hamper discovery across multiple research domains. Encouraging both bottom-up user-driven and top-down coordination mechanisms ultimately supports more robust, widely accessible research software, improving global research outcomes.

https://doi.org/10.12688/openreseurope.20210.1

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Automatic Classification of Software Repositories: a Systematic Mapping Study”


The rapid growth of software repositories on development platforms such as GitHub, as well as archives like Software Heritage, prompts the need for better repository classification. Machine learning is increasingly used to automate this classification, but there are no secondary studies analyzing this research landscape. We present a systematic mapping study of 43 primary sources published between 2002 and 2023, where we examine the goals, inputs, outputs, training, and evaluation processes involved in automatic repository classification. Our findings reveal a growing interest in automatic classification, particularly to enhance the discoverability and recommendation of relevant repositories. Other applications, such as classification for mining studies, were surprisingly underrepresented. We also observe that a lack of standardized datasets, classification tasks, and evaluation metrics makes it difficult to compare the performance of different techniques.

https://hal.science/hal-05049757v1

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“CODE beyond FAIR”


FAIR principles are a set of guidelines that aim to simplify the distribution of scientific data in order to enhance reuse and reproducibility. This article focuses on research software, which differs significantly from data in its living nature and its relationship with free and open-source software. Based on the second French plan for Open Science, we provide a tiered roadmap to improve the state of research software that is inclusive of all stakeholders in the research software ecosystem: scientific staff, but also institutions, funders, libraries and publishers.

https://inria.hal.science/hal-04930405

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“The Economic Impact of Open Science: A Scoping Review”


This paper summarises a comprehensive scoping review of the economic impact of Open Science (OS), examining empirical evidence from 2000 to 2023. It focuses on Open Access (OA), Open/FAIR Data (OFD), Open Source Software (OSS), and Open Methods, assessing their contributions to efficiency gains in research production, innovation enhancement, and economic growth. The evidence, although limited, indicates that OS accelerates research processes, reduces related costs, and fosters innovation by improving access to data and resources, ultimately generating economic growth. Some sectors, such as the life sciences, are studied more than others, and the literature there exhibits substantial gains, mainly thanks to OFD and OA. OSS supports productivity, while the very limited studies on Open Methods indicate benefits in terms of productivity gains and innovation enhancement. However, gaps persist in the literature, particularly in fields like Citizen Science and Open Evaluation, for which no empirical findings on economic impact could be detected. Despite these limitations, empirical evidence on specific cases highlights economic benefits. This review underscores the need for further metrics and studies across diverse sectors and regions to fully capture OS’s economic potential.

https://osf.io/preprints/metaarxiv/kqse5_v1

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Unveiling the Report Findings from IOI’s Study on the State of Open Research Software Infrastructure”


Key recommendations

  • Surface hidden information – One of the biggest challenges we discovered during the study is a scarcity of available, standardized, and meaningful data. This information gap limits the visibility of what is happening in research software and in the development of infrastructure to support it. There is a pressing need to give time and attention at the field level to identify and subsequently gather the needed data to fill the information gaps.
  • Strengthen the scaffolding – As the field matures, its actors need stronger scaffolding to support norms and activities. Scaffolding, in this instance, can be defined as elements that, with appropriate instantiation, might become the backbone (social, technical, administrative) infrastructure supporting the field. There is a need to shift the priority from creating to integrating and maintaining, and to encourage and enable consolidation, specialization, mergers, and handoffs.
  • Grow the market – One of the challenges we have noticed is that research software infrastructures are leaning on the same funding sources, and those funding sources may not last. We’ve seen this in other fields as well. There is a need to figure out how to identify the research software users and how those users connect to customers. Understanding how that user connects to the dollars necessary to keep the research infrastructures running is also essential. This is not about profit but keeping things running and having a dependable system.
  • Invest in coordination – Research software is still in its infancy and lacks well-established practices, scaffolding, and market structures. Under these conditions, no single actor can succeed alone in this evolving field, especially amid today’s challenging fiscal and political landscape for open science. Philanthropic funders can step in with targeted investments that build the foundational architecture of research software infrastructure. Such investments would bolster individual projects, programs, and organizations and create the necessary environment (time, space, tools, and structured support across training, packaging, hosting, socialization, and advocacy) for collaboration across disciplines and geographies.

https://tinyurl.com/bddnzd7c

The State of Open Research Software Infrastructure

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“How Will We Prepare for an Uncertain Future? The Value of Open Data and Code for Unborn Generations Facing Climate Change”


What is the unit of knowledge that we would most like to protect for future generations? Is it the scientific publication? Or is it our datasets? Datasets are snapshots in space and time of n-dimensional hypervolumes of information that are resources in and of themselves—each giving numerous insights into the measured world [134,135]. New publishing paradigms, such as Octopus, allow researchers to link multiple ‘Analysis’ and/or ‘Interpretation’ publications to a single ‘Results’ publication as alternative analyses and interpretations of the same data [159]. A more traditional research paper, on the other hand, is one realization of many possible assessments of the data that were originally collected, and a wide diversity of results can be obtained when many individuals analyse one dataset with the same research question in mind [160,161]. That is, publications are one version of an oversimplified projection through n-dimensional space which communicate stories that our human minds can comprehend. Manuscript narratives, by necessity, leave out information to craft such a story.

This is not to say that scientific publications in and of themselves are not useful. On the contrary, they frame our current and historical understanding of the world and put scientific inquiry into the relevant spatial and temporal context. Scientific articles offer analysis and interpretation of data which will allow future generations to understand why certain policies, management actions, or approaches were attempted and/or abandoned. However, if future researchers are not granted access to our (past) data, future humans will have to repeat costly (e.g. time and resources) experiments, laboriously extract information directly from figures, tables and text in the articles themselves (assuming the relevant information is available and detailed enough, although there is evidence that this is not the case in at least some disciplines [55,162]) or will have to trust our analytical procedures and our intuitions and perceptions about the data we collected [160,161].

https://doi.org/10.1098/rspb.2024.1515

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"A Primer for Applying and Interpreting Licenses for Research Data and Code"


This primer gives data curators an overview of the licenses that are commonly applied to datasets and code, familiarizes them with common requirements in institutional data policies, and makes recommendations for working with researchers who need to apply a license to their research outputs or understand a license applied to data or code they would like to reuse. While copyright issues are highly case-dependent, the introduction to the data copyright landscape and the general principles provided here can help data curators empower researchers to understand the copyright context of their own data.

https://tinyurl.com/34738m4s

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Framework for Managing University Open Source Software"


This document serves as a comprehensive guide for universities looking to develop or refine an open source software framework. It provides the foundational knowledge and tools needed to create an environment that supports open source that is aligned with the unique needs and goals of each institution.

https://doi.org/10.5281/zenodo.14392733

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Scaling Up Digital Preservation Workflows With Homegrown Tools and Automation"


At NC State University Libraries, the Special Collections Research Center leverages an integrated system of locally developed applications and open-source technologies to facilitate the long-term preservation of digitized and born-digital archival assets. These applications automate many previously manual tasks, such as creating access derivatives from preservation scans and ingest into preservation storage. They have allowed us to scale up the number of digitized assets we create and publish online; born-digital assets we acquire from storage media, appraise, and package; and total assets in local and distributed preservation storage. The origin of these applications lies in scripted workflows put into use more than a decade ago, and the applications were built in close collaboration with developers in the Digital Library Initiatives department between 2011 and 2023. This paper presents a strategy for managing digital curation and preservation workflows that does not solely depend on standalone and third-party applications. It describes our iterative approach to deploying these tools, the functionalities of each application, and sustainability considerations of managing in-house applications and using Academic Preservation Trust for offsite preservation.

https://tinyurl.com/4mjpzth2

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"An Analysis of the Effects of Sharing Research Data, Code, and Preprints on Citations"


In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze circa 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations. We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% (±0.7) on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% (±0.8) on average. However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

https://doi.org/10.1371/journal.pone.0311493

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Software Management Plans – Current Concepts, Tools, and Application"


The present article reviews the state of the art of software management plans (SMPs). It provides a selection of questionnaires, tools and application cases for SMPs from a European (German) point of view, and discusses possible connections of SMPs to other aspects of software sustainability, such as metadata, the FAIR4RS principles or machine-actionable SMPs. The aim of our publication is to provide basic knowledge for those starting to explore the subject and a handout for infrastructure providers who are about to establish or develop an SMP service at their own institution.

https://doi.org/10.5334/dsj-2024-043

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Creating a Fully Open Environment for Research Code and Data"


Quantitative research in the social and natural sciences is increasingly dependent on new datasets and forms of code. Making these resources open and accessible is a key aspect of open research and underpins efforts to maintain research integrity. Erika Pastrana explains how Springer Nature developed Nature Computational Science to be fully compliant with open research and data principles.

https://tinyurl.com/7uwdxrrz

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Ten Simple Rules for Recognizing Data and Software Contributions in Hiring, Promotion, and Tenure"


The ways in which promotion and tenure committees operate vary significantly across universities and departments. While committees often have the capability to evaluate the rigor and quality of articles and monographs in their scientific field, assessment with respect to practices concerning research data and software is a recent development and one that can be harder to implement, as there are few guidelines to facilitate the process. More specifically, the guidelines given to tenure and promotion committees often reference data and software in general terms, with some notable exceptions, such as the guidelines in [5], and are almost systematically trumped by other factors such as the number and perceived impact of journal publications. The core issue is that many colleges establish a scholarship versus service dichotomy: Peer-reviewed articles or monographs published by university presses are considered scholarship, while community service, teaching, and other categories are given less weight in the evaluation process. This dichotomy unfairly disadvantages digital scholarship and community-based scholarship, including data and software contributions [6]. In addition, there is a lack of resources for faculties to facilitate the inclusion of responsible data and software metrics into evaluation processes or to assess faculty’s expertise and competencies to create, manage, and use data and software as research objects. As a result, the outcome of the assessment by the tenure and promotion committee is as dependent on the guidelines provided as on the committee members’ background and proficiency in the data and software domains.

The presented guidelines aim to help alleviate these issues and align the academic evaluation processes to the principles of open science. We focus here on hiring, tenure, and promotion processes, but the same principles apply to other areas of academic evaluation at institutions. While these guidelines are by no means sufficient for handling the complexity of a multidimensional process that involves balancing a large set of nuanced and diverse information, we hope that they will support an increasing adoption of processes that recognize data and software as key research contributions.

https://doi.org/10.1371/journal.pcbi.1012296

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Sharing Practices of Software Artefacts and Source Code for Reproducible Research"


While the source code of software and algorithms is an essential component of all fields of modern research involving data analysis and processing steps, it is rarely shared upon publication of results across disciplines. Simple guidelines for producing reproducible source code have been published; still, code optimization that supports repurposing to different settings is often neglected, and registering code in catalogues for public reuse is considered even less often. Though all research output should be reasonably curated in terms of reproducibility, it has been shown that researchers are frequently non-compliant with the availability statements in their own publications. These statements often do not even use persistent unique identifiers that would allow referencing archived code artefacts at specific versions and times for long-lasting links to research articles. In this work, we provide an analysis of the current practices of authors in open scientific journals with regard to code availability indications and FAIR principles applied to code and algorithms. We present the repositories most commonly chosen by authors. Results further show disciplinary differences in code availability in scholarly publications over the past years. We advocate proper description, archiving and referencing of source code and methods as part of the scientific knowledge, also appealing to editorial boards and reviewers for supervision.

https://doi.org/10.1007/s41060-024-00617-7

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Software Preservation after the Internet"


Software preservation must consider knowledge management as a key challenge. We suggest a conceptualization of software preservation approaches that are available at different stages of the software lifecycle and can support memory institutions to assess the current state of software items in their collection, the capabilities of their infrastructure, and completeness and applicability of knowledge that is required to successfully steward the collection.

https://tinyurl.com/8y9svs7x

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Where Is All the Research Software? An Analysis of Software in UK Academic Repositories"


This research examines the prevalence of research software as independent records of output within UK academic institutional repositories (IRs). There has been a steep decline in the number of research software submissions to the UK’s Research Excellence Framework from 2008 to 2021, but there has been no investigation into whether and how the official academic IRs have affected the low return rates. In what we believe to be the first such census of its kind, we queried the 182 online repositories of 157 UK universities. Our findings show that the prevalence of software within UK academic IRs is strikingly low: fewer than 28% contain software as recognised academic output. Of greater concern, we found that over 63% of repositories do not currently record software as a type of research output and that several universities appear to have removed software as a defined type from the default settings of their repository. We also explored potential correlates, such as membership of the Russell Group, but found no correlation between these metadata and the prevalence of records of software. Finally, we discuss the implications of these findings with regard to the lack of recognition of software as a discrete research output in institutions, despite the opposite being mandated by funders, and we make recommendations for changes in policies and operating procedures.

https://doi.org/10.7717/peerj-cs.1546

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"How Does Mandated Code-Sharing Change Peer Review?"


At the end of the year-long trial period, code sharing had risen from 53% in 2019 to 87% for 2021 articles submitted after the policy went into effect. Evidence in hand, the journal Editors-in-Chief decided to make code sharing a permanent feature of the journal. Today, the sharing rate is 96%.

https://tinyurl.com/5n9yh9yj

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Towards Research Software-ready Libraries"


Software is increasingly acknowledged as valid research output. Academic libraries adapt to this change to become research software-ready. Software publication and citation are key areas in this endeavor. We present and discuss the current state of the practice of software publication and software citation, and discuss four areas of activity that libraries engage in: (1) technical infrastructure, (2) training and support, (3) software management and curation, (4) policies.

https://doi.org/10.1515/abitech-2023-0031

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Computational Reproducibility of Jupyter Notebooks from Biomedical Publications"


Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows. The reproducibility of computational aspects of research is a key component of scientific reproducibility but has not yet been assessed at scale for Jupyter notebooks associated with biomedical publications. We address computational reproducibility at two levels: First, using fully automated workflows, we analyzed the computational reproducibility of Jupyter notebooks related to publications indexed in PubMed Central. We identified such notebooks by mining the articles’ full text, locating them on GitHub and re-running them in an environment as close to the original as possible. We documented reproduction success and exceptions and explored relationships between notebook reproducibility and variables related to the notebooks or publications. Second, this study represents a reproducibility attempt in and of itself, using essentially the same methodology twice on PubMed Central over two years. Out of 27,271 notebooks from 2,660 GitHub repositories associated with 3,467 articles, 22,578 notebooks were written in Python, including 15,817 that had their dependencies declared in standard requirement files and that we attempted to re-run automatically. For 10,388 of these, all declared dependencies could be installed successfully, and we re-ran them to assess reproducibility. Of these, 1,203 notebooks ran through without any errors, including 879 that produced results identical to those reported in the original notebook and 324 for which our results differed from the originally reported ones. Running the other notebooks resulted in exceptions. We zoom in on common problems, highlight trends and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.

https://arxiv.org/abs/2308.07333

More about Jupyter notebooks.

The Jupyter Notebook is an interactive computing environment that enables users to author notebook documents that include code, interactive widgets, plots, narrative text, equations, images and even video! The Jupyter name comes from three programming languages: Julia, Python, and R. It is a popular tool for literate programming. Donald Knuth first defined literate programming as a script, notebook, or computational document that contains an explanation of the program logic in a natural language (e.g. English or Mandarin), interspersed with snippets of macros and source code that can be compiled and rerun. You can think of it as an executable paper!
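Under the hood, a notebook file (.ipynb) is plain JSON: an ordered list of cells, each holding either narrative text or executable code. This is what makes the kind of automated mining and re-running described in the study above feasible at scale. A minimal sketch using only the Python standard library (the cell contents are illustrative, not taken from any real notebook):

```python
import json

# A minimal notebook: one markdown (narrative) cell and one code cell,
# following the top-level shape of the nbformat 4 schema.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3", "language": "python"}},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Literate programming\n", "The text explains the code below.\n"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(2 + 2)\n"],
        },
    ],
}

# Serialize to the .ipynb on-disk representation and read it back,
# as a mining pipeline would when extracting code cells to re-run.
serialized = json.dumps(notebook, indent=1)
roundtrip = json.loads(serialized)
code_cells = [c for c in roundtrip["cells"] if c["cell_type"] == "code"]
print(len(roundtrip["cells"]), len(code_cells))  # → 2 1
```

Because the format is just JSON, tooling can extract, diff, and re-execute the code cells without launching an interactive session, which is how large-scale reproducibility studies process tens of thousands of notebooks.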

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Code Sharing Increases Citations, but Remains Uncommon"


Overall, R code was available in only 49 of the 1001 papers examined (4.9%) (Figure 1). When included, code was most often in the Supplemental Information (41%), followed by GitHub (20%), Figshare (6%), or other repositories (33%). Open-access publications were 70% more likely to include code than closed-access publications (7.21% vs. 4.22%, χ² = 4.442, p < 0.05). Code-sharing was estimated to increase at 0.5% per year, but this trend was not significant (p = 0.11). The years 2021 and 2022 showed a shift towards more frequent sharing, but the percentage of code-sharing has remained consistently below 15% over the past decade (Figure 1).

We found papers including code disproportionately impact the literature (Figure 2), and accumulate citations faster (i.e., a marginally significant year-by-code-inclusion interaction; p = 0.0863). Further, we found a significant interaction between Open Access and code inclusion (p = 0.0265), with publications meeting both Open Science criteria (i.e., open code and open access) having highest overall predicted citation rates (Figure 2). For example, Open Science papers are expected to receive more than doubled citations (96.25 vs. 36.89) in year 13 post-publication compared with fully closed papers (Figure 2).

https://doi.org/10.21203/rs.3.rs-3222221/v1

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |