"Daily Life in the Open Biologist’s Second Job, as a Data Curator"


Background

Data reusability is the driving force of the research data life cycle. However, implementing strategies to generate reusable data from the data creation to the sharing stages is still a significant challenge. Even when datasets supporting a study are publicly shared, the outputs are often incomplete and/or not reusable. The FAIR (Findable, Accessible, Interoperable, Reusable) principles were published as a general guidance to promote data reusability in research, but the practical implementation of FAIR principles in research groups is still falling behind. In biology, the lack of standard practices for a large diversity of data types, data storage and preservation issues, and the lack of familiarity among researchers are some of the main impeding factors to achieve FAIR data. Past literature describes biological curation from the perspective of data resources that aggregate data, often from publications.

Methods

Our team works alongside data-generating, experimental researchers so our perspective aligns with publication authors rather than aggregators. We detail the processes for organizing datasets for publication, showcasing practical examples from data curation to data sharing. We also recommend strategies, tools and web resources to maximize data reusability, while maintaining research productivity.

Conclusion

We propose a simple approach to address research data management challenges for experimentalists, designed to promote FAIR data sharing. This strategy not only simplifies data management, but also enhances data visibility, recognition and impact, ultimately benefiting the entire scientific community.

https://doi.org/10.12688/wellcomeopenres.22899.1

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Scaling Up Digital Preservation Workflows With Homegrown Tools and Automation"


At NC State University Libraries, the Special Collections Research Center leverages an integrated system of locally developed applications and open-source technologies to facilitate the long-term preservation of digitized and born-digital archival assets. These applications automate many previously manual tasks, such as creating access derivatives from preservation scans and ingest into preservation storage. They have allowed us to scale up the number of digitized assets we create and publish online; born-digital assets we acquire from storage media, appraise, and package; and total assets in local and distributed preservation storage. The origin of these applications lies in scripted workflows put into use more than a decade ago, and the applications were built in close collaboration with developers in the Digital Library Initiatives department between 2011 and 2023. This paper presents a strategy for managing digital curation and preservation workflows that does not solely depend on standalone and third-party applications. It describes our iterative approach to deploying these tools, the functionalities of each application, and sustainability considerations of managing in-house applications and using Academic Preservation Trust for offsite preservation.

https://tinyurl.com/4mjpzth2


"An Analysis of the Effects of Sharing Research Data, Code, and Preprints on Citations"


In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze circa 122’000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations. We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% (±.7) on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% (±.8) on average. However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

https://doi.org/10.1371/journal.pone.0311493


University of Illinois Urbana-Champaign: "Research Data Service Extended Review"


Building upon the first 5-year review in 2019, this report presents an extended review of the Research Data Service at the University of Illinois Urbana-Champaign from 2018-2023, assessing its current efforts and future directions.

https://hdl.handle.net/2142/124781


"Metrics to Increase Data Usage Understanding and Transparency"


Data metrics are essential to assess the impact of data repositories’ holdings and to understand the research practices of the community that they serve. These metrics are useful for reporting to funders, to inform community engagement strategies, and to direct and sustain repository services. In turn, communicating these metrics to the user community conveys transparency and elicits their trust in data sharing. However, because data metrics are time-sensitive and context-dependent, tracking, interpreting, and communicating them is challenging. In this work we introduce data usage analyses including benchmarking and grouping, developed to better assess the impact of the DesignSafe Data Depot, a natural hazards data repository. Make Data Count compliant metrics are analysed in relation to research methods, sub-disciplines, natural hazard types, and time, to learn what data is being used, what influences data usage, and to establish realistic usage expectations. Results are interpreted in relation to the research and publication practices of the community and to natural hazard events. In addition, we introduce strategies to clearly communicate dataset metrics to users.

https://doi.org/10.2218/ijdc.v18i1.929


"Institutionally Based Research Data Services: Current Developments and Future Direction"


The Summit for Academic Institutional Readiness in Data Sharing (STAIRS) was a multi-phased project that brought together a diverse group of representatives from academic institutions across the United States who support research data sharing efforts. Building off preliminary assessment work and a virtual learning series, this was a unique chance to discuss the opportunities and challenges in supporting researchers’ data sharing needs within and across institutions. This report captures the details of the project, including the preliminary assessment work as well as the summit. Following a description of the broad themes and overarching takeaways from this multi-phased effort, we conclude with next steps and future directions for the academic data services community.

https://tinyurl.com/3v8b5xc3


"Research Data Policy: A Library and Information Science Publishers’ Perspective"


Apart from the common features identified in the literature, the authors found numerous distinct research data policy features of publishers, such as deposition of data sets, division of research data policy types, and sharing of research code. Furthermore, institutional publishers with research data policies have more rigid features for the execution of research data policy features since their beneficiaries are uniform, in contrast to the varied nature of journals’ and publishers’ authors.

https://doi.org/10.1007/s11135-024-01994-8


"Data Curation Maturity Model"


The Data Curation Maturity Model (DCMM) enhances the effectiveness of data curation activities with its innovative, mathematically quantifiable matrix approach, specifically tailored to meet the dynamic needs of data curation. By integrating focus area maturity models with specific curation requirements, the DCMM provides a structured progression framework that allows organizations to visualize and measure their advancements in data management. This model not only elevates data quality and reproducibility across various research domains but also establishes a new standard for strategic, targeted improvement in data curation practices.

https://tinyurl.com/yb8am8ba


"Supporting Data Discovery: Comparing Perspectives of Support Specialists and Researchers"


Purpose: Much of the research in data discovery is centered on the users’ viewpoint, frequently overlooking the perspective of those who develop and maintain the discovery infrastructure. Our goal is to conduct a comparative study on research data discovery, examining both support specialists’ and researchers’ views by merging new analysis with prior research insights.

Methods: This work summarizes the studies the authors have conducted over the last seven years investigating the data discovery practices of support specialists from different disciplines. Although support specialists were not the main target of some of these studies, data about their perspectives was collected. Our corpus comprises in-depth interviews with 6 social science support specialists, interviews with 19 researchers and 3 support specialists from multiple disciplines, a global survey with 1630 researchers and 47 support specialists, and a use case analysis of 25 support specialists. In the analysis section, we juxtapose the fresh insights on support specialists’ views with the already documented perspectives of researchers for a holistic understanding. The latter is primarily discussed in the literature review, with references made in the analysis section to draw comparisons.

Results: We found that support specialists’ views on data discovery are not entirely different from those of the researchers. There are, however, some differences that we have identified, most notably the interconnection of data discovery with general web search, literature search, and social networks. . . .

We conclude by proposing recommendations for different types of support work to better support researchers’ data discovery practices.

https://doi.org/10.5334/dsj-2024-048


"IOP Publishing Study Reveals Varied Adoption and Barriers in Open Data Sharing Among Physical Research Communities"


Environmental scientists are the most open with their research data, yet legal constraints related to third-party ownership often limit their ability to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles. Physicists are also willing to share data but have concerns about the accessibility and understanding of the formats used. Engineering and materials scientists face the most significant barriers to sharing FAIR data due to concerns over confidentiality and sensitivity.

https://tinyurl.com/2s3jjzft


"The Paradox of Competition: How Funding Models Could Undermine the Uptake of Data Sharing Practices"


Although beneficial to scientific development, data sharing is still uncommon in many research areas. Various organisations, including funding agencies that endorse open science, aim to increase its uptake. However, estimating the large-scale implications of different policy interventions on data sharing by funding agencies, especially in the context of intense competition among academics, is difficult empirically. Here, we built an agent-based model to simulate the effect of different funding schemes (i.e., highly competitive large grants vs. distributive small grants), and varying intensity of incentives for data sharing on the uptake of data sharing by academic teams strategically adapting to the context. Our results show that more competitive funding schemes may lead to higher rates of data sharing in the short term, but lower rates in the long-term, because the uncertainty associated with competitive funding negatively affects the cost/benefit ratio of data sharing. At the same time, more distributive grants do not allow academic teams to cover the costs and time required for data sharing, limiting uptake. Our findings suggest that without support services and infrastructure to minimise the costs of data sharing and other ancillary conditions (e.g., university policy support, reputational rewards and benefits of data sharing for academic teams), it is unlikely that funding agencies alone can play a leading role for the uptake of data sharing. Therefore, any attempt to reform reward and recognition systems towards open science principles should carefully consider the potential impact of their proposed policies and their long-term side effects.

https://doi.org/10.31222/osf.io/gb4v2


"Changes to Data Management and Sharing (DMS) Plan Progress Reporting and the Submission of Revised DMS Plans Are Coming on October 1"


On October 1, NIH is adding several new Data Management and Sharing (DMS) questions to Research Performance Progress Reports (RPPRs) and updating the process for submitting revised DMS Plans to NIH for review. In brief:

  • As mentioned in a May 2024 Guide Notice, NIH is including several new questions about DMS activities in RPPRs submitted on or after October 1, 2024 (See Guide Notice NOT-OD-24-175). For awards for which the NIH DMS Policy applies, recipients will now be asked:
      • Whether data has been generated or shared to date
      • What repositories any data was shared to and under what unique digital identifier
      • If data has not been generated and/or shared per the award’s DMS Plan, why and what corrective actions have or will be taken to comply with the plan
  • If significant changes to the DMS Plan are anticipated in the coming year, recipients will be asked to explain them and provide a revised DMS Plan for approval.

https://tinyurl.com/4mxwtn8k


"Knowledge Infrastructures are Growing Up: The Case for Institutional (Data) Repositories 10 Years After the Holdren Memo"


Institutional data repositories are uniquely positioned to support researchers in sharing scholarly outputs. As funding agencies develop and institute policies for research data access and sharing, institutional data repositories have emerged as a critical feature in ecosystems for data stewardship and sharing. We show that institutional data repositories can meet and exceed the requirements and recommendations of federal data policy, thereby maximizing the benefits of data sharing. We present results of a mixed-method study which explores the adoption and usage of institutional repositories to share data from 2017 to 2023. Data from two previous studies were combined with data collected in 2023 on the data sharing solutions of Association of Research Libraries member institutions in the United States and Canada. The analysis of the aggregated data indicates that data stewardship has increased in both institutional repositories and institutional data repositories with an increase in complementary infrastructure to support data sharing. We then conduct an “infrastructural inversion” (Bowker & Star, 1999) to ‘surface invisible work’ of making data repositories function well, and demonstrate that institutional data repositories have advantages for providing sustainable stewardship, curation, and sharing of research data. Finally, we show that institutional data repositories may produce additional benefits through established infrastructure, local interoperability, and control.

https://doi.org/10.5334/dsj-2024-046


Plan S: "New Tool to Assess Equity in Scholarly Communication Models"


The tool [https://tinyurl.com/2crwwhes], which was inspired by the “How Open Is It?” framework, is targeted at institutions, library consortia, funders and publishers, i.e. the stakeholders either investing or receiving funds for publishing services. It offers users the opportunity to rate scholarly communication models and arrangements across seven criteria:

  • Access to Read
  • Publishing immediate Open Access
  • Maximizing participation
  • Re-use rights
  • Pricing and fee transparency
  • Promoting and encouraging open research practices: data and code
  • Promoting and encouraging open research practices: preprints and open peer review

https://tinyurl.com/ycwmp3nk


"A Proposal for a FAIR Management of 3D Data in Cultural Heritage: The Aldrovandi Digital Twin Case"


In this article we analyse 3D models of cultural heritage with the aim of answering three main questions: what processes can be put in place to create a FAIR-by-design digital twin of a temporary exhibition? What are the main challenges in applying FAIR principles to 3D data in cultural heritage studies and how are they different from other types of data (e.g. images) from a data management perspective? We begin with a comprehensive literature review touching on: FAIR principles applied to cultural heritage data; representation models; both Object Provenance Information (OPI) and Metadata Record Provenance Information (MRPI), respectively meant as, on the one hand, the detailed history and origin of an object, and – on the other hand – the detailed history and origin of the metadata itself, which describes the primary object (whether physical or digital); 3D models as cultural heritage research data and their creation, selection, publication, archival and preservation. We then describe the process of creating the Aldrovandi Digital Twin, by collecting, storing and modelling data about cultural heritage objects and processes. We detail the many steps from the acquisition of the Digital Cultural Heritage Objects (DCHO), through to the upload of the optimised DCHO onto a web-based framework (ATON), with a focus on open technologies and standards for interoperability and preservation. Using the FAIR Principles for Heritage Library, Archive and Museum Collections as a framework, we look in detail at how the Digital Twin implements FAIR principles at the object and metadata level. We then describe the main challenges we encountered and we summarise what seem to be the peculiarities of 3D cultural heritage data and the possible directions for further research in this field.

https://arxiv.org/abs/2407.02018


Research Data Alliance: "Recommendations on Data Versioning"


We often say that “A is a version of B” but do not explain what we mean by “version”. We imply that B was somehow derived from A or that they share a common ancestor. But how is B related to A? How do they differ? Do they differ in content or format? What is the significance of this difference? While this sounds like a question about the provenance of a dataset, it goes beyond that and asks questions about the identity of a digital object and the intellectual and creative work it embodies.

The Research Data Alliance Data Versioning Working Group (https://www.rd-alliance.org/groups/data-versioning-ig/) collected over forty use cases of versioning practices for data and software and published a set of principles distilled from the group’s analysis of them. These Principles define terminology that helps us differentiate different types of versioning and thus allow us to address the use cases more precisely. In follow-up discussions, we learned that the Principles are too abstract to apply to the operation of data repositories or to guide the citation of digital resources. Therefore, this document aims to translate the Principles into actionable recommendations for data versioning.

https://doi.org/10.5281/zenodo.13743876


Paywall: "Ephemeral Geodata: An Impending Digital Dark Age"


Despite the unprecedented rate of geospatial data (“geodata”) generation, we are paradoxically creating a potential “dark age” in geospatial knowledge due to a failure to archive it. In the twentieth century, map libraries systematically collected and preserved government-issued maps. However, many have not expanded to include digital formats, which have replaced paper maps in most domains. Compounding this issue is the prevailing practice among government data providers to continuously update public data without adequately preserving previous iterations, thus overwriting the historical record.

https://doi.org/10.1080/15420353.2024.2398542


Paywall: "The FAIRification Process for Data Stewardship: A Comprehensive Discourse on the Implementation of the Fair Principles for Data Visibility, Interoperability and Management"


Using a systematic literature review, the study focuses on the implementation of these [FAIR] principles in research data management and their applicability in data repositories and data centres. It highlights the importance of implementing these principles systematically, allowing stakeholders to choose the minimum requirements and provide a vision for implementing them in data repositories and data centres. The article also highlights the steps in the FAIRification process, which can enhance data interoperability, discovery and reusability.

https://doi.org/10.1177/03400352241270692


Paywall: "Research on the Generation Mechanism and Action Mechanism of Scientific Data Reuse Behavior"


Specifically, this study takes scientific data reuse attitudes as a breakthrough to discuss the factors that influence researchers’ scientific data reuse attitudes and the extent to which these factors influence scientific data reuse behaviors. It also further explores the impact of scientific data reuse behavior on research and innovation performance and the moderating effect of scientific data services on scientific data reuse behavior.

https://doi.org/10.1016/j.acalib.2024.102921


"Artificial Intelligence Assisted Curation of Population Groups in Biomedical Literature"


Curation of the growing body of published biomedical research is of great importance to both the synthesis of contemporary science and the archiving of historical biomedical literature. Each of these tasks has become increasingly challenging given the expansion of journal titles, preprint repositories and electronic databases. Added to this challenge is the need for curation of biomedical literature across population groups to better capture study populations for improved understanding of the generalizability of findings. To address this, our study aims to explore the use of generative artificial intelligence (AI) in the form of large language models (LLMs) such as GPT-4 as an AI curation assistant for the task of curating biomedical literature for population groups. We conducted a series of experiments which qualitatively and quantitatively evaluate the performance of OpenAI’s GPT-4 in curating population information from biomedical literature. Using OpenAI’s GPT-4 and curation instructions, executed through prompts, we evaluate the ability of GPT-4 to classify study ‘populations’, ‘continents’ and ‘countries’ from a previously curated dataset of public health COVID-19 studies.

Using three different experimental approaches, we examined performance by: A) evaluation of accuracy (concordance with human curation) using both exact and approximate string matches within a single experimental approach; B) evaluation of accuracy across experimental approaches; and C) conducting a qualitative phenomenology analysis to describe and classify the nature of difference between human curation and GPT curation. Our study shows that GPT-4 has the potential to provide assistance in the curation of population groups in biomedical literature. Additionally, phenomenology provided key information for prompt design that further improved the LLM’s performance in these tasks. Future research should aim to improve prompt design, as well as explore other generative AI models to improve curation performance. An increased understanding of the populations included in research studies is critical for the interpretation of findings, and we believe this study provides keen insight on the potential to increase the scalability of population curation in biomedical studies.
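The accuracy measure in (A), concordance with human curation under exact and approximate string matching, can be illustrated with a minimal sketch. This is not the authors' code: the helper names, the sample labels, and the 0.8 similarity threshold are all assumptions made for illustration, and the approximate match here uses Python's standard `difflib` similarity ratio, which may differ from the matching method used in the study.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(label.lower().split())


def is_exact(human: str, model: str) -> bool:
    """Exact match after normalization."""
    return normalize(human) == normalize(model)


def is_approx(human: str, model: str, threshold: float = 0.8) -> bool:
    """Approximate match: similarity ratio at or above a threshold.

    The 0.8 threshold is an arbitrary illustrative choice.
    """
    ratio = SequenceMatcher(None, normalize(human), normalize(model)).ratio()
    return ratio >= threshold


def concordance(
    human: Sequence[str],
    model: Sequence[str],
    match: Callable[[str, str], bool],
) -> float:
    """Fraction of records where the model label matches the human label."""
    if len(human) != len(model):
        raise ValueError("label lists must have the same length")
    if not human:
        return 0.0
    hits = sum(1 for h, m in zip(human, model) if match(h, m))
    return hits / len(human)


if __name__ == "__main__":
    # Hypothetical labels, not data from the study.
    human_labels = ["Asia", "United Kingdom", "Europe"]
    model_labels = ["asia", "United Kingdon", "Africa"]
    print("exact:", concordance(human_labels, model_labels, is_exact))
    print("approx:", concordance(human_labels, model_labels, is_approx))
```

Reporting both figures side by side shows how much of the disagreement comes from minor spelling or formatting variation rather than a genuinely different classification.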

https://doi.org/10.2218/ijdc.v18i1.950


"Unfolding the Downloads of Datasets: A Multifaceted Exploration of Influencing Factors"


Scientific data are essential to advancing scientific knowledge and are increasingly valued as scholarly output. Understanding what drives dataset downloads is crucial for their effective dissemination and reuse. Our study, analysing 55,473 datasets from 69 data repositories, identifies key factors driving dataset downloads, focusing on interpretability, reliability, and accessibility. We find that while lengthy descriptive texts can deter users due to complexity and time requirements, readability boosts a dataset’s appeal. Reliability, evidenced by factors like institutional reputation and citation counts of related papers, also significantly increases a dataset’s attractiveness and usage. Additionally, our research shows that open access to datasets increases their downloads and amplifies the importance of interpretability and reliability. This indicates that easy access enhances the overall attractiveness and usage of datasets in the scholarly community. By emphasizing interpretability, reliability, and accessibility, this study offers a comprehensive framework for future research and guides data management practices toward ensuring clarity, credibility, and open access to maximize the impact of scientific datasets.

https://doi.org/10.1038/s41597-024-03591-8


"Curation is Communal: Transparency, Trust, and (In)visible Labour"


Research about trust and transparency within the realm of research data management and sharing typically centres on accreditation and compliance. Missing from many of these conversations are the social systems and enabling structures that are built on interpersonal connections. As members of the Data Curation Network (DCN), a consortium of United States-based institutional and non-profit data repositories, we have experienced first-hand the effort required to develop and sustain interpersonal trust and the benefits it provides to curation. In this paper, we reflect on the well-documented realities of curator and labour invisibility; the importance of fostering active communities (such as the DCN); and how trust, vulnerability and connectivity among colleagues leads to better curation practices. Through an investigation into data curators in the DCN, we found that, while curation can be isolating and invisible work, having a network of trusted peers helps alleviate these burdens and makes us better curators. We conclude with practical suggestions for implementing trust and transparency in relationships with colleagues and researchers.

https://doi.org/10.2218/ijdc.v18i1.938


"Closing Gaps: A Model of Cumulative Curation and Preservation Levels for Trustworthy Digital Repositories"


Curation and preservation measures carried out by digital repository staff are an important building block in maintaining the accessibility and usability of digital resources over time. The measures adequate to achieve long-term usability for a given audience strongly depend on scenarios of (re)use, the (intended) users’ needs and skills, the organisational setting (e.g., mission, resources, policies), as well as the characteristics of the digital objects to be preserved. The assessment of curation and preservation measures also forms an important part of existing certification procedures for trustworthy digital repositories (TDRs) as offered, for example, by the CoreTrustSeal foundation, the nestor network, or ISO.

The digital curation community is presented with the challenge of finding community-, organisation-, and object-specific approaches to curation and preservation at the same time as defining the minimum level of curation and preservation measures expected from a TDR in sufficiently generic terms to ensure applicability to a wide array of repositories. Against this backdrop, this paper discusses the need for and benefits of community-agreed levels of curation and preservation to address this challenge, and considers the tiered model proposed by the CoreTrustSeal Board as an example.

The proposed model is then applied in an analysis of successful CoreTrustSeal applications from 2018–2022 in an effort to better understand the capacity of the curation and preservation levels to capture the respective practices of repositories and to identify potential gaps.

https://doi.org/10.2218/ijdc.v18i1.926

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Ten Simple Rules for Recognizing Data and Software Contributions in Hiring, Promotion, and Tenure"


The ways in which promotion and tenure committees operate vary significantly across universities and departments. While committees often have the capability to evaluate the rigor and quality of articles and monographs in their scientific field, assessment with respect to practices concerning research data and software is a recent development and one that can be harder to implement, as there are few guidelines to facilitate the process. More specifically, the guidelines given to tenure and promotion committees often reference data and software in general terms, with some notable exceptions such as the guidelines in [5], and are almost systematically trumped by other factors such as the number and perceived impact of journal publications. The core issue is that many colleges establish a scholarship versus service dichotomy: Peer-reviewed articles or monographs published by university presses are considered scholarship, while community service, teaching, and other categories are given less weight in the evaluation process. This dichotomy unfairly disadvantages digital scholarship and community-based scholarship, including data and software contributions [6]. In addition, there is a lack of resources for faculties to facilitate the inclusion of responsible data and software metrics into evaluation processes or to assess faculty’s expertise and competencies to create, manage, and use data and software as research objects. As a result, the outcome of the assessment by the tenure and promotion committee is as dependent on the guidelines provided as on the committee members’ background and proficiency in the data and software domains.

The presented guidelines aim to help alleviate these issues and align the academic evaluation processes to the principles of open science. We focus here on hiring, tenure, and promotion processes, but the same principles apply to other areas of academic evaluation at institutions. While these guidelines are by no means sufficient for handling the complexity of a multidimensional process that involves balancing a large set of nuanced and diverse information, we hope that they will support an increasing adoption of processes that recognize data and software as key research contributions.

https://doi.org/10.1371/journal.pcbi.1012296

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Transparent Disclosure, Curation & Preservation of Dynamic Digital Resources"


This paper explores an enhanced curation lifecycle being developed at the UK Data Service (UKDS), with our Data Product Builder. Through a Graphical User Interface, we aim to provide the researcher with a tailored digital resource. We detail the threefold motivation behind this initiative: data dissemination scalability, researcher satisfaction and the reduction of nationwide duplication of research effort.

Subsequent sections detail the technical components and challenges involved. In addition to more standard data subsetting, filtering and linking components, this data dissemination platform offers dynamic disclosure assessments – identifying combinations of variables that present a potential disclosure risk. All components are underpinned by the Data Documentation Initiative’s new Cross-Domain Integration standard (DDI-CDI), designed to handle the many structures in which data may be organised.

Ever conscious of the scale of the task we are embarking on, we remain motivated by the need for such advances in data dissemination and optimistic about the feasibility of such a system to meet the needs of the researcher while balancing the data disclosivity concerns of the data depositor.

https://doi.org/10.2218/ijdc.v18i1.937

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |