"Connecting Fragmented Support on Campus: Growing Research Data Services Programs Through Collaboration"


Research data services are provided by multiple units across and beyond the library, which is why communication and collaboration are paramount to building support for researchers. By exploring how Research Data Services (RDS) programs can function in the fragmented landscape of research support on campuses, we outline the role of collaboration in building programs. In this paper, we discuss building an RDS program by emphasizing three strategies for collaboration: collaborating within the library, collaborating across campus, and collaborating externally with those without direct ties to your organization. The aim of this paper is to offer attainable examples and strategies for building collaborations across campuses for libraries that have small or nascent RDS programs—how to approach and cultivate partnerships, how to set realistic goals, and how to work holistically within the fragmented academy.

https://tinyurl.com/9hbz49df

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "DMPFrame: A Conceptual Metadata Framework for Data Management Plans"


We have examined 12 open-source DMP tools, in particular, to evaluate the metadata adopted by these tools. The current study spots and highlights the gaps in the DMP metadata management in DMP tools and suggests DMPFrame as a conceptual framework addressing such gaps to improve the existing tools’ DMP metadata management practices. Based on the examined DMP tool’s metadata elements analysis and mapping, DMPFrame manages DMP metadata under 6 categories, namely, contributors, project, funding, organization, DMP, and output. The current study also suggests a systematic workflow that DMP tools could incorporate for metadata creation for DMPs.

https://doi.org/10.1080/19386389.2023.2268474

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices"


In recent years, funding agencies and journals increasingly advocate for open science practices (e.g. data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and computer science to analyze the adoption of data and method link-sharing practices over time and their impact on article reception. To identify links to data and methods, we train a neural text classification model to automatically classify URL types based on contextual mentions in papers. We find evidence that the practice of link-sharing to methods and data is spreading as more papers include such URLs over time. Reproducibility efforts may also be spreading because the same links are being increasingly reused across papers (especially in computer science); and these links are increasingly concentrated within fewer web domains (e.g. Github) over time. Lastly, articles that share data and method links receive increased recognition in terms of citation count, with a stronger effect when the shared links are active (rather than defunct). Together, these findings demonstrate the increased spread and perceived value of data and method sharing practices in open science.

https://arxiv.org/abs/2310.03193

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Kristin Briney: The Research Data Management Workbook


The Research Data Management Workbook is made up of a collection of exercises for researchers to improve their data management. The Workbook contains exercises across the data lifecycle, though the range of activities is not comprehensive. Instead, exercises focus on discrete practices within data management that are structured and can be reproduced by any researcher.

The book is divided into chapters, loosely by phases of the data lifecycle, with one or more exercises in each chapter. Every exercise comes with a description of its value within data management, instructions on how to do the exercise, original source of the exercise (when applicable), and the exercise itself.

https://tinyurl.com/2p8sk5xd

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Data Curation in Interdisciplinary and Highly Collaborative Research"


This paper provides a systematic analysis of publications that discuss data curation in interdisciplinary and highly collaborative research (IHCR). Using content analysis methodology, it examined 159 publications and identified patterns in definitions of interdisciplinarity, projects’ participants and methodologies, and approaches to data curation. The findings suggest that data is a prominent component in interdisciplinarity. In addition to crossing disciplinary and other boundaries, IHCR is defined as curating and integrating heterogeneous data and creating new forms of knowledge from it. Using personal experiences and descriptive approaches, the publications discussed challenges that data curation in IHCR faces, including an increased overhead in coordination and management, lack of consistent metadata practices, and custom infrastructure that makes interoperability across projects, domains, and repositories difficult. The paper concludes with suggestions for future research.

https://doi.org/10.2218/ijdc.v17i1.835

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Scholarly Communication Librarianship and Open Knowledge


The book consists of three parts. Part I offers definitions of scholarly communication and scholarly communication librarianship and provides an introduction to the social, economic, technological, and policy/legal pressures that underpin and shape scholarly communication work in libraries. These pressures, which have framed ACRL’s understanding of scholarly communication for the better part of the past two decades, have unsettled many foundational assumptions and practices in the field, removing core pillars of scholarly communication as it was practiced in the twentieth century. These pressures have also cleared fresh ground, and scholarly communication practitioners have begun to seed the space with values and practices designed to renew and often improve the field. Part II begins with an introduction to "open," the core response to the pressures described in part I. This part offers a general overview of the idea of openness in scholarly communication followed by chapters on different permutations and practices of open, each edited by a recognized expert of these areas with authors of their selection. Amy Buckland edited chapter 2.1, "Open Access." Brianna Marshall edited chapter 2.2, "Open Data." Lillian Hogendoorn edited chapter 2.3, "Open Education." Micah Vandegrift edited chapter 2.4, "Open Science and Infrastructure." Each of them brought on incredible expertise through contributors whom they identified, through both original contributions and repurposing existing openly licensed work, which is something we want to model where possible. Part III consists of twenty-four concise perspectives, intersections, and case studies from practicing librarians and closely related stakeholders, which we hope will stimulate discussion and reflection on theory and implications for practice. In every single case, we’re really excited by the editors and authors and the ideas they bring to the whole. Each contribution features light pedagogical apparatuses like suggested further reading, discussion or reflection prompts, and potential activities. It’s all available for free and openly licensed with a Creative Commons Attribution Non-Commercial (CC BY-NC) license, so anyone is encouraged to grab whatever parts are useful and to adapt and repurpose and improve them to meet specific course goals and student needs within the confines of the license.

https://bit.ly/SCLAOK

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"ACME-FAIR: a Guide for Research Performing Organisations (RPO)"


The overall purpose of ACME-FAIR is to help those managing and delivering relevant professional services to self-assess how they are enabling researchers and their colleagues to do just that. Each part deals with one of the key issues that Research Performing Organisations (RPO) face in establishing the capabilities to put the FAIR principles into practice. . . . Each of the 7 guides has a thematic introduction, an overview of the relevant capabilities, and a rubric for assessing the levels of maturity and community engagement for each capability.

https://tinyurl.com/yckfdjtd

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"An Approach to Assess the Quality of Jupyter Projects Published by GLAM Institutions"


Jupyter Notebooks have become a powerful tool to foster use of these collections by digital humanities researchers. Based on previous approaches for quality assessment, which have been adapted for cultural heritage collections, this paper proposes a methodology for assessing the quality of projects based on Jupyter Notebooks published by relevant GLAM institutions. A list of projects based on Jupyter Notebooks using cultural heritage data has been evaluated. Common features and best practices have been identified. A detailed analysis, that can be useful for organizations interested in creating their own Jupyter Notebooks projects, has been provided. Open issues requiring further work and additional avenues for exploration are outlined.

https://doi.org/10.1002/asi.24835

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Umbrella Data Management Plans to Integrate FAIR Data: Lessons From the ISIDORe and BY-COVID Consortia for Pandemic Preparedness"


The Horizon Europe project ISIDORe is dedicated to pandemic preparedness and responsiveness research. It brings together 17 research infrastructures (RIs) and networks to provide a broad range of services to infectious disease researchers. An efficient and structured treatment of data is central to ISIDORe’s aim to furnish seamless access to its multidisciplinary catalogue of services, and to ensure that users’ results are treated FAIRly. ISIDORe therefore requires a data management plan (DMP) covering both access management and research outputs, applicable over a broad range of disciplines, and compatible with the constraints and existing practices of its diverse partners.

Here, we describe how, to achieve that aim, we undertook an iterative, step-by-step, process to build a community-approved living document, identifying good practices and processes, on the basis of use cases, presented as proof of concepts. International fora such as the RDA and EOSC, and primarily the BY-COVID project, furnished registries, tools and online data platforms, as well as standards, and the support of data scientists. Together, these elements provide a path for building an umbrella, FAIR-compliant DMP, aligned as fully as possible with FAIR principles, which could also be applied as a framework for data management harmonisation in other large-scale, challenge-driven projects. Finally, we discuss how data management and reuse can be further improved through the use of knowledge models when writing DMPs and, how, in the future, an inter-RI network of data stewards could contribute to the establishment of a community of practice, to be integrated subsequently into planned trans-RI competence centres.

https://doi.org/10.5334/dsj-2023-035

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Digital Preservation Coalition: "Digital Preservation Documentation: A Guide"


This guide discusses the importance of documentation, who it is for, and highlights some of the features of good and bad documentation. It goes on to provide some tips on creating documentation, including some of the tools or platforms available. Review and update of documentation is discussed, as are requirements for long term preservation.

http://doi.org/10.7207/documentation-23

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Data Management Plan Tools: Overview and Evaluation"


Data Management Plans (DMPs) are crucial for a structured research data management and often a mandatory part of research proposals. DMP tools support the development of DMPs. Among the variety of tools available, it can be difficult for researchers, data stewards and institutions to choose the one that is most appropriate for their specific needs and context. We evaluated 18 DMP tools according to 31 requirement parameters covering aspects relating to basic functions, technical aspects and user-friendliness. The highest total evaluation scores were reached by Data Stewardship Wizard (703.5), DMPTool (615.5) and RDMO NFDI4Ing (549.5). The tools evaluated satisfied between 10 % and 87 % of the requirement parameters. 11 tools cover at least half of the parameters. In terms of correlation among the tools, which indicates to which degree their scores in the different requirement parameters are alike, we found the highest correlation for ezDMP and GFBio DMPT. Regarding the relatedness between the tools, 85 % of the DMP tools were positively and 16 % negatively correlated. Accounting for the recent developments in the area of DMP tools, this study provides an up-to-date evaluation that can support tool developers in identifying potential improvements, and hosting institutions to select a tool suited to their specific needs.

https://tinyurl.com/yewhv8rn

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Understanding Barriers Affecting the Adoption and Usage of Open Access Data in the Context of Organizations"


Although the benefits of organizational adoption are significant, most OAD-related projects fail because of organizational barriers and resistance to adoption. This study first aims to find these organizational barriers to adopting OAD to raise awareness of the obstacles organizations must overcome. Towards this aim, after conducting a systematic literature review (SLR) and an expert panel, a research model based on the Technology – Organization – Environment (TOE) framework is proposed in this study. As a result of SLR, 97 barriers were identified from ten primary studies. After critically examining these barriers, a research model classifying 22 crucial barriers to organizational OAD adoption based on the TOE framework is proposed.

https://doi.org/10.1016/j.dim.2023.100049

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Tracing Data: A Survey Investigating Disciplinary Differences in Data Citation"


Data citations, or citations in reference lists to data, are increasingly seen as an important means to trace data reuse and incentivize data sharing. Although disciplinary differences in data citation practices have been well documented via scientometric approaches, we do not yet know how representative these practices are within disciplines. Nor do we yet have insight into researchers’ motivations for citing — or not citing — data in their academic work. Here, we present the results of the largest known survey (n = 2,492) to explicitly investigate data citation practices, preferences, and motivations, using a representative sample of academic authors by discipline, as represented in the Web of Science (WoS). We present findings about researchers’ current practices and motivations for reusing and citing data and also examine their preferences for how they would like their own data to be cited. We conclude by discussing disciplinary patterns in two broad clusters, focusing on patterns in the social sciences and humanities, and consider the implications of our results for tracing and rewarding data sharing and reuse.

https://doi.org/10.1162/qss_a_00264

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Expanding the Data Ark: An Attempt to Make the Data from Highly Cited Social Science Papers Publicly Available"


Access to scientific data can enable independent reuse and verification; however, most data are not available and become increasingly irrecoverable over time. This study aimed to retrieve and preserve important datasets from 160 of the most highly-cited social science articles published between 2008-2013 and 2015-2018. We asked authors if they would share data in a public repository — the Data Ark — or provide reasons if data could not be shared. Of the 160 articles, data for 117 (73%, 95% CI [67% – 80%]) were not available and data for 7 (4%, 95% CI [0% – 12%]) were available with restrictions. Data for 36 (22%, 95% CI [16% – 30%]) articles were available in unrestricted form: 29 of these datasets were already available and 7 datasets were made available in the Data Ark. Most authors did not respond to our data requests and a minority shared reasons for not sharing, such as legal or ethical constraints. These findings highlight an unresolved need to preserve important scientific datasets and increase their accessibility to the scientific community.

https://doi.org/10.31222/osf.io/w9crz

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Towards a Toolbox for Automated Assessment of Machine-Actionable Data Management Plans"


Most research funders require Data Management Plans (DMPs). The review process can be time consuming, since reviewers read text documents submitted by researchers and provide their feedback. Moreover, it requires specific expert knowledge in data stewardship, which is scarce. Machine-actionable Data Management Plans (maDMPs) and semantic technologies increase the potential for automatic assessment of information contained in DMPs. However, the level of automation and new possibilities are still not well-explored and leveraged. This paper discusses methods for the automation of DMP assessment. It goes beyond generating human-readable reports. It explores how the information contained in maDMPs can be used to provide automated pre-assessment or to fetch further information, allowing reviewers to better judge the content. We map the identified methods to various reviewer goals.

https://doi.org/10.5334/dsj-2023-028

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Engaging with Researchers and Raising Awareness of FAIR and Open Science through the FAIR+ Implementation Survey Tool (FAIRIST)"


Seven years after the seminal paper on FAIR was published, that introduced the concept of making research outputs Findable, Accessible, Interoperable, and Reusable, researchers still struggle to understand how to implement the principles. For many researchers, FAIR promises long-term benefits for near-term effort, requires skills not yet acquired, and is one more thing in a long list of unfunded mandates and onerous requirements for scientists. Even for those required to, or who are convinced that they must make time for FAIR research practices, their preference is for just-in-time advice properly sized to the scientific artifacts and process. Because of the generality of most FAIR implementation guidance, it is difficult for a researcher to adjust to the advice according to their situation. Technological advances, especially in the area of artificial intelligence (AI) and machine learning (ML), complicate FAIR adoption, as researchers and data stewards ponder how to make software, workflows, and models FAIR and reproducible. The FAIR+ Implementation Survey Tool (FAIRIST) mitigates the problem by integrating research requirements with research proposals in a systematic way. FAIRIST factors in new scholarly outputs, such as nanopublications and notebooks, and the various research artifacts related to AI research (data, models, workflows, and benchmarks). Researchers step through a self-serve survey process and receive a table ready for use in their data management plan (DMP) and/or work plan. while gaining awareness of the FAIR Principles and Open Science concepts. FAIRIST is a model that uses part of the proposal process as a way to do outreach, raise awareness of FAIR dimensions and considerations, while providing timely assistance for competitive proposals.

https://doi.org/10.5334/dsj-2023-032

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"The Effects of Research Data Management Services: Associating the Data Curation Lifecycle with Open Research Output"


This study seeks to understand the relationship between research data management (RDM) services framed in the data curation life cycle and the production of open data. An electronic questionnaire was distributed to US researchers and RDM specialists, and the results were analyzed using Chi-Square tests for association. The data curation life cycle does associate with the production of open data and shareable research, but tasks like data management plans have stronger associations with the production of open data. The findings analyze the intersection of these concepts and provide insight into RDM services that facilitate the production of open data and shareable research.

https://doi.org/10.5860/crl.84.5.751

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"re3data — Indexing the Global Research Data Repository Landscape Since 2012"


For more than ten years, re3data, a global registry of research data repositories (RDRs), has been helping scientists, funding agencies, libraries, and data centers with finding, identifying, and referencing RDRs. As the world’s largest directory of RDRs, re3data currently describes over 3,000 RDRs on the basis of a comprehensive metadata schema. The service allows searching for RDRs of any type and from all disciplines, and users can filter results based on a wide range of characteristics. The re3data RDR descriptions are available as Open Data accessible through an API and are utilized by numerous Open Science services. re3data is engaged in various initiatives and projects concerning data management and is mentioned in the policies of many scientific institutions, funding organizations, and publishers. This article reflects on the ten-year experience of running re3data and discusses ten key issues related to the management of an Open Science service that caters to RDRs worldwide.

https://doi.org/10.1038/s41597-023-02462-y

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Towards Research Software-ready Libraries"


Software is increasingly acknowledged as valid research output. Academic libraries adapt to this change to become research software-ready. Software publication and citation are key areas in this endeavor. We present and discuss the current state of the practice of software publication and software citation, and discuss four areas of activity that libraries engage in: (1) technical infrastructure, (2) training and support, (3) software management and curation, (4) policies.

https://doi.org/10.1515/abitech-2023-0031

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Data Management Plan Implementation, Assessments, and Evaluations: Implications and Recommendations"


Data management plans (DMPs) have become nearly a worldwide requirement for research funding. To meet these new funding agency expectations, information professionals across domains and the world have worked to create resources and services to successfully implement and sometimes assess DMPs. This essay presents a series of case studies from different institutions across the globe to highlight current practices and share recommendations for future work. A summary of various projects related to DMP implementation, assessment, and evaluation in different contexts provides a useful overview of current practices. The essay concludes with recommendations for practical oversight and scoring to improve DMPs’ utility in enabling the sharing of data.

https://doi.org/10.5334/dsj-2023-027

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Digital Preservation Maturity Modelling Tool: "New DPC Resource ‘Level-up with RAM’ Now Available on General Release"


Designed to enable rapid benchmarking of an organization’s digital preservation capability, the DPC RAM is a digital preservation maturity modelling tool which is applicable for organizations of any size in any sector, and for all content of long-term value.

https://tinyurl.com/4xxe3dcu

Level up with DPC RAM

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

"Computational Reproducibility of Jupyter Notebooks from Biomedical Publications"


Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows. The reproducibility of computational aspects of research is a key component of scientific reproducibility but has not yet been assessed at scale for Jupyter notebooks associated with biomedical publications. We address computational reproducibility at two levels: First, using fully automated workflows, we analyzed the computational reproducibility of Jupyter notebooks related to publications indexed in PubMed Central. We identified such notebooks by mining the articles full text, locating them on GitHub and re-running them in an environment as close to the original as possible. We documented reproduction success and exceptions and explored relationships between notebook reproducibility and variables related to the notebooks or publications. Second, this study represents a reproducibility attempt in and of itself, using essentially the same methodology twice on PubMed Central over two years. Out of 27271 notebooks from 2660 GitHub repositories associated with 3467 articles, 22578 notebooks were written in Python, including 15817 that had their dependencies declared in standard requirement files and that we attempted to re-run automatically. For 10388 of these, all declared dependencies could be installed successfully, and we re-ran them to assess reproducibility. Of these, 1203 notebooks ran through without any errors, including 879 that produced results identical to those reported in the original notebook and 324 for which our results differed from the originally reported ones. Running the other notebooks resulted in exceptions. We zoom in on common problems, highlight trends and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.

https://arxiv.org/abs/2308.07333

More about Jupyter notebooks.

The Jupyter Notebook is an interactive computing environment that enables users to author notebook documents that include code, interactive widgets, plots, narrative text, equations, images and even video! The Jupyter name comes from 3 programming languages: Julia, Python, and R. It is a popular tool for literate programming. Donald Knuth first defined literate programming as a script, notebook, or computational document that contains an explanation of the program logic in a natural language (e.g. English or Mandarin), interspersed with snippets of macros and source code, which can be compiled and rerun. You can think of it as an executable paper!

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |