“Frontiers introduces FAIR² Data Management”


FAIR² Data Management leverages AI-assisted curation to structure research data for publication, making it easier to find, reuse, and analyze—both by humans and machines—so researchers can focus on discovery rather than data preparation. By making datasets shareable and optimized for reuse, FAIR² Data Management enhances research efficiency and reproducibility, accelerating breakthroughs in global health, planetary sustainability, and scientific innovation. . . .

FAIR² (FAIR Squared) extends the FAIR principles by defining a formal specification that makes research data AI-ready, aligned with Responsible AI principles, and structured for deep scientific reuse. Compatible with MLCommons Croissant’s AI-ready format, it integrates essential elements for scientific rigor, reproducibility, and interoperability. FAIR² ensures data is richly documented and linked to provenance, methodology, and a detailed data dictionary, creating a context-rich representation of each dataset. It also integrates with TensorFlow, JAX, and PyTorch, enabling AI-driven analysis and easy sharing on Kaggle and Hugging Face, amplifying its impact across disciplines.

https://tinyurl.com/3bwjbsw6

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

“Developing Practices for FAIR and Linked Data in Heritage Science”


Heritage Science has a lot to gain from the Open Science movement but faces major challenges due to the interdisciplinary nature of the field, as a vast array of technological and scientific methods can be applied to any imaginable material. Historical and cultural contexts are as significant as the methods and material properties, which is something the scientific templates for research data management rarely take into account. While the FAIR data principles are a good foundation, they do not offer enough practical help to researchers facing increasing demands from funders and collaborators. In order to identify the issues and needs that arise “on the ground floor”, the staff at the Heritage Laboratory at the Swedish National Heritage Board took part in a series of workshops with case studies. The results were used to develop guides for good data practices and a list of recommended online vocabularies for standardised descriptions, necessary for findable and interoperable data. However, the project also identified areas where there is a lack of useful vocabularies and the consequences this could have for discoverability of heritage studies on materials from areas of the world that have historically been marginalised by Western culture. If Heritage Science as a global field of study is to reach its full potential this must be addressed.

https://doi.org/10.1038/s40494-025-01598-x

“The Economic Impact of Open Science: A Scoping Review”


This paper summarises a comprehensive scoping review of the economic impact of Open Science (OS), examining empirical evidence from 2000 to 2023. It focuses on Open Access (OA), Open/FAIR Data (OFD), Open Source Software (OSS), and Open Methods, assessing their contributions to efficiency gains in research production, innovation enhancement, and economic growth. Evidence, although limited, indicates that OS accelerates research processes, reduces the related costs, and fosters innovation by improving access to data and resources, which ultimately generates economic growth. Specific sectors, such as life sciences, are researched more and the literature exhibits substantial gains, mainly thanks to OFD and OA. OSS supports productivity, while the very limited studies on Open Methods indicate benefits in terms of productivity gains and innovation enhancement. However, gaps persist in the literature, particularly in fields like Citizen Science and Open Evaluation, for which no empirical findings on economic impact could be detected. Despite limitations, empirical evidence on specific cases highlights economic benefits. This review underscores the need for further metrics and studies across diverse sectors and regions to fully capture OS’s economic potential.

https://doi.org/10.31222/osf.io/kqse5_v1

“Building Trustworthy AI Solutions: Integrating Artificial Intelligence Literacy into Records Management and Archival Systems”


This paper explores the essential role of Artificial Intelligence (AI) competencies and literacy in the fields of records management and archival practices, within the framework of the InterPARES Trust AI project. . . . The study employs two complementary approaches: (1) a detailed competency framework developed through literature reviews, interviews with archival professionals who have applied AI to the processing of records, and validation workshops with practitioners; and (2) a comprehensive AI literacy framework derived from multiple case studies and theoretical discussions. . . . Findings indicate that archival professionals can leverage AI in their work practices by acquiring basic AI literacy, practical AI skills, data-related skills, tool-testing and evaluation, adaptation of AI to their workflows, and by actively engaging in collaborative projects with information technology (IT) developers.

https://doi.org/10.48550/arXiv.2307.14852

“Datafication and Cultural Heritage Collections Data Infrastructures: Critical Perspectives on Documentation, Cataloguing and Data-sharing in Cultural Heritage Institutions”


The role of cultural heritage collections within the research ecosystem is rapidly changing. From often-passive primary source or reference point for humanities research, cultural heritage collections are now becoming an integral part of large-scale interdisciplinary inquiries using computational-driven methods and tools. This new status for cultural heritage collections, in the ‘collections-as-data’ era, would not be possible without foundational work that was and is still going on ‘behind the scenes’ in cultural heritage institutions through cataloguing, documentation and curation of cultural heritage records. This article assesses the landscape for cultural heritage collections data infrastructure in the UK through an empirical and critical perspective, presenting insights on the infrastructure that cultural heritage organisations use to record and manage their collections, exploring the range of systems being used, the levels of complexity or ease at which collections data can be accessed, and the shape of interactions between software suppliers, cultural heritage organisations, and third-party partners. The paper goes on to include a critical analysis of the findings based on the sector’s approach to ‘3s’, that is standards, skill sets and scale, and how that applies to different cultural heritage organisations throughout the data lifecycle, from data creation and stewardship to sharing and re-using.

https://doi.org/10.5334/johd.277

“Building as They Come: Comparative Case Studies of Co-constructing Data Visualization Services with Academic Communities”


Academic libraries are well-situated to be strong supporters of democratizing and building knowledge and expertise in the use of data and data visualization as they cut across all of academia, regardless of discipline or department. Within the past decade, many academic libraries across North America have added data visualization services to their offerings. This has been done in several ways, from existing librarians with related portfolios like GIS or research data learning new skills to libraries creating new positions with a portfolio focused on data visualization. This chapter presents and compares two case studies of building data visualization services at York University Libraries and McMaster University Library.

https://hdl.handle.net/10315/42647

“Data and Code Availability in Political Science Publications from 1995 to 2022”


In this paper, we assess the availability of reproduction archives in political science. By “reproduction archive,” we mean the data and code supporting quantitative research articles that allows others to reproduce the computations described in the published paper. We collect a random sample of quantitative research articles published in political science from 1995 to 2022. We find that, even in 2022, most quantitative research articles do not point to a reproduction archive. However, practices are improving. In 2014, when the DA-RT symposium was published in PS, about 12% of quantitative research articles pointed to the data and code. Eight years later, in 2022, that has increased to 31%. This underscores a massive shift in norms, requirements, and infrastructure. Still, only a minority of articles share the supporting data and code.

https://doi.org/10.31235/osf.io/a5yxe_v2

“Peer Review of Data Papers: Does It Achieve Expectations for Facilitating Data Sharing and Reuse?”


This paper presents a qualitative study of open peer review reports of data papers in the data journal Earth System Science Data. We examine to what extent the actual review practices of data papers align with identifying the most valuable datasets and promoting data reuse. We conclude that peer reviewers adopted a variety of criteria to evaluate data papers, but it is still challenging for reviewers to identify the most valuable datasets that should be reused. In addition, our findings demonstrate the correlation between data paper evaluations and subsequent reuse of the underlying datasets.

https://dx.doi.org/10.2139/ssrn.5130257

“Data Stewardship Decoded: Mapping Its Diverse Manifestations and Emerging Relevance at a Time of AI”


Data stewardship has become a critical component of modern data governance, especially with the growing use of artificial intelligence (AI). Despite its increasing importance, the concept of data stewardship remains ambiguous and varies in its application. This paper explores four distinct manifestations of data stewardship to clarify its emerging position in the data governance landscape. These manifestations include a) data stewardship as a set of competencies and skills, b) a function or role within organizations, c) an intermediary organization facilitating collaborations, and d) a set of guiding principles. The paper subsequently outlines the core competencies required for effective data stewardship, explains the distinction between data stewards and Chief Data Officers (CDOs), and details the intermediary role of stewards in bridging gaps between data holders and external stakeholders. It also explores key principles aligned with the FAIR framework (Findable, Accessible, Interoperable, Reusable) and introduces the emerging principle of AI readiness to ensure data meets the ethical and technical requirements of AI systems. The paper emphasizes the importance of data stewardship in enhancing data collaboration, fostering public value, and managing data reuse responsibly, particularly in the era of AI. It concludes by identifying challenges and opportunities for advancing data stewardship, including the need for standardized definitions, capacity building efforts, and the creation of a professional association for data stewardship.

https://arxiv.org/abs/2502.10399

Data Curation Network: “New Format Data Curation Primers in 2024”


We’re excited to share three new data curation primers released by the Data Curation Network, focusing on critical formats and approaches in scientific and cultural data management: FITS (Flexible Image Transport System), TIFF (Tagged Image File Format), and Linked Data.

(Links added in the above.)

https://tinyurl.com/3kht2syn

U.S. Research Data Summit: Strengthening Cooperation Across Organizations and Sectors: Proceedings of a Workshop


On October 10-11, 2023, the National Academies of Sciences, Engineering, and Medicine hosted the U.S. Research Data Summit at the National Academy of Sciences Building in Washington, DC. The summit was undertaken by a planning committee organized under the U.S. National Committee for CODATA. The summit was informed by input from 29 organizations, including leaders from federal government agencies, the private sector, public and nonprofit organizations, and research institutions. This publication summarizes the presentations and discussion of the summit.

https://tinyurl.com/yjbuhkwz

“Supporting the Research Data Management Journey of a Postgraduate Student at the University of St Andrews”


Most research funders have requirements for data management plans and open data to foster good research data management practices. In order to embed these practices in the postgraduate research (PGR) student journey we have introduced the requirement for a data management plan as part of the first-year progress review and the encouragement to make data underpinning theses publicly available. To support students through these processes we provide a suite of training workshops and are available for one-to-one consultations. User feedback and frequently asked questions are used to review and improve our support offering.

This brief report discusses the planning and implementation processes for the data management plan requirement and the encouragement to share underpinning data. It dives deeper into the workflows, especially for the data deposit, and describes training and support available to students. Statistics on training uptake, data management plan submissions and annual trends for data deposit are also presented. The report concludes with lessons learnt and the team’s plans for the near future.

https://doi.org/10.2218/ijdc.v19i1.980

“How Will We Prepare for an Uncertain Future? The Value of Open Data and Code for Unborn Generations Facing Climate Change”


What is the unit of knowledge that we would most like to protect for future generations? Is it the scientific publication? Or is it our datasets? Datasets are snapshots in space and time of n-dimensional hypervolumes of information that are resources in and of themselves—each giving numerous insights into the measured world [134,135]. New publishing paradigms, such as Octopus, allow researchers to link multiple ‘Analysis’ and/or ‘Interpretation’ publications to a single ‘Results’ publication as alternative analyses and interpretations of the same data [159]. A more traditional research paper, on the other hand, is one realization of many possible assessments of the data that were originally collected, and a wide diversity of results can be obtained when many individuals analyse one dataset with the same research question in mind [160,161]. That is, publications are one version of an oversimplified projection through n-dimensional space which communicate stories that our human minds can comprehend. Manuscript narratives, by necessity, leave out information to craft such a story.

This is not to say that scientific publications in and of themselves are not useful. On the contrary, they frame our current and historical understanding of the world and put scientific inquiry into the relevant spatial and temporal context. Scientific articles offer analysis and interpretation of data which will allow future generations to understand why certain policies, management actions, or approaches were attempted and/or abandoned. However, if future researchers are not granted access to our (past) data, future humans will have to repeat costly (e.g. time and resources) experiments, laboriously extract information directly from figures, tables and text in the articles themselves (assuming the relevant information is available and detailed enough, although there is evidence that this is not the case in at least some disciplines [55,162]) or will have to trust our analytical procedures and our intuitions and perceptions about the data we collected [160,161].

https://doi.org/10.1098/rspb.2024.1515

“Leveraging Task-Specific Large Language Models to Enhance Research Data Management Services”


Applying prompt engineering and RAG [Retrieval-Augmented Generation] to research data management and sharing activities offers numerous opportunities for enhancing institutional research data support services. Here, we present just a few illustrative examples that highlight how these technologies could significantly improve service efficiencies, reduce researcher burden, and support adherence to evolving policies. These examples aim to inspire further exploration and future work rather than serve as extensive case studies.

  • Task-Specific, Agent-Based Chatbots for Data Management and Sharing Plans (DMSPs): Agent-based chatbots can assist researchers in drafting DMSPs by prompting for specific information based on funder requirements. This would offer researchers an interactive, guided experience that streamlines the process of developing a DMSP. The chatbot can be pre-loaded with knowledge of DMSP policies, institutional resources, and common pitfalls observed during plan reviews. Moreover, by incorporating review criteria, these chatbots could also provide real-time feedback on draft plans, allowing researchers to refine their submissions before institutional review.
  • Automated Text Extraction for Structured Compliance Reporting: Using these approaches, institutions can also automate the extraction of key details from narrative-based DMSPs and transform them into structured, formatted fields. This could be particularly useful for converting narrative-based DMSPs into actionable steps for researchers, service providers, and compliance officers, enabling efficient monitoring and follow-up on data management and sharing commitments.
  • Customized Knowledge Retrieval for Policy Guidance and Updates: Institutions can further leverage these approaches to develop tools that offer researchers up-to-date guidance on data management and sharing policies from major funders and publishers as well as institutional requirements. For instance, a researcher could query these tools to receive the latest mandates, institutional requirements, or best practices related to data management and sharing. This capability would reduce the burden for researchers in tracking down the most recent policy update.
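The retrieval idea in the last bullet can be illustrated with a minimal sketch. It is not taken from the article: it assumes a hypothetical in-memory list of policy snippets (`POLICY_SNIPPETS`) and uses a toy word-overlap ranking in place of a real embedding index. `build_prompt` only assembles the text that would be passed to a language model; it does not call one.

```python
import re

# Hypothetical policy snippets for illustration; a real service would index
# actual funder, publisher, and institutional documents.
POLICY_SNIPPETS = [
    {"source": "Funder A (hypothetical)",
     "text": "Data management and sharing plans must name the repository where data will be deposited."},
    {"source": "Funder B (hypothetical)",
     "text": "Software and code supporting publications should be archived with a persistent identifier."},
    {"source": "Institution (hypothetical)",
     "text": "Sensitive human participant data must be de-identified before sharing."},
]

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, snippets, k=2):
    """Rank snippets by word overlap with the question (toy stand-in for vector search)."""
    query_words = tokenize(question)
    ranked = sorted(
        snippets,
        key=lambda s: len(query_words & tokenize(s["text"])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, snippets, k=2):
    """Assemble a prompt that grounds the answer in the retrieved excerpts."""
    context = "\n".join(
        f"- [{s['source']}] {s['text']}" for s in retrieve(question, snippets, k)
    )
    return (
        "Answer using only the policy excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}\n"
    )

if __name__ == "__main__":
    print(build_prompt("Which repository should my data be deposited in?", POLICY_SNIPPETS))
```

Keeping retrieval separate from prompt assembly means the snippet store can be refreshed when a policy changes without touching the prompt template, which is the property the bullet on policy updates relies on.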

https://tinyurl.com/bdee5u29

Paywall: Data Culture in Academic Libraries: A Practical Guide to Building Communities, Partnerships, and Collaborations


In five parts, Data Culture in Academic Libraries: A Practical Guide to Building Communities, Partnerships, and Collaborations can help you foster an institutional culture that favors the curation, creation, and wider use of datasets.

  • Data at all Levels
  • Data Services and Instruction
  • Data Outreach
  • Data Communities
  • Data Partnerships

https://tinyurl.com/ydsmdjbj

“From Data Creator to Data Reuser: Distance Matters”


Sharing research data is necessary, but not sufficient, for data reuse. Open science policies focus more heavily on data sharing than on reuse, yet both are complex, labor-intensive, expensive, and require infrastructure investments by multiple stakeholders. The value of data reuse lies in relationships between creators and reusers. By addressing knowledge exchange, rather than mere transactions between stakeholders, investments in data management and knowledge infrastructures can be made more wisely. Drawing upon empirical studies of data sharing and reuse, we develop the metaphor of distance between data creator and data reuser, identifying six dimensions of distance that influence the ability to transfer knowledge effectively: domain, methods, collaboration, curation, purposes, and time and temporality. We explore how social and socio-technical aspects of these dimensions may decrease – or increase – distances to be traversed between creators and reusers. Our theoretical framing of the distance between data creators and prospective reusers leads to recommendations to four categories of stakeholders on how to make data sharing and reuse more effective: data creators, data reusers, data archivists, and funding agencies. ‘It takes a village’ to share research data – and a village to reuse data. Our aim is to provoke new research questions, new research, and new investments in effective and efficient circulation of research data; and to identify criteria for investments at each stage of data and research life cycles.

https://tinyurl.com/3429p526

“Research Data Lifecycle (RDLC): An Investigation into the Disciplinary Focus, Use Cases, Creator Backgrounds, Stages and Shapes of RDLC Models”


In this paper, we report the results of a study examining 78 Research and Data Lifecycle (RDLC) models located in a review of the literature. Through synthesis-analysis and the nominal group technique, we investigated the RDLC models from the point of view of their disciplinary focus, use cases, model creators, as well as the specific stages and shapes. Our study revealed that the majority of the disciplinary focus for the models was generic, science, or multi-disciplinary. Models originating in the social sciences and humanities are less common. The use cases varied in a wide spectrum, with a total of 34 different scenarios. The creators and authors of the RDLC models came from more than 20 countries with the majority of the models created as a result of collaboration within or across different organizations. Our stage and shape analysis also outlined key characteristics of the RDLC models by showing the commonalities and variations of named stages and varying structures of the models. As one of the first empirical investigations examining the deep substance of the RDLC models, our study provides significant insights into the context and setting where the models were developed, as well as the details with regard to the stages and shapes, and thereby identified gaps that may impact the use and value of the models. As such, our study establishes a foundation for further studies on the practical utilization of the RDLC models in research data management practice and education.

https://doi.org/10.2218/ijdc.v19i1.860

“Research Data Management and Crowdsourcing Personal Histories”


Drawing on experiences of the University of Oxford’s Sustainable Digital Scholarship (SDS) service and the World War Two crowdsourcing project ‘Their Finest Hour’, this paper explores how institutional digital repositories (such as the SDS platform) can be successfully leveraged to publish and sustainably host crowdsourced (‘warm-data’) collections beyond their funding period.

The paper examines the challenges in applying FAIR (Findable, Accessible, Interoperable, Reusable) principles to a collection containing first-hand testimonies and digitised objects of significant sentimental value, addressing both practical and ethical considerations, including the management of copyright, handling of sensitive material, use of AI tools and adherence to good research data management practices, with limited resources.

Reflecting on the importance of a caring approach to data stewardship, the paper examines how the ethos of the Their Finest Hour project, and its commitment to honouring contributors and their families, led organically to an alignment with CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles, originally developed for Indigenous data governance. It also explores the potential for the wider application of CARE principles for crowdsourced collections such as the Their Finest Hour Online Archive, while acknowledging and respecting the origins of this framework.

Lastly, it offers some practical ‘lessons learned’ to help GLAM and Higher Education professionals working with crowdsourced collections and personal histories to navigate some of the research data management challenges that they may encounter, while also highlighting the importance of understanding FAIR and CARE principles and how they can be applied to these types of data collections.

https://doi.org/10.5334/johd.265

“Copyright and Licencing for Cultural Heritage Collections as Data”


Cultural Heritage (CH) institutions have been exploring innovative ways to publish digital collections to facilitate reuse, through initiatives like Collections as data and the International GLAM Labs Community. When making a digital collection available for computational use, it is crucial to have reusable and machine-readable open licences and copyright terms. While existing studies address copyright for digital collections, this study focuses specifically on the unique requirements of collections as data. This research highlights both the legal and technical aspects of copyright concerning collections as data. It discusses permissible uses of copyrighted collections, emphasising the need for interoperable, machine-readable licences and open licences. By reviewing current literature and examples, this study presents best practices and examples to help CH institutions better navigate copyright and licencing issues, ultimately enhancing their ability to convert their content into collections as data for computational research.

https://doi.org/10.5334/johd.263

“Conceptualizing Aggregate-Level Description in Web Archives”


Web archives collections are often excluded from archival science discussions, and their description instead focuses on bibliographic approaches to item-level metadata. This article argues that web archives are best understood using approaches of archival description, focusing on a case study of the Danish Netarchive, a long-running national web archive. By capturing and preserving web sites for the purposes of legal deposit, the Netarchive creates and maintains historical records of the web. Examining the Netarchive’s systems and activities through the lens of archival representation, this article develops a typology of representational artifacts that support this work, including the use of database entities, wiki documentation, classification and management via Jira issues, and codes, identifiers, and structures embedded in network protocols themselves. The analysis considers how meaningful aggregations can be understood via these representational schemes, systems and architectures, and how the nature of born-networked records challenges concepts of singular, hierarchical orderings of records aggregations. The closing discussion proposes new modes of description that address these multiple interconnected systems, and raises questions about what this might mean for aggregate-level description in the context of digital and born-networked records more broadly.

https://doi.org/10.5334/johd.265

“Understanding How to Identify and Manage Personal Identifying Information (PII) to Further Data Interoperability”


Respect for research participant rights is a key aspect for consideration when creating and utilizing interoperable data. From that perspective, requirements for sharing research data often call for the data to be de-identified, i.e., the removal of all personal identifying information (PII) prior to data sharing, to ensure that the participant’s data privacy rights are not infringed upon. However, what constitutes PII is often a point of confusion amongst researchers who are not familiar with privacy laws and regulations. This paper hopes to provide some clarity around what makes research data identifiable by presenting it from a different perspective than the one most researchers are familiar with. It also provides a framework to help researchers determine where PII could exist within their data that they can use to help with privacy impact evaluations. The goal is to empower researchers to share their data with greater confidence that the privacy rights of their research subjects have been sufficiently protected, enabling access to greater amounts of data for research use.
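As a rough illustration of what checking "where PII could exist" might look like for tabular data, here is a minimal sketch. It is not the paper's framework: the regular expressions, the column-name hints, and the function name `flag_columns` are all assumptions made for this example, and a match only signals that a column deserves human review.

```python
import re

# Illustrative patterns only (assumptions, not a legal definition of PII).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

# Hypothetical substrings in column names that often indicate direct identifiers.
NAME_HINTS = ("name", "address", "birth", "dob", "postcode", "zip")

def flag_columns(rows):
    """Return {column: sorted reasons} for columns that may contain PII.

    `rows` is a list of dicts, one per record. A column is flagged if its
    name contains a hint or if any of its values matches a pattern.
    """
    flags = {}
    for row in rows:
        for column, value in row.items():
            reasons = flags.setdefault(column, set())
            if any(hint in column.lower() for hint in NAME_HINTS):
                reasons.add("column name")
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    reasons.add(label)
    # Keep only the columns that were flagged for at least one reason.
    return {column: sorted(reasons) for column, reasons in flags.items() if reasons}

if __name__ == "__main__":
    sample = [
        {"participant_name": "A. Example", "contact": "a.example@example.org", "score": 12},
        {"participant_name": "B. Example", "contact": "+1 555 010 9999", "score": 15},
    ]
    print(flag_columns(sample))
```

A pattern scan like this catches only direct identifiers. Combinations of indirect attributes can also make a record identifiable, so it would complement rather than replace a privacy impact evaluation.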

https://tinyurl.com/2p95xtd2
