DigitalKoans

What Is the Sound of One E-Print Downloading?

Category: Metadata

"RDA National PID Strategies Guide and Checklist: Final Outputs and Supporting Materials Available"


The National PID Strategies Working Group was endorsed to explore how Persistent Identifiers (PIDs) form part of national policy and research infrastructure implementation frameworks. . . .

The findings of the 18 months of the WG have been that:

  1. National PID strategies are on the rise, evidenced in the case studies collected by the WG and the growing momentum of discussions at RDA Plenaries and other international fora.
  2. The development of national PID strategies is a relatively new phenomenon and many countries are in the very early stages. In fact, many have more of a national approach that they are seeking to transform into a strategy.
  3. All national PID strategies are currently in development and therefore subject to a high degree of change. During the course of the WG, nine case studies were collected and several of these needed to be updated prior to the Group’s final output due to changes that had taken place in those countries.
  4. There is no single "cookie cutter" approach to developing a national PID strategy. Critical components include:
    • A clear value proposition with use cases
    • A group or organisation that is responsible for driving strategy development
    • An open, inclusive, iterative process that involves all stakeholders
    • An accompanying roadmap that outlines practical steps for implementation
  5. International PID providers such as ORCID and DataCite have begun to actively engage with national PID strategies and the RDA National PID Strategies WG provides a focal point for furthering this engagement.

https://tinyurl.com/559x4dmy

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Author: Charles W. Bailey, Jr. Posted on October 13, 2023 / October 12, 2023. Categories: Metadata

"Developing a Preservation Metadata Standard for Languages"


We have so many languages to communicate with others as humans. There are approximately 7000 languages in the world, and many are becoming extinct for a variety of reasons. In order to preserve and prevent the extinction of these languages, we need to preserve them. One way of preservation is to have a preservation metadata for languages. Metadata is data about data. Metadata is required for item description, preservation, and retrieval. There are various types of metadata, e.g., descriptive, administrative, structural, preservation, etc. After the literature study, the authors observed that there is a lack of study on the preservation metadata for language. Consequently, the purpose of this paper is to demonstrate the need for language preservation metadata. We found some archaeological metadata standards for this purpose, and after applying inclusion and exclusion criteria, we chose three archaeological metadata standards, namely: Archaeon-core, CARARE, and LIDO (Lightweight Information Describing Objects) for mapping metadata.

https://arxiv.org/abs/2310.04155

Author: Charles W. Bailey, Jr. Posted on October 10, 2023 / October 9, 2023. Categories: Digital Curation & Digital Preservation, Metadata

"From ChatGPT to CatGPT: The Implications of Artificial Intelligence on Library Cataloging"


This paper explores the potential of language models such as ChatGPT to transform library cataloging. Through experiments with ChatGPT, the author demonstrates its ability to generate accurate MARC records using RDA and other standards such as the Dublin Core Metadata Element Set. These results demonstrate the potential of ChatGPT as a tool for streamlining the record creation process and improving efficiency in library settings. The use of AI-generated records, however, also raises important questions related to intellectual property rights and bias. The paper reviews recent studies on AI in libraries and concludes that further research and development of this innovative technology is necessary to ensure its responsible implementation in the field of library cataloging.

https://tinyurl.com/fd8xjmnt

| Artificial Intelligence and Libraries Bibliography |

Author: Charles W. Bailey, Jr. Posted on September 20, 2023 / September 27, 2023. Categories: Artificial Intelligence/Robots, Metadata

Paywall: "Images as Metadata: A New Perspective for Describing Research Data"


Through studies and work developed over the last few years, we propose a new approach to description, where images can have a preponderant role in the description of data, assuming the role of metadata. We present several pieces of evidence, point out their challenges and determine the opportunities this new perspective can have in the research. Images have specific characteristics that can be leveraged in improving data description. Historical evidence establishes that images have always been used and produced in research, yet their representational ability has never been harnessed to describe data and give more context to the scientific process.

https://doi.org/10.1080/19386389.2023.2252722

Author: Charles W. Bailey, Jr. Posted on September 15, 2023 / September 14, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

"Global Visibility of Publications through Digital Object Identifiers"


This brief research report analyzes the availability of Digital Object Identifiers (DOIs) worldwide, highlighting the dominance of large publishing houses and the need for unique persistent identifiers to increase the visibility of publications from developing countries. The study reveals that a considerable amount of publications from developing countries are excluded from the global flow of scientific information due to the absence of DOIs, emphasizing the need for alternative publishing models. The authors suggest that the availability of DOIs should receive more attention in scholarly communication and scientometrics, contributing to a necessary debate on DOIs relevant for librarians, publishers, and scientometricians.

https://doi.org/10.3389/frma.2023.1207980

Author: Charles W. Bailey, Jr. Posted on September 11, 2023 / September 10, 2023. Categories: Metadata, Publishing, Scholarly Books, Scholarly Journals

"PreprintResolver: Improving Citation Quality by Resolving Published Versions of ArXiv Preprints using Literature Databases"


The growing impact of preprint servers enables the rapid sharing of time-sensitive research. Likewise, it is becoming increasingly difficult to distinguish high-quality, peer-reviewed research from preprints. Although preprints are often later published in peer-reviewed journals, this information is often missing from preprint servers. To overcome this problem, the PreprintResolver was developed, which uses four literature databases (DBLP, SemanticScholar, OpenAlex, and CrossRef / CrossCite) to identify preprint-publication pairs for the arXiv preprint server. . . . Experiments were performed on a sample of 1,000 arXiv-preprints from the research field of computer science and without any publication information. . . . The results show that the PreprintResolver was able to resolve 603 out of 1,000 (60.3%) arXiv-preprints from the research field of computer science and without any publication information. . . . In conclusion, the PreprintResolver is suitable for individual, manually reviewed requests, but less suitable for bulk requests. The PreprintResolver tool (available from 2023-08-01) and source code (accessed 2023-07-19) are available online.

https://arxiv.org/abs/2309.01373

Author: Charles W. Bailey, Jr. Posted on September 8, 2023 / September 7, 2023. Categories: E-Prints, Metadata, Open Access, Open Science, Search Engines and Discovery Systems, Self-Archiving

Digital Preservation Coalition: Choosing a Persistent Identifier Type for Your Digital Objects


This report is intended to help you get started using persistent identifiers (PIDs) for digital objects. Its intended audience is people who are involved in digital preservation in heritage and research organizations. The report answers questions such as: "What are persistent identifiers?", "Why are they important?", "Which type should you choose?", "Are you ready for them?", and "How should you implement them?". The report does not specifically cover persistent identifiers for people, organizations, grants, workflows, and so on, but some of the same general concepts would also apply.

http://doi.org/10.7207/twgn23-02

Author: Charles W. Bailey, Jr. Posted on September 8, 2023 / September 7, 2023. Categories: Digital Curation & Digital Preservation, Metadata

"Metadata Standard for Continuous Preservation, Discovery, and Reuse of Research Data in Repositories by Higher Education Institutions: A Systematic Review"


This systematic review synthesised existing research papers that explore the available metadata standards to enable researchers to preserve, discover, and reuse research data in repositories. This review provides a broad overview of certain aspects that must be taken into consideration when creating and assessing metadata standards to enhance research data preservation discoverability and reusability strategies. Research papers on metadata standards, research data preservation, discovery and reuse, and repositories published between January 2003 and April 2023 were reviewed from a total of five databases. The review retrieved 1597 papers, and 13 papers were selected in this review. We revealed 13 research articles that explained the creation and application of metadata standards to enhance preservation, discovery, and reuse of research data in repositories. Among them, eight presented the three main types of metadata, descriptive, structural, and administrative, to enable the preservation of research data in data repositories. We noted limited evidence on how these metadata standards can be used to enhance the discovery and reuse of research data in repositories to enable the preservation, discovery, and reuse of research data in repositories. No reviews indicated specific higher education institutions employing metadata standards for the research data created by their researchers. Repository designs and a lack of expertise and technology know-how were among the challenges identified from the reviewed papers. The review has the potential to influence professional practice and decision-making by stakeholders, including researchers, students, librarians, information communication technologists, data managers, private and public organisations, intermediaries, research institutions, and non-profit organizations.

https://doi.org/10.3390/info14080427

Author: Charles W. Bailey, Jr. Posted on August 10, 2023 / August 9, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

Paywall: "Proposal for the Publication of Linked Open Bibliographic Data"


The objective of this paper is to analyze the publishing of bibliographic data such as LOD, having as a product the elaboration of theoretical-methodological recommendations for the publication of these data, in an approach based on the ten best practices for publishing LOD, from the World Wide Web Consortium. The starting point was the conduction of a Systematic Review of Literature, where initiatives to publish bibliographic data such as LOD were identified. An empirical study of these institutions was also conducted.

https://doi.org/10.1080/01639374.2023.2234358

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

Author: Charles W. Bailey, Jr. Posted on August 4, 2023 / August 3, 2023. Categories: Metadata, Open Access

"Signing Data Citations Enables Data Verification and Citation Persistence"


Increasingly, digital datasets are being published with assigned identifiers, then cited in papers as the basis for repeatable experiments. To help future readers find and verify data, customary citations can be extended with content signatures, which can be introduced without having to replace existing identifier such as DOIs and ARKs. That is, signatures can be seen as complementary identifiers to help keep specific versions of cited data findable and identifiable as they evolve and change locations. For example, if a DOI identifies an evolving dataset, rather than a fixed version — i.e., content drift is expected — the DOI can safely be cited for the sake of attribution, metadata linking, and citation statistics (e.g., by Crossref (https://www.crossref.org) and DataCite (https://datacite.org)), while the content signature helps the reader find the exact content that was cited, possibly with assistance from metadata linked to the DOI. Additionally, a citation that includes both the DOI (for example) and content signature of a dataset creates a fixed mapping between the two identifiers. Then, unintentional content drift by the DOI can be detected and reported, and an alternative location may potentially be discovered by consulting public content signature registries.

https://doi.org/10.1038/s41597-023-02230-y
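
As a rough illustration of the idea in this abstract, the following Python sketch pairs an identifier with a content signature and checks a later retrieval for drift. The choice of SHA-256, the `hash://sha256/` prefix, the function names, and the example DOI are assumptions made for illustration; they are not taken from the paper.

```python
import hashlib


def content_signature(data: bytes) -> str:
    """Return a SHA-256 based signature for the exact bytes of a dataset."""
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def make_citation(identifier: str, data: bytes) -> dict:
    """Pair an identifier (e.g., a DOI) with the signature of the cited content."""
    return {"identifier": identifier, "signature": content_signature(data)}


def has_drifted(citation: dict, retrieved: bytes) -> bool:
    """Return True if the retrieved bytes no longer match the cited signature."""
    return content_signature(retrieved) != citation["signature"]


# Hypothetical example: the DOI and the data are placeholders.
cited_bytes = b"id,value\n1,3.14\n"
citation = make_citation("doi:10.1234/example", cited_bytes)

unchanged = has_drifted(citation, cited_bytes)           # same bytes as cited
changed = has_drifted(citation, b"id,value\n1,3.15\n")   # one value edited
```

Because the signature depends only on the bytes, the same check works wherever the data is later found, which is what would let a reader confirm that a copy at a new location is the cited version.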

Author: Charles W. Bailey, Jr. Posted on July 11, 2023 / July 10, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata

Paywall: "Interoperability of Open Science Metadata: What About the Reality?"


This paper leads to identify complementary ways to improve dataset interoperability: (1) to adapt mapping algorithms to the issues raised by metadata schema matching; (2) to adapt metadata schemata, for instance by sharing a core vocabulary and/or reusing existing standards; (3) to combine various trends in a more complex interoperability approach that would also make available and operational the (RDA) crosswalks between schemata and that would promote good practices in metadata labeling and documentation.

https://doi.org/10.1007/978-3-031-33080-3_28

Author: Charles W. Bailey, Jr. Posted on June 1, 2023 / May 31, 2023. Categories: Metadata, Open Access, Open Science

Common Scholarly Communication Infrastructure Landscape Review


Scholarly communication is a complicated sector, with numerous participants and multiple mechanisms for communicating and reviewing materials created in an increasing variety of formats by researchers across the globe.[1] In turn, the researcher who seeks to use the products of this system wishes to discover, access, and use relevant and trustworthy materials as effortlessly as possible. The work of driving efficiency into this complex sector while bringing its multiple strands together seamlessly for the reader (or, increasingly, for a computational user) rests on a foundation of infrastructure, much of it shared across multiple publishers. In this landscape review, we seek to provide a high-level overview of the shared infrastructure that supports scholarly communication.

https://doi.org/10.18665/sr.318775

Author: Charles W. Bailey, Jr. Posted on April 26, 2023 / April 25, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, ERM/Discovery Systems, Metadata, Open Access and Other Publishing License Agreements, Publishing, Reports and White Papers, Scholarly Metrics

"The Viability of Using an Open Source Locally Hosted AI for Creating Metadata in Digital Image Collections"


Artificial intelligence (AI) can support metadata creation for images by generating descriptions, titles, and keywords for digital collections in libraries. Many AI options are available, ranging from cloud-based corporate software solutions, including Microsoft Azure Custom Vision and Google Cloud Vision, to open-source locally hosted software packages. This case study examines the feasibility of deploying the open-source, locally hosted AI software, Sheeko, and the accuracy of the descriptions generated for images using two of the pre-trained models. The study aims to ascertain if Sheeko’s AI would be a viable solution for producing metadata in the form of descriptions, or titles for digital collections in Libraries and Cultural Resources at the University of Calgary.

https://journal.code4lib.org/articles/17186

Author: Charles W. Bailey, Jr. Posted on April 24, 2023 / April 23, 2023. Categories: Artificial Intelligence/Robots, Digital Archives and Special Collections, Digital Media, Metadata

Digital Scholarship Releases New PDF Versions of Its Bibliographies

This spring, Digital Scholarship’s bibliographies in the HTML format were reformatted as single-page files with internal navigation. This included all bibliographies that were in HTML format only as well as the HTML versions of paperback books. These new PDFs are in a 12-point font and are designed for printing; however, they also have live links for immediate access. There were no content changes. For a list of all Digital Scholarship publications, see the site map.

Academic Library as Scholarly Publisher Bibliography, v. 3

Altmetrics Bibliography

Digital Curation and Preservation Bibliography, v. 2

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, 2012 Supplement

Digital Curation Resource Guide

Electronic Theses and Dissertations Bibliography, v. 7

E-science and Academic Libraries Bibliography

Google Books Bibliography, v. 7

Institutional Repository Bibliography, v. 4

Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals

Open Access Journals Bibliography

Open Access Webliography

Research Data Curation Bibliography, v. 10

Research Data Publication and Citation Bibliography

Research Data Sharing and Reuse Bibliography

Scholarly Electronic Publishing Bibliography, v. 80

Transforming Peer Review Bibliography

Transforming Scholarly Publishing through Open Access: A Bibliography

Author: Charles W. Bailey, Jr. Posted on April 5, 2023 / April 6, 2023. Categories: Bibliographies, Copyright, Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Digital Scholarship Publications, E-Books, Institutional Repositories, Metadata, Open Access, Open Access and Other Publishing License Agreements, Open Science, Publishing, Research Libraries, Scholarly Books, Scholarly Communication, Scholarly Journals, University Presses

"Scaling Identifiers and Their Metadata to Gigascale: An Architecture to Tackle the Challenges of Volume and Variety"


Persistent identifiers are applied to an ever-increasing variety of research objects, including software, samples, models, people, instruments, grants, and projects, and there is a growing need to apply identifiers at a finer and finer granularity. Unfortunately, the systems developed over two decades ago to manage identifiers and the metadata describing the identified objects no longer scale. Communities working with physical samples have grappled with these three challenges of the increasing volume, variety, and variability of identified objects for many years. To address this dual challenge, the IGSN 2040 project explored how metadata and catalogues for physical samples could be shared at the scale of billions of samples across an ever-growing variety of users and disciplines. In this paper, we focus on how we scale identifiers and their describing metadata to billions of objects and who the actors involved with this system are. Our analysis of these requirements resulted in the definition of a minimum viable product and the design of an architecture that not only addresses the challenges of increasing volume and variety but, more importantly, is easy to implement because it reuses commonly used Web components. Our solution is based on a Web architectural model that utilises Schema.org, JSON-LD, and sitemaps. Applying these commonly used architectural patterns on the internet allows us to not only handle increasing variety but also enable better compliance with the FAIR Guiding Principles.

http://doi.org/10.5334/dsj-2023-005
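
To make the Schema.org and JSON-LD approach mentioned in this abstract more concrete, here is a minimal Python sketch of a machine-readable description for a physical sample. The field choices, the sample identifier, the function name, and the URL are hypothetical; this is not the metadata profile defined by the IGSN 2040 project.

```python
import json


def sample_jsonld(identifier: str, name: str, url: str) -> str:
    """Build a minimal Schema.org description of a sample as JSON-LD text."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Thing",
        "@id": url,
        "name": name,
        # Schema.org's PropertyValue is one way to carry a typed identifier.
        "identifier": {
            "@type": "PropertyValue",
            "propertyID": "IGSN",
            "value": identifier,
        },
        "url": url,
    }
    return json.dumps(doc, indent=2)


# Hypothetical identifier and URL, for illustration only.
record = sample_jsonld(
    "XXX0000001",
    "Example rock sample",
    "https://example.org/samples/XXX0000001",
)
```

Text like this is commonly embedded in a landing page inside a `<script type="application/ld+json">` element, and the page is then listed in a sitemap so that harvesters can discover it with ordinary Web tooling.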

Author: Charles W. Bailey, Jr. Posted on April 5, 2023 / April 4, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata

"Who Owns Bibliographic Metadata Created by Libraries?"


The ownership of MARC bibliographic data has been an issue between OCLC and other companies in the marketplace. Two lawsuits are discussed between OCLC and Clarivate and SkyRiver.

https://doi.org/10.1080/01930826.2023.2177928

Author: Charles W. Bailey, Jr. Posted on March 28, 2023 / March 27, 2023. Categories: Copyright, Digital Copyright Wars, Libraries, Metadata

"The Continued Problem of URL Decay: An Updated Analysis of Health Care Management Journal Citations"


Objective: This study updates a 2009 study which examined uniform resource locator (URL) decay in health care management journals and seeks to determine whether continued URL availability relates to publication date, resource type, or top-level domain. The authors also provide an analysis of differences in findings between the two study periods.

Methods: The authors collected the URLs of web-based resources cited in articles published in five health care management source journals from 2016 to 2018. The URLs were checked to see if they were still active and then analyzed to determine if continued availability was related to publication date, resource type, or top-level domain.

Results: There were statistically significant differences in URL availability across publication date, resource type, and top-level domain. Domains with the highest rate of decay were .com and .net and the lowest rate were .edu and .gov. As expected, the older the citation, the higher the rate of decay. The overall rate of URL decay decreased from 49.3% to 36.1% between studies.

Conclusion: URL decay in health care management journals has decreased in the last 15 years. Still, URL decay does continue to be a problem. Interestingly, health services policy research journals had a lower rate of decay than practitioner-oriented journals (34.8% vs. 51.7%). Authors, publishers, and librarians should continue to promote the use of digital object identifiers and web archiving and perhaps study and replicate efforts used by health services policy research journals to increase continued URL availability rates.

https://doi.org/10.5195/jmla.2022.1456

Author: Charles W. Bailey, Jr. Posted on March 27, 2023 / March 26, 2023. Categories: Digital Curation & Digital Preservation, Metadata, Publishing, Scholarly Books, Scholarly Journals

"Twenty Years of Wikipedia in Scholarly Publications: A Bibliometric Network Analysis of the Thematic and Citation Landscape"


Results also show that the author collaboration network is very sparsely connected, indicating the absence of close collaboration among the authors in the field. Furthermore, results reveal that the Wikipedia research institutions’ collaboration network reflects a North–South divide as very limited cooperation occurs between developed and developing countries’ institutions. Finally, the multiple correspondence analysis applied to obtain the Wikipedia research conceptual map reveals the breadth, diversity, and intellectual thrust of the Wikipedia’s scholarly publications.

https://doi.org/10.1007/s11135-023-01626-7

Author: Charles W. Bailey, Jr. Posted on March 7, 2023 / March 21, 2023. Categories: Digital Repositories, Metadata, Publishing, Scholarly Books, Scholarly Journals, Scholarly Metrics

"Scaling Identifiers and Their Metadata to Gigascale: An Architecture to Tackle the Challenges of Volume and Variety"


Persistent identifiers are applied to an ever-increasing variety of research objects, including software, samples, models, people, instruments, grants, and projects, and there is a growing need to apply identifiers at a finer and finer granularity. Unfortunately, the systems developed over two decades ago to manage identifiers and the metadata describing the identified objects no longer scale. Communities working with physical samples have grappled with these three challenges of the increasing volume, variety, and variability of identified objects for many years. To address this dual challenge, the IGSN 2040 project explored how metadata and catalogues for physical samples could be shared at the scale of billions of samples across an ever-growing variety of users and disciplines. In this paper, we focus on how we scale identifiers and their describing metadata to billions of objects and who the actors involved with this system are. Our analysis of these requirements resulted in the definition of a minimum viable product and the design of an architecture that not only addresses the challenges of increasing volume and variety but, more importantly, is easy to implement because it reuses commonly used Web components. Our solution is based on a Web architectural model that utilises Schema.org, JSON-LD, and sitemaps. Applying these commonly used architectural patterns on the internet allows us to not only handle increasing variety but also enable better compliance with the FAIR Guiding Principles.

http://doi.org/10.5334/dsj-2023-005

Author: Charles W. Bailey, Jr. Posted on March 6, 2023 / March 21, 2023. Categories: Cyberinfrastructure/E-Science, Metadata

"Geospatial Open Data Usage and Metadata Quality"


The Open Government Data portals (OGD), thanks to the presence of thousands of geo-referenced datasets, containing spatial information are of extreme interest for any analysis or process relating to the territory. For this to happen, users must be enabled to access these datasets and reuse them. An element often considered as hindering the full dissemination of OGD data is the quality of their metadata. Starting from an experimental investigation conducted on over 160,000 geospatial datasets belonging to six national and international OGD portals, this work has as its first objective to provide an overview of the usage of these portals measured in terms of datasets views and downloads. Furthermore, to assess the possible influence of the quality of the metadata on the use of geospatial datasets, an assessment of the metadata for each dataset was carried out, and the correlation between these two variables was measured. The results obtained showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. In addition, a weak correlation was found between the use and quality of the metadata, not such as to assert with certainty that the latter is a determining factor of the former.

https://doi.org/10.3390/ijgi10010030

Author: Charles W. Bailey, Jr. Posted on March 2, 2023 / March 21, 2023. Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

"Measuring the Concept of PID Literacy: User Perceptions and Understanding of Persistent Identifiers in Support of Open Scholarly Infrastructure"


The increasing centrality of persistent identifiers (PIDs) to scholarly ecosystems and the contribution they can make to the burgeoning ‘PID graph’ has the potential to transform scholarship. Despite their importance as originators of PID data, little is known about researchers’ awareness and understanding of PIDs, or their efficacy in using them. In this article we report on the results of an online interactive test designed to elicit exploratory data about researcher awareness and understanding of PIDs. This instrument was designed to explore recognition of PIDs and the extent to which researchers correctly apply PIDs within digital scholarly ecosystems, as well as measure researchers’ perceptions of PIDs. Our results reveal irregular patterns of PID understanding and certainty across all participants, though statistically significant disciplinary and academic job role differences were observed in some instances. Uncertainty and confusion were found to exist in relation to dominant schemes such as ORCID and DOIs, even when contextualized within real-world examples. We also show researchers’ perceptions of PIDs to be generally positive but that disciplinary differences can be noted, as well as higher levels of aversion to PIDs in specific use cases and negative perceptions where PIDs are measured on an ‘activity’ semantic dimension. This work therefore contributes to our understanding of academics’ ‘PID literacy’ and should inform those designing PID-centric scholarly infrastructures, that a significant need for training and outreach to active researchers remains necessary.

https://arxiv.org/abs/2211.07367

Author: Charles W. Bailey, Jr. | Posted on: February 23, 2023; February 22, 2023 | Categories: Metadata, Open Access, Publishing, Scholarly Communication

"The Preprint Revolution — Implications for Bibliographic Databases"


In the box below, we present six recommendations for optimizing the indexing of preprints in bibliographic databases. As we will discuss later, implementing these recommendations requires close collaboration between bibliographic databases and other actors in the scholarly publishing system.

Recommendation 1: Cover all relevant preprint servers.

A bibliographic database should index preprints from all relevant preprint servers. A disciplinary database (e.g., PubMed and Europe PMC) should index preprints from all preprint servers relevant in a particular discipline. A multidisciplinary database (e.g., Dimensions, the Lens, Scopus, and Web of Science) should index preprints from all preprint servers across all disciplines.

Recommendation 2: Provide comprehensive preprint metadata.

A bibliographic database should provide metadata for preprints that is as comprehensive as metadata for journal articles. The metadata should at least include the title and abstract of a preprint, the names and affiliations of the authors, the reference list, and funding information. It should also include a version history.

Recommendation 3: Provide links between preprints and journal articles.

If an article has been published both on a preprint server and in a journal, a bibliographic database should provide a link between the preprint and the journal article. The link establishes that the preprint and the journal article are different versions of the same article. The preprint and the journal article belong to the same publication family.

Recommendation 4: Provide links between preprints and peer reviews.

If a preprint has been peer reviewed and the reviews have been made openly available, a bibliographic database should index the reviews and should provide links between the preprint and the reviews.

Recommendation 5: Provide deduplicated citation links between publication families.

A bibliographic database should provide deduplicated citation links at the level of publication families. If there are multiple citation links from publications in one publication family (e.g., from a preprint and from a journal article) to publications in another publication family, these citation links should be deduplicated.

Recommendation 6: Do not make arbitrary distinctions between publication types (preprints, journal articles, and others).

A bibliographic database should not make arbitrary distinctions between preprints, journal articles, and other publication types. A database may inform its users about relevant differences between publications of different types (e.g., whether publications have been peer reviewed or not), but otherwise it should treat all publications in the same way, regardless of their publication type.
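
As an illustration of Recommendation 5, the following is a minimal sketch of how citation links might be deduplicated at the level of publication families. It is not taken from the article: the function name, the identifiers, and the `family_of` mapping are hypothetical, and the choice to drop links between two members of the same family is an assumption made for this sketch.

```python
from typing import Dict, Iterable, Set, Tuple


def dedupe_family_citations(
    citations: Iterable[Tuple[str, str]],
    family_of: Dict[str, str],
) -> Set[Tuple[str, str]]:
    """Collapse publication-level citation links into family-level links.

    citations: (citing_id, cited_id) pairs between individual publications.
    family_of: maps a publication id to its publication family id
               (hypothetical structure, assumed for this sketch).
    """
    family_links: Set[Tuple[str, str]] = set()
    for citing, cited in citations:
        # A publication with no known family is treated as its own family.
        src = family_of.get(citing, citing)
        dst = family_of.get(cited, cited)
        # Assumption: links within a single family are not counted.
        if src != dst:
            family_links.add((src, dst))
    return family_links


# Hypothetical identifiers: a preprint and a journal article per family.
family_of = {
    "preprint-A": "family-A",
    "article-A": "family-A",
    "preprint-B": "family-B",
    "article-B": "family-B",
}
citations = [
    ("preprint-A", "preprint-B"),
    ("article-A", "article-B"),
    ("article-A", "preprint-B"),
]
links = dedupe_family_citations(citations, family_of)
print(sorted(links))  # [('family-A', 'family-B')]
```

Using a set of family pairs means that three publication-level links between the same two families count as one link, which is the behavior the recommendation describes.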

bit.ly/3KtuWXl

Author: Charles W. Bailey, Jr. | Posted on: February 22, 2023; February 21, 2023 | Categories: E-Prints, Metadata, Open Access, Publishing, Search Engines and Discovery Systems, Self-Archiving

"NISO Publishes New Recommended Practice for Video and Audio Metadata"


NISO’s new Video and Audio Metadata Recommended Practice will help address these challenges, by providing a vocabulary that enables connectivity between existing standards covering key metadata elements: administrative (e.g., dates, versions, and identifiers); semantic (e.g., subject classifications and keywords); technical (e.g., media type, encoding, and bitrate); rights (e.g., rights owner, licensor, and embargo information); and accessibility (e.g., accessibility features and access).

bit.ly/3ImojVp

Author: Charles W. Bailey, Jr. | Posted on: February 14, 2023; February 13, 2023 | Categories: Metadata, Standards

"Cluster Analysis of Open Research Data: A Case for Replication Metadata"


Research data are often released upon journal publication to enable result verification and reproducibility. For that reason, research dissemination infrastructures typically support diverse datasets coming from numerous disciplines, from tabular data and program code to audio-visual files. Metadata, or data about data, is critical to making research outputs adequately documented and FAIR. Aiming to contribute to the discussions on the development of metadata for research outputs, I conducted an exploratory analysis to determine how research datasets cluster based on what researchers organically deposit together. I use the content of over 40,000 datasets from the Harvard Dataverse research data repository as my sample for the cluster analysis. I find that the majority of the clusters are formed by single-type datasets, while in the rest of the sample, no meaningful clusters can be identified. For the result interpretation, I use the metadata standard employed by DataCite, a leading organization for documenting a scholarly record, and map existing resource types to my results. About 65% of the sample can be described with a single-type metadata (such as Dataset, Software or Report), while the rest would require aggregate metadata types. Though DataCite supports an aggregate type such as a Collection, I argue that a significant number of datasets, in particular those containing both data and code files (about 20% of the sample), would be more accurately described as a Replication resource metadata type. Such resource type would be particularly useful in facilitating research reproducibility.

http://www.ijdc.net/article/view/833

Author: Charles W. Bailey, Jr. | Posted on: February 7, 2023; February 6, 2023 | Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

"Persistent Identifiers — Risks and Trust Related Issues Explored with New Knowledge Exchange Report and Case Studies"


As part of the work around Risks and Trust in pursuit of a well-functioning PID infrastructure for research, this Knowledge Exchange report examines the complex PID landscape within its six partner countries and beyond. The benefits of an efficient PID infrastructure and how this is a precondition for research communities impending research agendas, are explained. The report provides an in-depth look at what can go wrong with an unreliable PID service.

https://www.knowledge-exchange.info/news/articles/2-2-23

Author: Charles W. Bailey, Jr. | Posted on: February 7, 2023; February 6, 2023 | Categories: Metadata, Publishing, Reports and White Papers, Scholarly Books, Scholarly Journals

DigitalKoans Overview

DigitalKoans provides news and commentary on digital copyright, digital curation, digital repository, open access, research data management, scholarly communication, and other digital information issues. It is also available via an RSS feed.

A Digital Scholarship publication. Digital Scholarship is a noncommercial publisher and it accepts no advertising. Charles W. Bailey, Jr. is the publisher of Digital Scholarship.

Copyright © 2005-2025 by Charles W. Bailey, Jr. This work is licensed under a Creative Commons Attribution 4.0 International License.
