DigitalKoans

What Is the Sound of One E-Print Downloading?

Category: Metadata

"Challenges and Roadblocks to Robust Metadata in the Scholarly Communications Industry"


For the scholarly communications community to extract the most value from scientific research, as well as to successfully move from subscription to open access publishing models, it is essential to have ‘clean’ metadata and a robust infrastructure to build the required workflows, processes and systems to support effective use of that metadata. The obstacles to achieving a supporting infrastructure and implementing effective workflows are many, negatively impacting every stakeholder group and every aspect of the research and publishing process.

https://doi.org/10.1629/uksg.642

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Author: Charles W. Bailey. Posted on May 24, 2024 (modified May 23, 2024). Categories: Digital Curation & Digital Preservation, Metadata, Open Access, Publishing

"Crossref as a Source of Open Bibliographic Metadata"


Several initiatives have been taken to promote the open availability of bibliographic metadata of scholarly publications in Crossref. We present an up-to-date overview of the availability of six metadata elements in Crossref: reference lists, abstracts, ORCIDs, author affiliations, funding information, and license information. Our analysis shows that the availability of these metadata elements has improved over time, at least for journal articles, the most common publication type in Crossref. However, the analysis also shows that many publishers need to make additional efforts to realize full openness of bibliographic metadata.

https://doi.org/10.31222/osf.io/smxe5
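
To make the six elements concrete, here is a minimal Python sketch that measures how often each one is present in a batch of work records. The key names (`reference`, `abstract`, `funder`, `license`, and per-author `ORCID` and `affiliation`) are assumed to mirror the Crossref REST API's JSON and should be checked against its documentation; the two sample records are invented. This is an illustration, not the authors' analysis code.

```python
from typing import Callable

# Each check returns True when the element is present in a work record.
# Key names are assumed to mirror Crossref REST API JSON; verify before use.
CHECKS: dict[str, Callable[[dict], bool]] = {
    "reference lists": lambda w: bool(w.get("reference")),
    "abstracts": lambda w: bool(w.get("abstract")),
    "ORCIDs": lambda w: any(a.get("ORCID") for a in w.get("author", [])),
    "author affiliations": lambda w: any(a.get("affiliation") for a in w.get("author", [])),
    "funding information": lambda w: bool(w.get("funder")),
    "license information": lambda w: bool(w.get("license")),
}


def availability(works: list[dict]) -> dict[str, float]:
    """Return the share of works (0.0 to 1.0) that carry each element."""
    if not works:
        return {name: 0.0 for name in CHECKS}
    return {
        name: sum(check(w) for w in works) / len(works)
        for name, check in CHECKS.items()
    }


# Two invented records, not real Crossref data.
sample = [
    {
        "abstract": "...",
        "author": [{"ORCID": "https://orcid.org/0000-0002-1825-0097", "affiliation": []}],
        "license": [{"URL": "https://example.org/license"}],
    },
    {
        "reference": [{"key": "ref1"}],
        "author": [{"affiliation": [{"name": "Example University"}]}],
    },
]
rates = availability(sample)
```

Running the same function over records grouped by publication year would give the kind of over-time view the abstract describes.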

Author: Charles W. Bailey. Posted on April 24, 2024 (modified April 23, 2024). Categories: Metadata, Open Access

"What Do We Know about DOIs"


What, though, do we actually know about the state of persistence of these links? How many DOIs resolve correctly? How many landing pages, at the other end of the DOI resolution, contain the information that is supposed to be there, including the title and the DOI itself?. . . .

Let’s talk about the resolution statistics. Other studies, looking at general links on the web, have found a link-rot rate of about 60%-70% over a ten-year period (Lessig, Zittrain, and Albert 2014; Stox 2022). The DOI resolution rate that we have, with 97% of links resolving (or a 3% link-rot rate), is far better and more robust than a web link in general.

Is 3% a good or a bad number? It’s more robust than the web in general, but it still means that for every 100 DOIs, just under 3 will fail to resolve. We also cannot tell whether these DOIs are resolving to the correct target, except by using the metadata detection metrics (are the title and DOI on the landing page, which we could only detect at a far lower rate). It is entirely possible for a website to resolve with an HTTP 200 (OK) response, but for the page in question to be something very different to what the user expected, a phenomenon dubbed content drift.

https://www.crossref.org/blog/what-do-we-know-about-dois/
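
The distinction between a DOI that resolves and a DOI that resolves to the right page can be sketched in a few lines of Python. The example below assumes the crawl has already happened and works on stored results; the `CrawlResult` fields, the "2xx means resolved" rule, and the title-plus-DOI check are simplifying assumptions for illustration, not Crossref's actual methodology, and the sample data is invented.

```python
from dataclasses import dataclass


@dataclass
class CrawlResult:
    doi: str
    title: str       # title recorded in the DOI's metadata
    status: int      # final HTTP status after following the DOI redirect
    page_text: str   # text of the landing page ("" if nothing was fetched)


def resolves(r: CrawlResult) -> bool:
    """Treat any 2xx final status as a successful resolution."""
    return 200 <= r.status < 300


def looks_correct(r: CrawlResult) -> bool:
    """Crude content-drift check: the page must mention both title and DOI."""
    text = r.page_text.lower()
    return resolves(r) and r.title.lower() in text and r.doi.lower() in text


def summarize(results: list[CrawlResult]) -> dict[str, float]:
    """Compute resolution, link-rot, and verified-landing-page rates."""
    total = len(results)
    if total == 0:
        return {"resolution_rate": 0.0, "link_rot_rate": 0.0, "verified_rate": 0.0}
    resolved = sum(resolves(r) for r in results)
    verified = sum(looks_correct(r) for r in results)
    return {
        "resolution_rate": resolved / total,
        "link_rot_rate": 1 - resolved / total,
        "verified_rate": verified / total,
    }


# Invented results; the 10.5555/... DOIs are placeholders.
results = [
    CrawlResult("10.5555/example.1", "A Study", 200, "A Study. DOI: 10.5555/example.1"),
    CrawlResult("10.5555/example.2", "Another Study", 200, "Welcome to our new homepage"),
    CrawlResult("10.5555/example.3", "Third Study", 404, ""),
    CrawlResult("10.5555/example.4", "Fourth Study", 200, "Fourth Study (doi:10.5555/example.4)"),
]
summary = summarize(results)
```

The second record is the content-drift case: it returns HTTP 200, so it counts toward the resolution rate, but it fails the title-and-DOI check and lowers the verified rate.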

Author: Charles W. Bailey. Posted on March 1, 2024 (modified February 29, 2024). Categories: Digital Curation & Digital Preservation, Metadata, Publishing

"How Persistent Identifiers Work Together in the Research Ecosystem"


Each PID requires metadata, or information about the person, thing or organization that the identifier is identifying, and different PIDs can be included in the metadata of other PIDs:

  • ORCID iDs can be included in DOI metadata to identify the people involved in the existence of the object that the DOI is identifying.
  • DOIs can be included in an ORCID record to identify the works that a person has produced, or the funding that a person has received.
  • ROR IDs can be included in ORCID records to identify the organizations that an individual is affiliated with.
  • ROR IDs can be included in DOI metadata to identify the organizations that are involved in the existence of the object that the DOI is identifying.

http://tinyurl.com/3bkx6ytv
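
The four relationships in the list above can be pictured as two small record types that point at each other. The Python sketch below is a toy data model: the class and field names are hypothetical and do not reproduce the real ORCID, Crossref, DataCite, or ROR schemas, and the DOI and ROR values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class OrcidRecord:
    orcid: str
    work_dois: list[str] = field(default_factory=list)         # DOIs of works or grants
    affiliation_rors: list[str] = field(default_factory=list)  # ROR IDs of affiliated organizations


@dataclass
class DoiMetadata:
    doi: str
    contributor_orcids: list[str] = field(default_factory=list)  # people behind the object
    organization_rors: list[str] = field(default_factory=list)   # organizations behind the object


def is_reciprocal(work: DoiMetadata, person: OrcidRecord) -> bool:
    """True when the DOI names the person and the person's record names the DOI."""
    return person.orcid in work.contributor_orcids and work.doi in person.work_dois


# Placeholder identifiers for illustration only.
person = OrcidRecord(
    orcid="0000-0002-1825-0097",
    work_dois=["10.5555/example.1"],
    affiliation_rors=["https://ror.org/00example0"],
)
work = DoiMetadata(
    doi="10.5555/example.1",
    contributor_orcids=["0000-0002-1825-0097"],
    organization_rors=["https://ror.org/00example0"],
)
```

A check like `is_reciprocal` is one way to spot links that exist in only one direction, for example a DOI that lists an ORCID iD whose record never mentions the work.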

Author: Charles W. Bailey. Posted on February 22, 2024 (modified February 21, 2024). Categories: Metadata, Publishing, Scholarly Communication

"Completeness Degree of Publication Metadata in Eight Free-Access Scholarly Databases"


The main objective of this study is to compare the amount of metadata and the completeness degree of research publications in new academic databases. Using a quantitative approach, we selected a random Crossref sample of more than 115k records, which was then searched in seven databases (Dimensions, Google Scholar, Microsoft Academic, OpenAlex, Scilit, Semantic Scholar, and The Lens). Seven characteristics were analyzed (abstract, access, bibliographic info, document type, publication date, language, and identifiers), to observe fields that describe this information, the completeness rate of these fields, and the agreement among databases. The results show that academic search engines (Google Scholar, Microsoft Academic, and Semantic Scholar) gather less information and have a low degree of completeness. Conversely, third-party databases (Dimensions, OpenAlex, Scilit, and The Lens) have more metadata quality and a higher completeness rate. We conclude that academic search engines lack the ability to retrieve reliable descriptive data by crawling the Web, while the main problem of third-party databases is the loss of information derived from integrating different sources.

https://doi.org/10.1162/qss_a_00286

Author: Charles W. Bailey. Posted on February 14, 2024 (modified February 13, 2024). Categories: Metadata, Search Engines and Discovery Systems

"Are Open Bibliometric Data Sources Better than Proprietary Ones?"


The authors concluded that the resulting data based on open bibliometric sources was more comprehensive and of better quality than the data based on sources provided by the commercial provider. In addition, the use of open source data allows scidecode to comply with the requirement set by cOAlition S to openly licence their results, which would not necessarily be the case if they had used a commercial provider.

http://tinyurl.com/2btc8a76

Author: Charles W. Bailey. Posted on January 26, 2024 (modified January 25, 2024). Categories: Metadata, Open Access, Scholarly Metrics

"Artificial Intelligence and Music Discovery"


Our article seeks to contextualize this new technology within music discovery research by defining the historical underpinnings of artificial intelligence in computer music and music information retrieval. We also identify four areas in librarianship research and practice where applications of this technology should be explored; augmenting library interactions with conversational search, creating tools to assist with metadata clean up and creation, integrating music holdings into commercial discovery systems, and improving music discovery platforms.

https://doi.org/10.1080/10588167.2023.2287924

| Artificial Intelligence and Libraries Bibliography |

Author: Charles W. Bailey. Posted on December 11, 2023 (modified December 10, 2023). Categories: Artificial Intelligence/Robots, Metadata, OPACs

Paywall: "FAST Headings in MODS: Michigan State University Libraries Digital Repository Case Study"


Since 2016, the [MSUL] digital repository has been using Faceted Application of Subject Terminology (FAST) subject headings as its primary subject vocabulary. . . The MSUL FAST use case presents some challenges that are not addressed by existing MARC-focused FAST tools. This paper will outline the MSUL digital repository team’s justification for including FAST headings in the digital repository as well as workflows for adding FAST headings to Metadata Object Description Schema (MODS) metadata, their maintenance, and utilization for discovery.

https://doi.org/10.1080/01639374.2023.2213708

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

Author: Charles W. Bailey. Posted on November 1, 2023 (modified October 31, 2023). Categories: ARL Libraries, Digital Repositories, E-Prints, Institutional Repositories, Metadata, Open Access, Self-Archiving

"OER Discovery: Ensuring that OER Rise to the Top"


This paper discusses the challenges of ensuring discoverability of Open Educational Resources (OER) in the absence of clear standards for sharing them. Despite the efforts of librarians and instructors to create a wealth of OER, discoverability remains limited and often relegated to a list of links on a LibGuide. The authors address this challenge by highlighting technical and descriptive barriers to OER discoverability. The authors then describe the development of a hybrid metadata standard for OER and its deployment through the institutional repository. Although provisional, this approach ensures that OER records can be adapted to future metadata standards and exported to third-party repositories. This paper underscores the importance of developing an effective metadata standard for OER to ensure their discoverability for learners and educators.

https://doi.org/10.13001/joerhe.v2i1.7879

Author: Charles W. Bailey. Posted on October 20, 2023 (modified October 19, 2023). Categories: Institutional Repositories, Metadata, Open Access, Open Educational Resources

Software Metadata Recommended Format Guide, Version 1.1.0


The Software Metadata Recommended Format Guide (SMRF) describes and represents metadata elements identified by the Software Preservation Network that are appropriate to describe software materials in the context of a wide range of collections. SMRF aims to be adaptable, so that it can be used in different contexts and systems across libraries, museums, archives, and repositories. It is not meant to be exhaustive; instead SMRF is meant to provide a framework for cultural institutions and collections to determine which metadata to capture, and how to capture it, for their own collections.

https://tinyurl.com/mr3hecys

Author: Charles W. Bailey. Posted on October 19, 2023 (modified October 18, 2023). Categories: Metadata, Software Curation and Preservation

"RDA National PID Strategies Guide and Checklist: Final Outputs and Supporting Materials Available"


The National PID Strategies Working Group was endorsed to explore how Persistent Identifiers (PIDs) form part of national policy and research infrastructure implementation frameworks. . . .

The findings of the 18 months of the WG have been that:

  1. National PID strategies are on the rise, evidenced in the case studies collected by the WG and the growing momentum of discussions at RDA Plenaries and other international fora.
  2. The development of national PID strategies is a relatively new phenomenon and many countries are in the very early stages. In fact, many have more of a national approach that they are seeking to transform into a strategy.
  3. All national PID strategies are currently in development and therefore subject to a high degree of change. During the course of the WG, nine case studies were collected and several of these needed to be updated prior to the Group’s final output due to changes that had taken place in those countries.
  4. There is no single "cookie cutter" approach to developing a national PID strategy. Critical components include:
    • A clear value proposition with use cases
    • A group or organisation that is responsible for driving strategy development
    • An open, inclusive, iterative process that involves all stakeholders
    • An accompanying roadmap that outlines practical steps for implementation
  5. International PID providers such as ORCID and DataCite have begun to actively engage with national PID strategies and the RDA National PID Strategies WG provides a focal point for furthering this engagement.

https://tinyurl.com/559x4dmy

Author: Charles W. Bailey. Posted on October 13, 2023 (modified October 12, 2023). Categories: Metadata

"Developing a Preservation Metadata Standard for Languages"


We have so many languages to communicate with others as humans. There are approximately 7000 languages in the world, and many are becoming extinct for a variety of reasons. In order to preserve and prevent the extinction of these languages, we need to preserve them. One way of preservation is to have a preservation metadata for languages. Metadata is data about data. Metadata is required for item description, preservation, and retrieval. There are various types of metadata, e.g., descriptive, administrative, structural, preservation, etc. After the literature study, the authors observed that there is a lack of study on the preservation metadata for language. Consequently, the purpose of this paper is to demonstrate the need for language preservation metadata. We found some archaeological metadata standards for this purpose, and after applying inclusion and exclusion criteria, we chose three archaeological metadata standards, namely: Archaeon-core, CARARE, and LIDO (Lightweight Information Describing Objects) for mapping metadata.

https://arxiv.org/abs/2310.04155

Author: Charles W. Bailey. Posted on October 10, 2023 (modified October 9, 2023). Categories: Digital Curation & Digital Preservation, Metadata

"From ChatGPT to CatGPT: The Implications of Artificial Intelligence on Library Cataloging "


This paper explores the potential of language models such as ChatGPT to transform library cataloging. Through experiments with ChatGPT, the author demonstrates its ability to generate accurate MARC records using RDA and other standards such as the Dublin Core Metadata Element Set. These results demonstrate the potential of ChatGPT as a tool for streamlining the record creation process and improving efficiency in library settings. The use of AI-generated records, however, also raises important questions related to intellectual property rights and bias. The paper reviews recent studies on AI in libraries and concludes that further research and development of this innovative technology is necessary to ensure its responsible implementation in the field of library cataloging.

https://tinyurl.com/fd8xjmnt

Author: Charles W. Bailey. Posted on September 20, 2023 (modified September 27, 2023). Categories: Artificial Intelligence/Robots, Metadata

Paywall: "Images as Metadata: A New Perspective for Describing Research Data"


Through studies and work developed over the last few years, we propose a new approach to description, where images can have a preponderant role in the description of data, assuming the role of metadata. We present several pieces of evidence, point out their challenges and determine the opportunities this new perspective can have in the research. Images have specific characteristics that can be leveraged in improving data description. Historical evidence establish that images have always been used and produced in research, yet their representational ability has never been harnessed to describe data and give more context to the scientific process.

https://doi.org/10.1080/19386389.2023.2252722

Author: Charles W. Bailey. Posted on September 15, 2023 (modified September 14, 2023). Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

"Global Visibility of Publications through Digital Object Identifiers"


This brief research report analyzes the availability of Digital Object Identifiers (DOIs) worldwide, highlighting the dominance of large publishing houses and the need for unique persistent identifiers to increase the visibility of publications from developing countries. The study reveals that a considerable amount of publications from developing countries are excluded from the global flow of scientific information due to the absence of DOIs, emphasizing the need for alternative publishing models. The authors suggest that the availability of DOIs should receive more attention in scholarly communication and scientometrics, contributing to a necessary debate on DOIs relevant for librarians, publishers, and scientometricians.

https://doi.org/10.3389/frma.2023.1207980

Author: Charles W. Bailey. Posted on September 11, 2023 (modified September 10, 2023). Categories: Metadata, Publishing, Scholarly Books, Scholarly Journals

"PreprintResolver: Improving Citation Quality by Resolving Published Versions of ArXiv Preprints using Literature Databases"


The growing impact of preprint servers enables the rapid sharing of time-sensitive research. Likewise, it is becoming increasingly difficult to distinguish high-quality, peer-reviewed research from preprints. Although preprints are often later published in peer-reviewed journals, this information is often missing from preprint servers. To overcome this problem, the PreprintResolver was developed, which uses four literature databases (DBLP, SemanticScholar, OpenAlex, and CrossRef / CrossCite) to identify preprint-publication pairs for the arXiv preprint server. . . . Experiments were performed on a sample of 1,000 arXiv-preprints from the research field of computer science and without any publication information. . . . The results show that the PreprintResolver was able to resolve 603 out of 1,000 (60.3 %) arXiv-preprints from the research field of computer science and without any publication information. . . . In conclusion the PreprintResolver is suitable for individual, manually reviewed requests, but less suitable for bulk requests. The PreprintResolver tool (available from 2023-08-01) and source code (accessed 2023-07-19) is available online.

https://arxiv.org/abs/2309.01373
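
The abstract does not say how preprint-publication pairs are matched, so the following Python sketch shows only one generic way such a resolver could work: normalize titles and accept the closest candidate above a similarity threshold. The function names, the 0.9 threshold, and the candidate records are all assumptions made for illustration; this is not PreprintResolver's algorithm.

```python
import re
from difflib import SequenceMatcher
from typing import Optional


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation and whitespace so trivial differences vanish."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def best_match(preprint_title: str, candidates: list[dict], threshold: float = 0.9) -> Optional[dict]:
    """Return the candidate whose title is most similar, if it clears the threshold."""
    target = normalize(preprint_title)
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, target, normalize(cand["title"])).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


def resolution_rate(resolved: int, total: int) -> float:
    """Share of preprints for which a published version was found."""
    return resolved / total if total else 0.0


# Invented candidate records, as a literature database might return them.
candidates = [
    {"title": "Deep Learning for Citation Matching.", "doi": "10.5555/example.9"},
    {"title": "Unrelated Work on Graphs", "doi": "10.5555/example.8"},
]
match = best_match("Deep learning for citation matching", candidates)
no_match = best_match("Completely different title here", candidates)
```

With the figures reported in the abstract, `resolution_rate(603, 1000)` gives the stated 60.3%. Title matching alone produces false positives, which fits the abstract's caution that results suit manually reviewed requests better than bulk ones.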

Author: Charles W. Bailey. Posted on September 8, 2023 (modified September 7, 2023). Categories: E-Prints, Metadata, Open Access, Open Science, Search Engines and Discovery Systems, Self-Archiving

Digital Preservation Coalition: Choosing a Persistent Identifier Type for Your Digital Objects


This report is intended to help you get started using persistent identifiers (PIDs) for digital objects. Its intended audience is people who are involved in digital preservation in heritage and research organizations. The report answers questions such as: "What are persistent identifiers?", "Why are they important?", "Which type should you choose?", "Are you ready for them?", and "How should you implement them?". The report does not specifically cover persistent identifiers for people, organizations, grants, workflows, and so on, but some of the same general concepts would also apply.

http://doi.org/10.7207/twgn23-02

Author: Charles W. Bailey. Posted on September 8, 2023 (modified September 7, 2023). Categories: Digital Curation & Digital Preservation, Metadata

"Metadata Standard for Continuous Preservation, Discovery, and Reuse of Research Data in Repositories by Higher Education Institutions: A Systematic Review"


This systematic review synthesised existing research papers that explore the available metadata standards to enable researchers to preserve, discover, and reuse research data in repositories. This review provides a broad overview of certain aspects that must be taken into consideration when creating and assessing metadata standards to enhance research data preservation discoverability and reusability strategies. Research papers on metadata standards, research data preservation, discovery and reuse, and repositories published between January 2003 and April 2023 were reviewed from a total of five databases. The review retrieved 1597 papers, and 13 papers were selected in this review. We revealed 13 research articles that explained the creation and application of metadata standards to enhance preservation, discovery, and reuse of research data in repositories. Among them, eight presented the three main types of metadata, descriptive, structural, and administrative, to enable the preservation of research data in data repositories. We noted limited evidence on how these metadata standards can be used to enhance the discovery and reuse of research data in repositories to enable the preservation, discovery, and reuse of research data in repositories. No reviews indicated specific higher education institutions employing metadata standards for the research data created by their researchers. Repository designs and a lack of expertise and technology know-how were among the challenges identified from the reviewed papers. The review has the potential to influence professional practice and decision-making by stakeholders, including researchers, students, librarians, information communication technologists, data managers, private and public organisations, intermediaries, research institutions, and non-profit organizations.

https://doi.org/10.3390/info14080427

Author: Charles W. Bailey. Posted on August 10, 2023 (modified August 9, 2023). Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata, Open Access, Open Science

Paywall: "Proposal for the Publication of Linked Open Bibliographic Data"


The objective of this paper is to analyze the publishing of bibliographic data such as LOD, having as a product the elaboration of theoretical-methodological recommendations for the publication of these data, in an approach based on the ten best practices for publishing LOD, from the World Wide Web Consortium. The starting point was the conduction of a Systematic Review of Literature, where initiatives to publish bibliographic data such as LOD were identified. An empirical study of these institutions was also conducted.

https://doi.org/10.1080/01639374.2023.2234358

Author: Charles W. Bailey. Posted on August 4, 2023 (modified August 3, 2023). Categories: Metadata, Open Access

"Signing Data Citations Enables Data Verification and Citation Persistence"


Increasingly, digital datasets are being published with assigned identifiers, then cited in papers as the basis for repeatable experiments. To help future readers find and verify data, customary citations can be extended with content signatures, which can be introduced without having to replace existing identifier such as DOIs and ARKs. That is, signatures can be seen as complementary identifiers to help keep specific versions of cited data findable and identifiable as they evolve and change locations. For example, if a DOI identifies an evolving dataset, rather than a fixed version — i.e., content drift is expected — the DOI can safely be cited for the sake of attribution, metadata linking, and citation statistics (e.g., by Crossref (https://www.crossref.org) and DataCite (https://datacite.org)), while the content signature helps the reader find the exact content that was cited, possibly with assistance from metadata linked to the DOI. Additionally, a citation that includes both the DOI (for example) and content signature of a dataset creates a fixed mapping between the two identifiers. Then, unintentional content drift by the DOI can be detected and reported, and an alternative location may potentially be discovered by consulting public content signature registries.

https://doi.org/10.1038/s41597-023-02230-y
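
The idea of pairing a DOI with a content signature can be illustrated with a cryptographic hash. The Python sketch below assumes SHA-256 as the signature and uses a hypothetical `SignedCitation` type with a placeholder DOI; the paper's own signature format and any registry lookups are not reproduced here.

```python
import hashlib
from dataclasses import dataclass


def content_signature(data: bytes) -> str:
    """Hex SHA-256 digest of the exact bytes that were cited."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class SignedCitation:
    doi: str        # used for attribution and metadata linking
    signature: str  # pins the exact version that was cited


def has_drifted(citation: SignedCitation, retrieved: bytes) -> bool:
    """True when the bytes now behind the DOI differ from the cited bytes."""
    return content_signature(retrieved) != citation.signature


# Invented dataset contents and a placeholder DOI.
original = b"site,temp_c\nA,12.1\nB,13.4\n"
citation = SignedCitation(
    doi="10.5555/example.dataset",
    signature=content_signature(original),
)
later_version = original + b"C,11.0\n"
```

Here `has_drifted(citation, original)` is false and `has_drifted(citation, later_version)` is true: the DOI still credits the dataset, while the signature tells a reader whether the bytes retrieved today are the ones that were cited.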

Author: Charles W. Bailey. Posted on July 11, 2023 (modified July 10, 2023). Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata

Paywall: "Interoperability of Open Science Metadata: What About the Reality?"


This paper leads to identify complementary ways to improve dataset interoperability: (1) to adapt mapping algorithms to the issues raised by metadata schema matching; (2) to adapt metadata schemata, for instance by sharing a core vocabulary and/or reusing existing standards; (3) to combine various trends in a more complex interoperability approach that would also make available and operational the (RDA) crosswalks between schemata and that would promote good practices in metadata labeling and documentation.

https://doi.org/10.1007/978-3-031-33080-3_28

Author: Charles W. Bailey. Posted on June 1, 2023 (modified May 31, 2023). Categories: Metadata, Open Access, Open Science

Common Scholarly Communication Infrastructure Landscape Review


Scholarly communication is a complicated sector, with numerous participants and multiple mechanisms for communicating and reviewing materials created in an increasing variety of formats by researchers across the globe.[1] In turn, the researcher who seeks to use the products of this system wishes to discover, access, and use relevant and trustworthy materials as effortlessly as possible. The work of driving efficiency into this complex sector while bringing its multiple strands together seamlessly for the reader (or, increasingly, for a computational user) rests on a foundation of infrastructure, much of it shared across multiple publishers. In this landscape review, we seek to provide a high-level overview of the shared infrastructure that supports scholarly communication.

https://doi.org/10.18665/sr.318775

Author: Charles W. Bailey. Posted on April 26, 2023 (modified April 25, 2023). Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, ERM/Discovery Systems, Metadata, Open Access and Other Publishing License Agreements, Publishing, Reports and White Papers, Scholarly Metrics

"The Viability of Using an Open Source Locally Hosted AI for Creating Metadata in Digital Image Collections"


Artificial intelligence (AI) can support metadata creation for images by generating descriptions, titles, and keywords for digital collections in libraries. Many AI options are available, ranging from cloud-based corporate software solutions, including Microsoft Azure Custom Vision and Google Cloud Vision, to open-source locally hosted software packages. This case study examines the feasibility of deploying the open-source, locally hosted AI software, Sheeko, and the accuracy of the descriptions generated for images using two of the pre-trained models. The study aims to ascertain if Sheeko’s AI would be a viable solution for producing metadata in the form of descriptions, or titles for digital collections in Libraries and Cultural Resources at the University of Calgary.

https://journal.code4lib.org/articles/17186

Author: Charles W. Bailey. Posted on April 24, 2023 (modified April 23, 2023). Categories: Artificial Intelligence/Robots, Digital Archives and Special Collections, Digital Media, Metadata

Digital Scholarship Releases New PDF Versions of Its Bibliographies

This spring, Digital Scholarship’s bibliographies in the HTML format were reformatted as single page files with internal navigation. This included all bibliographies that were in HTML format only as well as the HTML versions of paperback books. These new PDFs are in a 12 point font and are designed for printing; however, they also have live links for immediate access. There were no content changes. For a list of all Digital Scholarship publications, see the site map.

Academic Library as Scholarly Publisher Bibliography, v. 3

Altmetrics Bibliography

Digital Curation and Preservation Bibliography, v. 2

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works

Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, 2012 Supplement

Digital Curation Resource Guide

Electronic Theses and Dissertations Bibliography, v.7

E-science and Academic Libraries Bibliography

Google Books Bibliography, v. 7

Institutional Repository Bibliography, v. 4

Open Access Bibliography: Liberating Scholarly Literature with E-Prints and Open Access Journals

Open Access Journals Bibliography

Open Access Webliography

Research Data Curation Bibliography, v. 10

Research Data Publication and Citation Bibliography

Research Data Sharing and Reuse Bibliography

Scholarly Electronic Publishing Bibliography, v. 80

Transforming Peer Review Bibliography

Transforming Scholarly Publishing through Open Access: A Bibliography

Author: Charles W. Bailey. Posted on April 5, 2023 (modified April 6, 2023). Categories: Bibliographies, Copyright, Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Digital Scholarship Publications, E-Books, Institutional Repositories, Metadata, Open Access, Open Access and Other Publishing License Agreements, Open Science, Publishing, Research Libraries, Scholarly Books, Scholarly Communication, Scholarly Journals, University Presses

"Scaling Identifiers and their Metadata to Gigascale: An Architecture to Tackle the Challenges of Volume and Variety"


Persistent identifiers are applied to an ever-increasing variety of research objects, including software, samples, models, people, instruments, grants, and projects, and there is a growing need to apply identifiers at a finer and finer granularity. Unfortunately, the systems developed over two decades ago to manage identifiers and the metadata describing the identified objects no longer scale. Communities working with physical samples have grappled with the challenges of the increasing volume, variety, and variability of identified objects for many years. To address this dual challenge, the IGSN 2040 project explored how metadata and catalogues for physical samples could be shared at the scale of billions of samples across an ever-growing variety of users and disciplines. In this paper, we focus on how we scale identifiers and their describing metadata to billions of objects and who the actors involved with this system are. Our analysis of these requirements resulted in the definition of a minimum viable product and the design of an architecture that not only addresses the challenges of increasing volume and variety but, more importantly, is easy to implement because it reuses commonly used Web components. Our solution is based on a Web architectural model that utilises Schema.org, JSON-LD, and sitemaps. Applying these commonly used architectural patterns on the internet allows us to not only handle increasing variety but also enable better compliance with the FAIR Guiding Principles.

http://doi.org/10.5334/dsj-2023-005


Author: Charles W. Bailey | Posted on: April 5, 2023; April 4, 2023 | Categories: Data Curation, Open Data, and Research Data Management, Digital Curation & Digital Preservation, Metadata


DigitalKoans Overview

DigitalKoans provides news and commentary on digital copyright, digital curation, digital repository, open access, research data management, scholarly communication, and other digital information issues. It is also available via an RSS feed.

A Digital Scholarship publication. Digital Scholarship is a noncommercial publisher and it accepts no advertising. Charles W. Bailey, Jr. is the publisher of Digital Scholarship.

Copyright © 2005-2025 by Charles W. Bailey, Jr. This work is licensed under a Creative Commons Attribution 4.0 International License.


