"Computational Intelligence to Aid Text File Format Identification"

Santhilata Kuppili Venkata and Alex Green have self-archived "Computational Intelligence to Aid Text File Format Identification."

Here's an excerpt:

One of the challenges faced in digital preservation is to identify the file types when the files can be opened with simple text editors and their extensions are unknown. The problem gets complicated when the file passes through the test of human readability, but would not make sense how to put to use! The Text File Format Identification (TFFI) project was initiated at The National Archives to identify file types from plain text file contents with the help of computing intelligence models. A methodology that takes help of AI and machine learning to automate the process was successfully tested and implemented on the test data. The prototype developed as a proof of concept has achieved up to 98.58% of accuracy in detecting five file formats.

Research Data Curation Bibliography, Version 10 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Digital Curation at Work: Modeling Workflows for Digital Archival Materials"

Colin Post et al. have self-archived "Digital Curation at Work: Modeling Workflows for Digital Archival Materials."

Here's an excerpt:

This paper describes and compares digital curation workflows from 12 cultural heritage institutions that vary in size, nature of digital collections, available resources, and level of development of digital curation activities.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Copyright and Digital Collections: A Data Driven Roadmap for Rights Statement Success"

Sara R. Benson and Hannah Stitzlein have published "Copyright and Digital Collections: A Data Driven Roadmap for Rights Statement Success" in ACRL 2019 Proceedings.

Here's an excerpt:

The two questions that ultimately guided this research were: What are the challenges that metadata practitioners face when implementing standardized rights statements? And, for institutions that have implemented standardized rights statements, what made them successful? The authors began the investigation to fill in the practical gaps of the previous studies, and to determine if barriers to implementing standardized rights statements was due to a lack of copyright knowledge and/or access to a copyright professional, or if there were resource barriers limiting the ability to begin implementation.

"Never Best Practices: Born-Digital Audiovisual Preservation"

Julia Kim, Rebecca Fraimow and Erica Titkemeyer have published "Never Best Practices: Born-Digital Audiovisual Preservation" in Code4Lib Journal.

Here's an excerpt:

The sheer conditionality of [born-digital audiovisual file preservation] recommendations leaves practitioners mired in a sea of questions as they struggle to set realistically adhered to policies for their institutions. Should files be accepted as-is, or transcoded to an open and standardized format? When is transcoding to a preservation file specification worth the extra storage space and staff time? If transcoding, what are the ideal target specifications? When developing policies and workflows for batch transcoding a variety of different formats, each with different technical specifications, how do you make sure that preservation files maintain all the perceptible, let alone "significant" characteristics of the original files?

This paper presents case studies from three institutions—a university special collections library, a federal government department, and a public broadcasting station—demonstrating how the factors listed above might lead to 'tiered' processing and decision-making around 'good enough' practices for the preservation of born-digital a/v files.

"The Reconfiguration of the Archive as Data to Be Mined"

Michael Moss et al. have published "The Reconfiguration of the Archive as Data to Be Mined" in Archivaria.

Here's an excerpt:

This article discusses changing practices brought about by the move to online digital records, the impact these are having on the way history is written, and the way in which archivists are responding (and will need to respond in the future). We argue that digital administrative records are surrounded by other sources—online newspapers and social media—and that the huge volume of digital records alters the way historians read material. This will require a shift in approach from archivists, who will need to view archives as collections of data to be mined and not as texts to be read. . . . While grappling with these issues, archivists will also need to recognize that the future record will be as much about sound and vision as about text.

Academic Library as Scholarly Publisher Bibliography | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Digital Data Archives as Knowledge Infrastructures: Mediating Data Sharing and Reuse"

Christine L. Borgman et al. have self-archived "Digital Data Archives as Knowledge Infrastructures: Mediating Data Sharing and Reuse."

Here's an excerpt:

Digital archives are the preferred means for open access to research data. They play essential roles in knowledge infrastructures—robust networks of people, artifacts, and institutions—but little is known about how they mediate information exchange between stakeholders. We open the "black box" of data archives by studying DANS, the Data Archiving and Networked Services institute of The Netherlands, which manages 50+ years of data from the social sciences, humanities, and other domains. Our interviews, weblogs, ethnography, and document analyses reveal that a few large contributors provide a steady flow of content, but most are academic researchers who submit datasets infrequently and often restrict access to their files. Consumers are a diverse group that overlaps minimally with contributors. Archivists devote about half their time to aiding contributors with curation processes and half to assisting consumers. Given the diversity and infrequency of usage, human assistance in curation and search remains essential. DANS' knowledge infrastructure encompasses public and private stakeholders who contribute, consume, harvest, and serve their data—many of whom did not exist at the time the DANS collections originated—reinforcing the need for continuous investment in digital data archives as their communities, technologies, and services evolve.

"ARCHANGEL: Trusted Archives of Digital Public Documents"

John Collomosse et al. have self-archived "ARCHANGEL: Trusted Archives of Digital Public Documents."

Here's an excerpt:

We present ARCHANGEL; a de-centralised platform for ensuring the long-term integrity of digital documents stored within public archives. Document integrity is fundamental to public trust in archives. Yet currently that trust is built upon institutional reputation—trust at face value in a centralised authority, like a national government archive or University. ARCHANGEL proposes a shift to a technological underscoring of that trust, using distributed ledger technology (DLT) to cryptographically guarantee the provenance, immutability and so the integrity of archived documents. We describe the ARCHANGEL architecture, and report on a prototype of that architecture build over the Ethereum infrastructure. We report early evaluation and feedback of ARCHANGEL from stakeholders in the research data archives space.

"Digital Archives as Big Data"

Luis Martinez-Uribe has self-archived "Digital Archives as Big Data."

Here's an excerpt:

Digital archives contribute to Big data. Combining social network analysis, coincidence analysis, data reduction, and visual analytics leads to better characterize topics over time, publishers' main themes and best authors of all times, according to the British newspaper The Guardian and from the 3 million records of the British National Bibliography.

Research Data Curation Bibliography, Version 8 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Massive Newspaper Migration—Moving 22 Million Records from CONTENTdm to Solphal"

Alan Witkowski et al. have published "Massive Newspaper Migration—Moving 22 Million Records from CONTENTdm to Solphal" in D-Lib Magazine.

Here's an excerpt:

Utah Digital Newspapers is a pioneering digital newspapers program at the University of Utah J. Willard Marriott Library. Recently, a small project team completed a successful migration away from CONTENTdm onto a home-grown system called Solphal, built using open-source applications. The migration process is detailed along with examples of scripts used to prepare and enhance metadata. Transitioning away from a limiting vendor-based solution to a home-grown system has enabled the Utah Digital Newspapers program to be more responsive to user requests as well as realizing greater efficiencies in hardware and software. The platform has opened up new possibilities for the future as the collection continues to grow.

Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap