"Current Status of Scientific Data Curation Research and Practices in Mainland China"

Shiyan Ou and Yu Zhou have published "Current Status of Scientific Data Curation Research and Practices in Mainland China" in LIBRES.

Here's an excerpt:

With the rapid growth in the body of scientific data, scientific research depends more and more on finding theories and knowledge from the data, and thus data-intensive scientific discovery has become the fourth paradigm of scientific research. Therefore, it is urgent to develop and adopt methods to support the collection, collation, preservation and utilization of scientific data. This paper provides an overview of scientific data curation research and practices in mainland China. Firstly, it reviews Chinese research articles on data curation and outlines the research status and progress in this area. Secondly, it surveys existing scientific data repositories or platforms in mainland China, and analyzes the gaps between China's and other countries' data curation practices.

Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"The Digitized Archival Document Trustworthiness Scale"

Devan Ray Donaldson has published "The Digitized Archival Document Trustworthiness Scale" in the International Journal of Digital Curation.

Here's an excerpt:

Designated communities are central to validation of preservation. If a designated community is able to understand and use information found within a digital repository, the assumption is that the information has been properly preserved. As judging the trustworthiness of information requires at least some level of understanding of that information, this paper presents results of a study aimed at developing a tool for measuring designated community members' perceptions of trustworthiness for preserved information found within a digital repository. The study focuses on genealogists at the Washington State Digital Archives who routinely interact with digitized genealogical records, including digitized marriage, death, and birth records. Results of the study include construction of an original Digitized Archival Document Trustworthiness Scale (DADTS). DADTS is a ready-made tool for digital curators to use to measure the trustworthiness perceptions of their designated community members. Implications of this study include the feasibility of engaging members of a designated community in the construction of a scale for measuring trustworthiness perception, thereby providing deeper insight into the understandability and usability of preserved information by that designated community.

Two Reports on Disk Image Formats from the Harvard Library Digital Preservation Program

The Harvard Library Digital Preservation Program has released two reports: Disk Image Content Model and Metadata Analysis ACTIVITY 1: Comparative Format Matrix Analysis, and Disk Image Content Model and Metadata Analysis ACTIVITY 2: Metadata Analysis.

Here's an excerpt from the announcement:

Harvard Library collections include a variety of computer media that will be imaged using forensic disk imaging techniques and preserved in the Library's preservation and access repository—the Digital Repository Service (DRS). As a first step towards providing support for this material in the DRS, the Library contracted AVPreserve in late 2015 to assist with the analysis. The goals of the analysis were:

  • Recommended disk image formats to accept and prefer for the DRS
  • Recommended technical metadata schema(s) to use for disk image file formats
  • DRS content models for these objects
  • Recommendations for enhancing Harvard Library's FITS tool to better support these objects

See also: Disk Image Format Matrix spreadsheet.

"The Durability and Fragility of Knowledge Infrastructures: Lessons Learned from Astronomy"

Christine L. Borgman, Peter T. Darch, Ashley E. Sands, and Milena S. Golshan have self-archived "The Durability and Fragility of Knowledge Infrastructures: Lessons Learned from Astronomy."

Here's an excerpt:

Infrastructures are not inherently durable or fragile, yet all are fragile over the long term. Durability requires care and maintenance of individual components and the links between them. Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries. Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy. Methods include interviews, ethnography, and document analysis. As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components. Several features of data practices in astronomy contribute to the fragility of that infrastructure. These include different archiving practices between ground- and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data. Infrastructure components are tightly coupled, based on international agreements. However, the durability of these infrastructures relies on much invisible work—cataloging, metadata, and other labor conducted by information professionals. Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.

"Happy Beta Release Day, Omeka S!!"

The Roy Rosenzweig Center for History and New Media, George Mason University has released "Happy Beta Release Day, Omeka S!!."

Here's an excerpt:

Omeka S is the next-generation, open source web-publishing platform that is fully integrated into the scholarly communications ecosystem and designed to serve the needs of medium to large institutional users who wish to launch, monitor, and upgrade many sites from a single installation.

Though Omeka S is a completely new software package, it shares the same goals and principles of Omeka Classic that users have come to love: a commitment to cost-effective deployment and design, an intuitive user interface, open access to data and resources, and interoperability through standardized data.

Created with funding from The Andrew W. Mellon Foundation and the Institute of Museum and Library Services, Omeka S is engineered to ease the burdens of administrators who want to make it possible for their end-user communities to easily build their own sites that showcase digital cultural heritage materials.

See also: Omeka S Beta Technical Specs.

"Provenance in Support of ANDS’ Four Transformations"

Andrew E. Treloar and Mingfang Wu have published "Provenance in Support of ANDS' Four Transformations" in the International Journal of Digital Curation.

Here's an excerpt:

This article introduces the provenance activities that are being carried out at the Australia National Data Services (ANDS). Since its beginning, ANDS has been promoting four data transformations so that Australia's research data become more valuable and reusable by researchers. Among many other activities that enable the four transformations, ANDS has been encouraging ANDS partners to capture and describe rich context at the time when a data collection is created. In 2015, ANDS funded a number of external projects that had provenance components. In addition, ANDS is working on the interoperability between the schema that is used by the ANDS research data registration and discovery service – Research Data Australia (RDA) – and the W3C recommended provenance standard, Provenance Ontology (PROV-O), and investigating how to enrich the schema to access provenance information. The article concludes by discussing the lessons we learnt and our future planned activity.

"OSS4EVA: Using Open-Source Tools to Fulfill Digital Preservation Requirements"

Marty Gengenbach et al. have published "OSS4EVA: Using Open-Source Tools to Fulfill Digital Preservation Requirements" in Code4Lib Journal.

Here's an excerpt:

This paper builds on the findings of a workshop held at the 2015 International Conference on Digital Preservation (iPRES), entitled, "Using Open-Source Tools to Fulfill Digital Preservation Requirements" (OSS4PRES hereafter). This day-long workshop brought together participants from across the library and archives community, including practitioners, proprietary vendors, and representatives from open-source projects. The resulting conversations were surprisingly revealing: while OSS' significance within the preservation landscape was made clear, participants noted that there are a number of roadblocks that discourage or altogether prevent its use in many organizations. Overcoming these challenges will be necessary to further widespread, sustainable OSS adoption within the digital preservation community. This article will mine the rich discussions that took place at OSS4PRES to (1) summarize the workshop's key themes and major points of debate, (2) provide a comprehensive analysis of the opportunities, gaps, and challenges that using OSS entails at a philosophical, institutional, and individual level, and (3) offer a tangible set of recommendations for future work designed to broaden community engagement and enhance the sustainability of open source initiatives, drawing on both participants' experience as well as additional research.

"Cobweb: Collaborative Collection Development for Web Archives"

The California Digital Library has released "Cobweb: Collaborative Collection Development for Web Archives."

Here's an excerpt:

A partnership between the CDL, Harvard Library, and UCLA Library has been awarded funding from IMLS to create Cobweb, a collaborative collection development platform for web archiving, https://github.com/CobwebOrg/cobweb.

See also the grant proposal.

"Organizational Assessment Frameworks for Digital Preservation: A Literature Review and Mapping"

Emily Maemura et al. have self-archived "Organizational Assessment Frameworks for Digital Preservation: A Literature Review and Mapping."

Here's an excerpt:

As the field of digital preservation matures, there is an increasing need to systematically assess an organization's abilities to achieve its digital preservation goals, and a wide variety of assessment tools have been created for this purpose. To map the landscape of research in this area, evaluate the current maturity of knowledge on this central question in DP and provide direction for future research, this paper reviews assessment frameworks in digital preservation through a systematic literature search and categorizes the literature by type of research. The analysis shows that publication output around assessment in digital preservation has increased markedly over time, but most existing work focuses on developing new models rather than rigorous evaluation and validation of existing frameworks.

"From Plan to Action: Successful Data Management Plan Implementation in a Multidisciplinary Project"

Margaret H. Burnette, Sarah C. Williams, and Heidi J. Imker have published "From Plan to Action: Successful Data Management Plan Implementation in a Multidisciplinary Project" in the Journal of eScience Librarianship.

Here's an excerpt:

A case study was designed to gather insights from the research group through semi-structured interviews. Questions focused on which of the recommended data management strategies were adopted and how those strategies affected the project in terms of cost, time, effectiveness, and long-term data use.

"Campus Support Systems for Technical Researchers Navigating Big Data Ethics"

Bonnie Tijerina has published "Campus Support Systems for Technical Researchers Navigating Big Data Ethics" in EDUCAUSE Review.

Here's an excerpt:

A team at Data & Society recently conducted interviews and campus visits with computer science researchers and librarians at eight U.S. universities to examine the role of research librarians in assisting technical researchers as they navigate emerging issues of privacy, ethics, and equitable access to data at different phases of the research process.

"Research Data Management in Social Sciences and Humanities: A Survey at the University of Lille (France)"

Joachim Schöpfel and Hélène Prost have published "Research Data Management in Social Sciences and Humanities: A Survey at the University of Lille (France)" in LIBREAS.

Here's an excerpt:

The paper presents results from a campus-wide survey at the University of Lille (France) on research data management in social sciences and humanities. The survey received 270 responses, equivalent to 15% of the whole sample of scientists, scholars, PhD students, administrative and technical staff (research management, technical support services); all disciplines were represented. The responses show a wide variety of practice and usage. The results are discussed regarding job status and disciplines and compared to other surveys. Four groups can be distinguished, i.e. pioneers (20-25%), motivated (25-30%), unaware (30%) and reluctant (5-10%). Finally, the next steps to improve the research data management on the campus are presented.

"The Pathways of Research Software Preservation: An Educational and Planning Resource for Service Development"

Fernando Rios has published "The Pathways of Research Software Preservation: An Educational and Planning Resource for Service Development" in D-Lib Magazine.

Here's an excerpt:

Research communities, funders, publishers, and academic libraries have put much effort towards ensuring that research data are preserved. However, the same level of attention has not been given to the associated software used to process and analyze it. As a guide to those tasked with preserving research outputs, a novel visual representation of preservation approaches relevant to research software, termed the Pathways of Research Software Preservation, is presented. The Pathways are discussed in the context of service development within the Data Management Services group at Johns Hopkins University.

"Towards Narrowing the Curation Gap—Theoretical Considerations and Lessons Learned from Decades of Practice"

Ana Sesartić, Andreas Fischlin, and Matthias Töwe have published "Towards Narrowing the Curation Gap-Theoretical Considerations and Lessons Learned from Decades of Practice" in the ISPRS International Journal of Geo-Information.

Here's an excerpt:

Research as a digital enterprise has created new, often poorly addressed challenges for the management and curation of research to ensure continuity, transparency, and accountability. There is a common misunderstanding that curation can be considered at a later point in the research cycle or delegated or that it is too burdensome or too expensive due to a lack of efficient tools. This creates a curation gap between research practice and curation needs. We argue that this gap can be narrowed if curators provide attractive support that befits research needs and if researchers consistently manage their work according to generic concepts consistently from the beginning. A rather uniquely long-term case study demonstrates how such concepts have helped to pragmatically implement a research practice intentionally using only minimalist tools for sustained, self-contained archiving since 1989. The paper sketches the concepts underlying three core research activities. (i) handling of research data, (ii) reference management as part of scholarly publishing, and (iii) advancing theories through modelling and simulation. These concepts represent a universally transferable best research practice, while technical details are obviously prone to continuous change. We hope it stimulates researchers to manage research similarly and that curators gain a better understanding of the curation challenges research practice actually faces.

"The Academic Data Librarian Profession in Canada: History and Future Directions"

S. Vincent Gray and Elizabeth Hill have self-archived "The Academic Data Librarian Profession in Canada: History and Future Directions."

Here's an excerpt:

From the 1970s onward, Canadians have been active in developing services and establishing structures to support the dissemination of data. In recent years the academic data profession in Canada has largely developed around access to data from the national statistics agency, Statistics Canada, and around the services which have been developed to permit access to these data. This chapter will provide a historical background for these activities and explain how current and emerging trends continue to affect the profession.

Research Data Curation Bibliography, Version 6. Over 560 works. Over 200 works added. Live links. Selected abstracts. OA. CC-BY License. Covers topics such as research data creation, acquisition, metadata, repositories, provenance, management, policies, support services, funding agency requirements, peer review, publication, citation, sharing, reuse, and preservation.

"Scholarly Communication and Data"

Hailey Mooney has self-archived "Scholarly Communication and Data."

Here's an excerpt:

The purpose of this chapter is to provide foundational knowledge for the data librarian by developing an understanding of the place of data within the current paradigm of networked digital scholarly communication. This includes defining the nature of data and data publications, examining the open science movement and its effects on data sharing, and delving into the challenges inherent to the wider integration of data into the scholarly communication system and the academic library.

Preserving Transactional Data

The Digital Preservation Coalition, UK Data Service, and Charles Beagrie Ltd. have released Preserving Transactional Data.

Here's an excerpt from the announcement:

This report tackles the requirements for preserving transactional data and the accompanying challenges facing companies and institutions that aim to re-use these data for analysis or research, presenting the issues and strategies which emphasize preservation practices that facilitate re-use and reproducibility.

"Revisiting the Data Lifecycle with Big Data Curation"

Line Pouchard has published "Revisiting the Data Lifecycle with Big Data Curation" in the International Journal of Digital Curation.

Here's an excerpt:

As science becomes more data-intensive and collaborative, researchers increasingly use larger and more complex data to answer research questions. The capacity of storage infrastructure, the increased sophistication and deployment of sensors, the ubiquitous availability of computer clusters, the development of new analysis techniques, and larger collaborations allow researchers to address grand societal challenges in a way that is unprecedented. In parallel, research data repositories have been built to host research data in response to the requirements of sponsors that research data be publicly available. Libraries are re-inventing themselves to respond to a growing demand to manage, store, curate and preserve the data produced in the course of publicly funded research. As librarians and data managers are developing the tools and knowledge they need to meet these new expectations, they inevitably encounter conversations around Big Data. This paper explores definitions of Big Data that have coalesced in the last decade around four commonly mentioned characteristics: volume, variety, velocity, and veracity. We highlight the issues associated with each characteristic, particularly their impact on data management and curation. We use the methodological framework of the data life cycle model, assessing two models developed in the context of Big Data projects and find them lacking. We propose a Big Data life cycle model that includes activities focused on Big Data and more closely integrates curation with the research life cycle. These activities include planning, acquiring, preparing, analyzing, preserving, and discovering, with describing the data and assuring quality being an integral part of each activity. We discuss the relationship between institutional data curation repositories and new long-term data resources associated with high performance computing centers, and reproducibility in computational science. 
We apply this model by mapping the four characteristics of Big Data outlined above to each of the activities in the model. This mapping produces a set of questions that practitioners should be asking in a Big Data project.
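
The mapping described in the excerpt can be pictured as a grid that pairs each life cycle activity with each Big Data characteristic. The sketch below is only an illustration of that structure: the activity and characteristic names come from the excerpt, while the function names and the question wording are hypothetical and are not taken from the paper.

```python
from itertools import product

# Names taken from the excerpt above.
ACTIVITIES = ["planning", "acquiring", "preparing",
              "analyzing", "preserving", "discovering"]
CHARACTERISTICS = ["volume", "variety", "velocity", "veracity"]


def question_grid():
    """Return every (activity, characteristic) pair, activity-major order."""
    return list(product(ACTIVITIES, CHARACTERISTICS))


def placeholder_question(activity, characteristic):
    """Hypothetical wording; the paper's actual questions are not reproduced here."""
    return f"How does data {characteristic} affect the {activity} activity?"


if __name__ == "__main__":
    for activity, characteristic in question_grid():
        print(placeholder_question(activity, characteristic))
```

Six activities crossed with four characteristics give 24 cells, each of which would hold one or more practitioner questions.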

The article is under a Creative Commons Attribution 2.0 UK: England & Wales License.

Research Data Curation Bibliography, Version 6

Digital Scholarship has released Version 6 of the Research Data Curation Bibliography. This selective bibliography includes over 560 English-language articles, books, and technical reports that are useful in understanding the curation of digital research data in academic and other research institutions. Over 200 new works have been added to the bibliography since version five.

The Research Data Curation Bibliography covers topics such as research data creation, acquisition, metadata, repositories, provenance, management, policies, support services, funding agency requirements, peer review, publication, citation, sharing, reuse, and preservation.

Most sources have been published from January 2009 through May 2016; however, a limited number of earlier key sources are also included. The bibliography includes links to freely available versions of included works. If such versions are unavailable, links to the publishers' descriptions are provided.

Abstracts are included in this bibliography if a work is under a Creative Commons Attribution License (BY and national/international variations), a Creative Commons public domain dedication (CC0), or a Creative Commons Public Domain Mark and this is clearly indicated in the work.

The Research Data Curation Bibliography is under a Creative Commons Attribution 4.0 International License.

Digital Curation and Digital Stewardship Certificate Programs

The following universities offer digital curation and digital stewardship certificate programs:

This digital preservation certificate program may also be of interest:

Report of the Summit on Digital Curation in Art Museums

Johns Hopkins University has released the Report of the Summit on Digital Curation in Art Museums.

Here's an excerpt:

In October of 2015, Johns Hopkins University (JHU) Museum Studies Program convened a group of cultural heritage professionals to discuss digital curation, its integration into the art museum community, and the role the JHU Program in Digital Curation might play in this effort. Attendees included representatives from museums, libraries, archives, foundations, and the JHU Museum Studies Program.

The meeting took place over two days. The first day and a half included a series of short presentations that addressed innovative projects; infrastructure, staffing and workflows; digital curation tools; curatorial considerations; internships, residencies and research opportunities; and local and international collaborations. . . .

Breakout sessions on the last afternoon moved the discussions from conceptual to pragmatic.

See also: Storified Tweets from Summit.

"How to Party Like it’s 1999: Emulation for Everyone"

Dianne Dietrich, Julia Kim, Morgan McKeehan, and Alison Rhonemus have published "How to Party Like it's 1999: Emulation for Everyone" in the Code4Lib Journal.

Here's an excerpt:

Emulated access of complex media has long been discussed, but there are very few instances in which complex, interactive, born-digital emulations are available to researchers. New York Public Library has made 1980-90's era video games from 5.25" floppy disks in the Timothy Leary Papers accessible via a DosBox emulator. These games appear in various stages of development and display the work of at least four of Leary's collaborators on the games. 56 disk images from the Leary Papers are currently emulated in the reading room. New York University has made late 1990s-mid 2000's era Photoshop files from the Jeremy Blake Papers accessible to researchers. The Blake Papers include over 300 pieces of media. Cornell University Library was awarded a grant from the NEH to analyze approximately 100 born-digital artworks created for CD-ROM from the Rose Goldsen Archive of New Media Art to develop preservation workflows, access strategies, and metadata frameworks. Rhizome has undertaken a number of emulation projects as a major part of its preservation strategy for born-digital artworks. In cooperation with the University of Freiburg in Germany, Rhizome recently restored several digital artworks for public access using a cloud-based emulation framework. This framework (bwFLA) has been designed to facilitate the reenactments of software on a large scale, for internal use or public access. This paper will guide readers through how to implement emulation. Each of the institutions weigh in on oddities and idiosyncrasies they encountered throughout the process—from accession to access.

"Calculating All that Jazz: Accurately Predicting Digital Storage Needs Utilizing Digitization Parameters for Analog Audio and Still Image Files"

Krista White has published "Calculating All that Jazz: Accurately Predicting Digital Storage Needs Utilizing Digitization Parameters for Analog Audio and Still Image Files" in Library Resources & Technical Services.

Here's an excerpt:

Much has been written about digitization projects over the last two decades; digital storage has been highlighted as a central feature of any digitization project, especially the need to purchase additional storage mechanisms to house digitized collections. What is missing from the library science literature is a method for reliably calculating digital storage needs on the basis of parameters for digitizing analog materials such as documents, photographs, and sound recordings in older formats.
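
To give a sense of the kind of calculation the excerpt refers to, here is a minimal sketch using the standard arithmetic for uncompressed files. It is not the article's own method: the function names and the example digitization parameters are assumptions chosen for illustration, and the results ignore container headers, embedded metadata, and compression.

```python
def pcm_audio_bytes(sample_rate_hz: int, bit_depth: int,
                    channels: int, duration_s: int) -> int:
    """Payload size in bytes of uncompressed PCM audio (no file header)."""
    return sample_rate_hz * bit_depth * channels * duration_s // 8


def raster_image_bytes(width_in: float, height_in: float,
                       ppi: int, bits_per_pixel: int) -> int:
    """Payload size in bytes of an uncompressed raster scan of a flat original."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Assumed parameters: one hour of 96 kHz, 24-bit stereo audio.
    audio = pcm_audio_bytes(96_000, 24, 2, 3_600)
    # Assumed parameters: an 8 x 10 inch photograph at 600 ppi, 24-bit color.
    image = raster_image_bytes(8, 10, 600, 24)
    print(f"audio: {audio / 1e9:.2f} GB per hour")
    print(f"image: {image / 1e6:.1f} MB per scan")
```

Multiplying the per-item figures by the number of items in a collection, and by the number of stored copies and derivative formats, gives a first estimate of total storage needs.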

"Fulfill Your Digital Preservation Goals with a Budget Studio"

Yongli Zhou has published "Fulfill Your Digital Preservation Goals with a Budget Studio" in Information Technology and Libraries.

Here's an excerpt:

In order to fulfill digital preservation goals, many institutions use high-end scanners for in-house scanning of historical print and oversize materials. However, high-end scanners' prices do not fit in many small institutions' budget. As digital single-lens reflex (DSLR) camera technologies advance and camera prices drop quickly, a budget photography studio can help to achieve institutions' preservation goals. This paper compares images delivered by a high-end overhead scanner and a consumer level DSLR camera, discusses pros and cons of using each method, demonstrates how to set up a cost efficient shooting studio, and presents a budget estimate for a studio.

"Migrating 2 and 3D Datasets: Preserving AutoCAD at the Archaeology Data Service"

Katie Green, Kieron Niven, and Georgina Field have published "Migrating 2 and 3D Datasets: Preserving AutoCAD at the Archaeology Data Service" in the ISPRS International Journal of Geo-Information.

Here's an excerpt:

The lessons learnt during the largescale CAD migration process presented in this paper provide an important insight into the digital preservation component of Research Data Management practice.

While the overall migration process presented in this paper was not a strict migration according to the OAIS model and in many cases essentially involved "re-archiving" data, the exercise itself was necessary for the long-term preservation of the data and was undertaken in such a way as to achieve the best possible outcome for both the ADS and data consumers. While elements of the process were both laborious and time consuming (and therefore costly), as a result of having to reassess original files in the SIP, this highlights the benefits of normalizing data at the point of ingest and the production of homogenous AIPs to stable, reliable standards and formats, reaffirming the importance of professional Research Data Management and preservation practices.
