UNC at Chapel Hill Offers Post-Masters Certificate in Data Curation

The School of Information and Library Science at the University of North Carolina at Chapel Hill is now offering a Post-Masters Certificate in Data Curation.

Here's an excerpt from the announcement:

With a two-week intensive kick-off on the UNC at Chapel Hill campus during summer session (May 2013), the remainder of the program will be taught online and includes guided projects that arise from a student's work experience. The 30 credit program can be completed in two years.

As defined by Drs. Helen Tibbo, alumni distinguished professor, and Christopher (Cal) Lee, associate professor at SILS, "Digital/data curation involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations; and, for some materials, a commitment to long-term preservation. Digital/data curation is stewardship that provides for the reproducibility and re-use of authentic digital data and other digital assets. Development of trustworthy and durable digital repositories; principles of sound metadata creation and capture; use of open standards for file formats and data encoding; and the promotion of information management literacy are all essential to the longevity of digital resources and the success of curation efforts."

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

DuraSpace Gets $861,000 Grant to Develop DuraCloud Data Services

DuraSpace has received a two-year $861,000 grant from the Gordon and Betty Moore Foundation to develop DuraCloud data services.

Here's an excerpt from the press release:

Currently, DuraCloud provides a reliable way to preserve and archive research materials in the cloud, a solution developed within the academic community for academic institutions. During the next phase of DuraCloud development, additional applications, features, and services will be built to extend the cloud in order to facilitate data archiving and content management. DuraSpace offers DuraCloud as a software as a service that enables archiving, preserving, and managing institutional content using cloud storage and intends to expand its service offerings in the next phase of development.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Curating for Quality: Ensuring Data Quality to Enable New Science

The UNC School of Information & Library Science has released Curating for Quality: Ensuring Data Quality to Enable New Science.

Here's an excerpt:

The National Science Foundation sponsored a workshop on September 10 and 11, 2012, in Arlington, Virginia on "Curating for Quality: Ensuring Data Quality to Enable New Science." Individuals from government, academic and industry settings gathered to discuss issues, strategies and priorities for ensuring quality in collections of data. This workshop aimed to define data quality research issues and potential solutions. The workshop objectives were organized into four clusters: (1) data quality criteria and contexts, (2) human and institutional factors, (3) tools for effective and painless curation, and (4) metrics for data quality. . . .

The workshop identified several key challenges that include:

  • selection strategies—how to determine what is most valuable to preserve
  • how much and which context to include—how to ensure that data is interpretable and usable in the future, what metadata to include
  • tools and techniques to support painless curation—creating and sharing tools and techniques that apply across disciplines
  • cost and accountability models—how to balance selection, context decisions with cost constraints.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Intellectual Property Rights for Digital Preservation

The Digital Preservation Coalition has released Intellectual Property Rights for Digital Preservation.

Here's an excerpt:

While a number of legal issues colour contemporary approaches to, and practices of, digital preservation, it is arguable that intellectual property law, represented principally by copyright and its related rights, has been by far the most dominant, and often intractable, influence. It is thus essential for those engaging in digital preservation to understand the letter of the law as it applies to digital preservation, but equally important to be able to identify and implement practical and pragmatic strategies for handling legal risks relating to intellectual property rights in the pursuit of preservation objectives. . . .

This report is aimed primarily at depositors, archivists and researchers/re-users of digital works, but will provide a concise introduction to the subject matter for policymakers and the general public.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Committee Formed to Examine National-Scale Higher Education Digital Projects

The Council on Library and Information Resources and Vanderbilt University have formed the Committee on Coherence at Scale for Higher Education to examine national-scale higher education digital projects.

Here's an excerpt from the press release:

The group, called the Committee on Coherence at Scale for Higher Education, comprises college and university presidents and provosts, deans, university librarians, and association heads. The committee will provide the leadership necessary to ensure that these projects are designed and developed as elements of a larger and encompassing digital environment. . . .

The committee will focus on research and analysis of the large projects and their correlation; initial costs, operating costs and business plans for sustainability; and benefits and transformational aspects. Examples of these projects include the Hathi Trust, the Digital Public Library of America, the Digital Preservation Network, and data curation centers. Results of the committee's work will be publicized regularly.

| Digital Curation Resource Guide | Digital Scholarship |

"A Sample of Research Data Curation and Management Courses"

Andrew T. Creamer et al. have published "A Sample of Research Data Curation and Management Courses" in the latest issue of the Journal of eScience Librarianship.

Here's an excerpt:

This paper identifies a sample of research data curation and management courses available at American Library Association-accredited Library and Information Science (LIS) Programs in North America. . . .

Only 13 (22%) of LIS programs currently offer a course focused on the management and curation of research data. . . .

Although the literature supports LIS professionals adopting new roles and engaging in eScience and data management, most LIS data-related programs do not have a separate course solely focused on research data management. More LIS programs will need to adapt their curricula in order to help students and practicing professionals develop the needed competencies in research data curation and management.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

California Digital Library and Partners Launch DataUp Data Management Tool

The California Digital Library and its partners have launched the DataUp data management tool.

Here's an excerpt from the press release:

Researchers struggling to meet new data management requirements from funders, journals and their own institutions now can use the DataUp Web application and a Microsoft Excel add-in to document and archive their tabular data. . . .

The DataUp add-in operates within a program many researchers already use: Microsoft Excel. The Web application allows users to upload tabular data in either Excel format or comma-separated value (CSV) format. Both the add-in and the Web application allow users to:

  • Perform a "best practices check" to ensure data are well-formatted and organized
  • Create standardized metadata, or a description of the data, using a wizard-style template
  • Retrieve a unique identifier for their dataset from their data repository
  • Post their datasets and associated metadata to the repository.

Although hundreds of data repositories are available for archiving, many scientific researchers either are unaware of their existence or do not know how to access them. One of the major outcomes of the DataUp project is the ONEShare repository, created specifically for DataUp, where users can deposit tabular data and metadata directly from the tool.

An added advantage of ONEShare is its connection to the DataONE network of repositories. DataONE links existing data centers and enables users to search for data across participating repositories by using a single search interface. Data deposited into ONEShare will be indexed and made available to any DataONE user, facilitating collaboration and enabling data re-use.

| Research Data Curation Bibliography | Digital Scholarship |

"LOCKSS Boxes in the Cloud"

David S. H. Rosenthal and Daniel L. Vargas have self-archived "LOCKSS Boxes in the Cloud" at the LOCKSS website.

Here's an excerpt:

The 30-year history of raw disk costs shows a drop of at least 30% per year. The history of cloud storage costs from commercial providers shows that they drop at most 3% per year. Until there is a radical change in one or other of these cost curves it is clear that cloud storage is not even close to cost-competitive with local disk storage for long-term preservation purposes in general, and LOCKSS boxes in particular.
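To see how quickly the two rates quoted in the excerpt diverge, the arithmetic can be sketched as compound decline. This is only an illustration: it assumes each rate stays constant from year to year and treats the excerpt's bounds ("at least 30%", "at most 3%") as exact figures. The function names are hypothetical and do not come from the paper.

```python
import math


def relative_cost(annual_drop: float, years: int) -> float:
    """Fraction of the starting price left after `years` of a constant annual drop."""
    return (1.0 - annual_drop) ** years


def years_to_halve(annual_drop: float) -> float:
    """Years needed for the price to fall to half, at a constant annual drop."""
    return math.log(0.5) / math.log(1.0 - annual_drop)


# Assumption: rates taken directly from the excerpt and held constant.
DISK_DROP = 0.30   # raw disk: "at least 30% per year"
CLOUD_DROP = 0.03  # cloud storage: "at most 3% per year"

if __name__ == "__main__":
    for label, rate in (("disk", DISK_DROP), ("cloud", CLOUD_DROP)):
        print(
            f"{label}: {relative_cost(rate, 10):.1%} of the starting price after "
            f"10 years; price halves in about {years_to_halve(rate):.1f} years"
        )
```

Under these assumptions, a decade leaves raw disk at under 3% of its starting price while cloud storage stays above 70%, which is the gap the authors point to.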

| Digital Curation and Preservation Bibliography 2010 | Digital Scholarship |

"Academic Libraries as Data Quality Hubs"

Michael Joseph Giarlo has self-archived a preprint of "Academic Libraries as Data Quality Hubs" in ScholarSphere.

Here's an excerpt:

This position paper argues that academic libraries have a critical role to play serving as data quality hubs on campus, based on the need for increased data quality for "e-science" and on academic libraries' record of providing digital curation and preservation services. Scientific data are shown to be sufficiently at risk to demonstrate a clear niche for such services to be provided. Data quality measurements are defined, and digital curation processes are explained and mapped to these measurements in order to establish that academic libraries already have sufficient competencies "in-house" to provide data quality services. Opportunities for improvement and challenges are identified as areas that are fruitful for future research and exploration.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

"The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation"

Matthew S. Mayernik, G. Sayeed Choudhury, Tim DiLauro, Elliot Metsger, Barbara Pralle, Mike Rippin, and Ruth Duerr have published "The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation" in the latest issue of D-LIB Magazine.

Here's an excerpt:

Digital research data can only be managed and preserved over time through a sustained institutional commitment. Research data curation is a multi-faceted issue, requiring technologies, organizational structures, and human knowledge and skills to come together in complementary ways. This article provides a high-level description of the Data Conservancy Instance, an implementation of infrastructure and organizational services for data collection, storage, preservation, archiving, curation, and sharing. While comparable to institutional repository systems and disciplinary data repositories in some aspects, the DC Instance is distinguished by featuring a data-centric architecture, discipline-agnostic data model, and a data feature extraction framework that facilitates data integration and cross-disciplinary queries. The Data Conservancy Instance is intended to support, and be supported by, a skilled data curation staff, and to facilitate technical, financial, and human sustainability of organizational data curation services. The Johns Hopkins University Data Management Services (JHU DMS) are described as an example of how the Data Conservancy Instance can be deployed.

| Digital Curation Resource Guide | Digital Scholarship |

Digital Preservation: Swatting the Long Tail of Digital Media: A Call for Collaboration

OCLC Research has released Swatting the Long Tail of Digital Media: A Call for Collaboration.

Here's an excerpt:

It is difficult to do much with digital media unless you can read its content and transfer that content to more stable media. Few institutions can be expected to manage all media types. In order to make real progress in preserving and providing access to born-digital content, libraries and archives need to leverage specialized resources and expertise across the community. In this paper I posit the need for SWAT (software and workstations for antiquated technology) sites: organizations or institutions that are willing to put their expertise to use for the benefit of the broader community by providing specialized services to institutions with limited resources.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Digital Curation and the Cloud: Final Report

JISC has released Digital Curation and the Cloud: Final Report. This is a revised version of the draft report that was released earlier this year.

Here's an excerpt:

Digital curation involves a wide range of activities, many of which may be suitable for deployment within a cloud environment. These range from infrequent, resource-intensive tasks which will benefit from the ability to rapidly provision resources, to day-to-day collaborative activities which can be facilitated by networked cloud services. Associated benefits are offset by risks such as loss of data or service level, legal and governance incompatibilities and transfer bottlenecks. There is considerable variability across both risks and benefits according to the service and deployment models being adopted and the context in which activities are performed. Some risks, such as legal liabilities, are mitigated by the use of alternatives, for example, private cloud models, but this is typically at the expense of benefits such as resource elasticity and economies of scale.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Key Digital Preservation Standard Updated: Open Archival Information System (OAIS)

ISO has published ISO 14721:2012: Space Data and Information Transfer Systems—Open Archival Information System (OAIS)—Reference Model. A PDF version with marked changes is available from the Consultative Committee for Space Data Systems.

Here's an excerpt:

This reference model:

  • provides a framework for the understanding and increased awareness of archival concepts needed for Long Term digital information preservation and access;
  • provides the concepts needed by non-archival organizations to be effective participants in the preservation process;
  • provides a framework, including terminology and concepts, for describing and comparing architectures and operations of existing and future Archives;
  • provides a framework for describing and comparing different Long Term Preservation strategies and techniques;
  • provides a basis for comparing the data models of digital information preserved by Archives and for discussing how data models and the underlying information may change over time;
  • provides a framework that may be expanded by other efforts to cover Long Term Preservation of information that is NOT in digital form (e.g., physical media and physical samples);

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media

OCLC Research has released You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media.

Here's an excerpt from the announcement:

You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media is intended for anyone who doesn't know where to begin in managing born-digital materials. It errs on the side of simplicity and describes what is truly necessary to start managing born-digital content on physical media, and it presents a list of the basic steps without expanding on archival theory or the use of particular software tools. It does not assume that policies are in place or that those performing the tasks are familiar with traditional archival practices, nor does it assume that significant IT support is available.

Read more about it at "Defining 'Born Digital': An Essay by Ricky Erway, OCLC Research."

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Best Practices for Citability of Data and Evolving Roles in Scholarly Communication

Opportunities for Data Exchange has released Best Practices for Citability of Data and Evolving Roles in Scholarly Communication.

Here's an excerpt:

This report sets out the current thinking on data citation best practice and presents the results of a survey of librarians asking how new support roles could and should be developed. The findings presented here build on the extensive desk research carried out for the report "Integration of Data and Publication" (Reilly, Schallier, Schrimpf, Smit, & Wilkinson, Sept 2011), which identified that data citation was an area of opportunity for both researchers and libraries. That report also recounted the findings of a workshop held at the LIBER 2011 Conference in Barcelona. . . . This previous work is supported here with further information gathered through extensive desk research, structured interviews and an online survey of LIBER members to explore best practice in data citation and evolving support roles for libraries.

| Research Data Curation Bibliography | Digital Scholarship |

Sharing Research Data: Compilation of Results on Drivers and Barriers and New Opportunities

Opportunities for Data Exchange has released Compilation of Results on Drivers and Barriers and New Opportunities.

Here's an excerpt:

Opportunities for Data Exchange (ODE) is an FP7 Project carried out by members of the Alliance for Permanent Access (APA), which is gathering evidence to support strategic investment in the emerging e-Infrastructure for data sharing, re-use and preservation. The ODE Conceptual Model has been developed within the Project to characterise the process of data sharing and the factors which give rise to variations in data sharing for different parties involved. Within the overall Conceptual Model there can be identified models of process, of context, and of drivers, barriers and enablers. The Conceptual Model has been evolved on the basis of existing knowledge and expertise, and draws on research conducted both outside of the ODE Project and in earlier stages of the Project itself (Sections 1-2).

| Research Data Curation Bibliography | Digital Scholarship |

Digital Preservation: SiteStory Released

Herbert van de Sompel has announced the release of SiteStory.

Here's an excerpt:

I am very pleased to announce the open source release of our SiteStory transactional web archiving solution. The solution is compatible with the Memento "Time Travel for the Web" framework and its current implementation can be used to archive Apache web servers.
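For readers unfamiliar with Memento, the framework (later specified in RFC 7089) lets a client ask for a past version of a web resource by sending an Accept-Datetime request header. The sketch below only builds that header value in the HTTP date format; it makes no network request and does not describe SiteStory's own interface. The function name is a hypothetical helper introduced for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_headers(when: datetime) -> dict:
    """Build request headers asking a Memento-aware server for the state at `when`.

    `when` must be timezone-aware; it is converted to UTC because HTTP dates
    are expressed in GMT.
    """
    if when.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = when.astimezone(timezone.utc)
    # usegmt=True renders the zone as "GMT", as HTTP date headers require.
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


if __name__ == "__main__":
    wanted = datetime(2012, 11, 1, 0, 0, 0, tzinfo=timezone.utc)
    print(accept_datetime_headers(wanted))
```

A Memento-aware server or TimeGate that receives such a header can respond with, or redirect to, the archived version closest to the requested time.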

Read more about it at Memento: Adding Time to the Web.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Minimum Digitization Capture Recommendations (Draft)

The Association for Library Collections and Technical Services Preservation and Reformatting Section has released a draft of its Minimum Digitization Capture Recommendations. The comment period ends on 12/31/2012.

Here's an excerpt:

This document was created as a guideline for libraries digitizing content with the objective of producing a product that will not be re-digitized at a later point. Institutions can feel secure that if an item has been digitized at, or above, these specifications, they can depend on it to continue to be viable in the future. These guidelines only speak to the technical specifications of the digitized content itself and not to the larger issue of digitally preserving said content. In some cases, institutions may want to request a digital copy to preserve themselves, further safeguarding materials by preserving them in multiple locations.

| Digital Curation Resource Guide | Digital Scholarship |

Aligning National Approaches to Digital Preservation

The Educopia Institute has released Aligning National Approaches to Digital Preservation.

Here's an excerpt:

On May 23-25, 2011, more than 125 delegates from more than 20 countries gathered in Tallinn, Estonia, for the "Aligning National Approaches to Digital Preservation" conference. . . .

This publication contains a collection of peer-reviewed essays that were developed by conference panels and attendees in the months following ANADP. Rather than simply chronicling the event, the volume intends to broaden and deepen its impact by reflecting on the ANADP presentations and conversations and establishing a set of starting points for building a greater alignment across digital preservation initiatives. Above all, it highlights the need for strategic international collaborations to support the preservation of our collective cultural memory.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

"De-Mystifying the Data Management Requirements of Research Funders"

Dianne Dietrich, Trisha Adamus, Alison Miner, and Gail Steinhart have published "De-Mystifying the Data Management Requirements of Research Funders" in the latest issue of Issues in Science and Technology Librarianship.

Here's an excerpt:

Research libraries have sought to apply their information management expertise to the management of digital research data. This focus has been spurred in part by the policies of two major funding agencies in the United States, which require grant recipients to make research outputs, including publications and research data, openly available. As many academic libraries are beginning to offer or are already offering assistance in writing and implementing data management plans, it is important to consider how best to support researchers. Our research examined the current data management requirements of major US funding agencies to better understand data management requirements facing researchers and the implications for libraries offering data management services for researchers.

| Research Data Curation Bibliography | Digital Scholarship |

Testing Software Tools of Potential Interest for Digital Preservation Activities at the National Library of Australia

The National Library of Australia has released Testing Software Tools of Potential Interest for Digital Preservation Activities at the National Library of Australia.

Here's an excerpt:

Four file format identification tools were tested: File Investigator Engine, Outside-In File ID, FIDO and file/libmagic. This represents a mix of commercial and open source tools. The results were analysed from the point of view of comparing the tools to determine the extent of coverage and the level of agreement between them.

Five metadata extraction tools were tested: File Investigator Engine, Exiftool, MediaInfo, pdfinfo and Apache Tika. The results were analysed in terms of the number and range of metadata items extracted for specific file subsets.
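The two measures named for the format identification tests, extent of coverage and level of agreement, can be sketched as follows. This is a minimal illustration and not the library's method: it assumes each tool's output has already been normalised to a common set of format labels, and the tool names, file names, and labels are invented placeholders rather than results from the report.

```python
from itertools import combinations


def coverage(results: dict, all_files: list) -> dict:
    """Fraction of `all_files` that each tool assigned any format label to."""
    wanted = set(all_files)
    return {tool: len(ids.keys() & wanted) / len(wanted) for tool, ids in results.items()}


def pairwise_agreement(results: dict) -> dict:
    """For each pair of tools, the share of commonly identified files with the same label."""
    agreement = {}
    for a, b in combinations(sorted(results), 2):
        shared = results[a].keys() & results[b].keys()
        if not shared:
            agreement[(a, b)] = None  # no overlap, so agreement is undefined
            continue
        same = sum(1 for f in shared if results[a][f] == results[b][f])
        agreement[(a, b)] = same / len(shared)
    return agreement


# Hypothetical, made-up data: a missing file means the tool did not identify it.
FILES = ["a.pdf", "b.doc", "c.tif", "d.xyz"]
SAMPLE = {
    "tool_a": {"a.pdf": "pdf", "b.doc": "msword", "c.tif": "tiff"},
    "tool_b": {"a.pdf": "pdf", "b.doc": "ole2"},
    "tool_c": {"a.pdf": "pdf", "c.tif": "tiff"},
}

if __name__ == "__main__":
    print(coverage(SAMPLE, FILES))
    print(pairwise_agreement(SAMPLE))
```

In practice the harder step is the one assumed away here: mapping each tool's own vocabulary of format names onto shared labels before any comparison is meaningful.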

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Digital Curation Resource Guide

Digital Scholarship has released the Digital Curation Resource Guide.

This resource guide presents over 200 selected English-language websites and documents that are useful in understanding and conducting digital curation. It covers academic programs, discussion lists and groups, glossaries, file formats and guidelines, metadata standards and vocabularies, models, organizations, policies, research data management, serials and blogs, services and vendor software, software and tools, and training. It is available under a Creative Commons Attribution-NonCommercial 3.0 Unported License.

The Digital Curation Resource Guide complements the Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works, which was released in June.

It is also available as an EPUB file (see How to Read EPUB Files).