Long-Term Sustainability of Data Archives: EUDAT Sustainability Plan

The EUDAT project has released the EUDAT Sustainability Plan.

Here's an excerpt:

We survey the current provision of infrastructure and long-term data archival services in Europe and review recent efforts to assess the costs involved in preserving research data (Chapters 1 to 4). To focus and constrain sustainability planning, we introduce a number of candidate guiding principles for EUDAT (Chapter 5) and suggest an overall logical model of its future shape, and a number of possible mechanisms for realising this model (Chapter 6). We discuss possible mechanisms to define levels of service and provide funding for a future EUDAT CDI, and introduce our intent to measure actual costs of delivering EUDAT services through an activity-based cost modelling exercise (Chapters 7 and 8).


| Digital Scholarship | Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works |

Status and Outlook for University of Michigan Research Profile Data Strategy

Natsuko Nicholls has self-archived Status and Outlook for University of Michigan Research Profile Data Strategy in Deep Blue.

Here's an excerpt:

My investigation into various faculty expertise efforts and activities across institutions shows that many universities have not yet developed or adopted a centralized, comprehensive university-wide system for expertise data collection and activity reporting. There is still substantial variation in procedures across departments and colleges within institutions and considerable duplication of effort across campus units. However, it is indeed the recent trend that many institutions—including the University of Michigan—have actively engaged in campus-wide discussions about research profile data curation needs, concluding that a more centralized system would provide incentives for timely data-entry, guarantee currency of the expertise data, and increase overall efficiency and data quality. This study also sheds light on the role of the academic library as an important stakeholder in expertise data collection and management. My findings suggest that various attributes of an academic library make it an ideal driver for research profile data management. The academic library is a strong resource for information technology expertise as well as information management and dissemination at any institution. Further, it tends to be a neutral and trusted entity, especially with employees who regularly engage with researchers and have a good understanding of the academic landscape and the needs of the research community. In addition to providing an overview of the research landscape where profiling needs are quickly rising and where benefits from a well-managed profile data system are widely understood, this study also illuminates the conventional use of expertise databases and research networking/discovery tools as well as Current Research Information Systems (CRIS).

| Research Data Curation Bibliography | Digital Scholarship |

"Where Have All the Games Gone? An Exploratory Study of Digital Game Preservation"

Joanna Barwick has self-archived her doctoral thesis, "Where Have All the Games Gone? An Exploratory Study of Digital Game Preservation," in the Loughborough University Institutional Repository.

Here's an excerpt:

Investigating the relationship of games to culture; reviewing current preservation activities and drawing conclusions about the value of digital games and the significance of their preservation were the study's objectives. These have been achieved through interviews with key stakeholders—the academic community, as potential users of collections; memory institutions, as potential keepers of collections; fan-based game preservation experts; and representatives from the games industry. In addition to this, case studies of key game preservation activities were explored.


"Digital Curation in the Academic Library Job Market"

Jeonghyun Kim, Edward Warga, and William Moen have published "Digital Curation in the Academic Library Job Market" in ASIST 2012: Proceedings of the 75th ASIS&T Annual Meeting.

Here's an excerpt:

This study of job advertisements for academic library positions is one activity of a current capacity building project, Information: Curate, Archive, Manage, Preserve (iCAMP). In this project, we are developing a four-course masters level curriculum for digital curation and data management. It deploys a competency-based curriculum approach (Moen, Kim, Warga, Wakefield, & Halbert, 2011). This analysis of job advertisements was carried out to identify and define knowledge, skills, and abilities as a part of the competency development process.


"Context and Its Role in the Digital Preservation of Cultural Objects"

Joan E. Beaudoin has published "Context and Its Role in the Digital Preservation of Cultural Objects" in the latest issue of D-Lib Magazine.

Here's an excerpt:

In discussions surrounding digital preservation, context—those properties of an object related to its creation and preservation that make the object's origins, composition, and purpose clear—has been identified as a critical aspect of preservation metadata. Understanding a cultural object's context, in as much detail as possible, is necessary to the successful future use of that object, regardless of its form. The necessity of capturing data about the creation of digital resources and the technical details of the preservation process, has generally been agreed. Capturing many other contextual aspects—such as utility, history, curation, authenticity—that would certainly contribute to successful retrieval, assessment, management, access, and use of preserved digital content, has not been adequately addressed or codified. Recording these aspects of contextual information is especially important for physical objects that are digitally preserved, and thereby removed from their original setting. This paper investigates the various discussions in the literature surrounding contextual information, and then presents a framework which makes explicit the various dimensions of context which have been identified as useful for digital preservation efforts, and offers a way to ensure the capture of those aspects of an object's context that are often missed.


UNC at Chapel Hill Offers Post-Masters Certificate in Data Curation

The School of Information and Library Science at the University of North Carolina at Chapel Hill is now offering a Post-Masters Certificate in Data Curation.

Here's an excerpt from the announcement:

With a two-week intensive kick-off on the UNC at Chapel Hill campus during summer session (May 2013), the remainder of the program will be taught online and includes guided projects that arise from a student's work experience. The 30 credit program can be completed in two years.

As defined by Drs. Helen Tibbo, alumni distinguished professor, and Christopher (Cal) Lee, associate professor at SILS, "Digital/data curation involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations; and, for some materials, a commitment to long-term preservation. Digital/data curation is stewardship that provides for the reproducibility and re-use of authentic digital data and other digital assets. Development of trustworthy and durable digital repositories; principles of sound metadata creation and capture; use of open standards for file formats and data encoding; and the promotion of information management literacy are all essential to the longevity of digital resources and the success of curation efforts."


DuraSpace Gets $861,000 Grant to Develop DuraCloud Data Services

DuraSpace has received a two-year $861,000 grant from the Gordon and Betty Moore Foundation to develop DuraCloud data services.

Here's an excerpt from the press release:

Currently, DuraCloud provides a reliable way to preserve and archive research materials in the cloud, a solution developed within the academic community for academic institutions. During the next phase of DuraCloud development, additional applications, features, and services will be built to extend the cloud in order to facilitate data archiving and content management. DuraSpace offers DuraCloud as a software as a service that enables archiving, preserving, and managing institutional content using cloud storage and intends to expand its service offerings in the next phase of development.


Curating for Quality: Ensuring Data Quality to Enable New Science

The UNC School of Information & Library Science has released Curating for Quality: Ensuring Data Quality to Enable New Science.

Here's an excerpt:

The National Science Foundation sponsored a workshop on September 10 and 11, 2012, in Arlington, Virginia on "Curating for Quality: Ensuring Data Quality to Enable New Science." Individuals from government, academic and industry settings gathered to discuss issues, strategies and priorities for ensuring quality in collections of data. This workshop aimed to define data quality research issues and potential solutions. The workshop objectives were organized into four clusters: (1) data quality criteria and contexts, (2) human and institutional factors, (3) tools for effective and painless curation, and (4) metrics for data quality. . . .

The workshop identified several key challenges that include:

  • selection strategies—how to determine what is most valuable to preserve
  • how much and which context to include—how to ensure that data is interpretable and usable in the future, what metadata to include
  • tools and techniques to support painless curation—creating and sharing tools and techniques that apply across disciplines
  • cost and accountability models—how to balance selection, context decisions with cost constraints.

| Digital Curation Bibliography: Preservation and Stewardship of Scholarly Works | Digital Scholarship |

Intellectual Property Rights for Digital Preservation

The Digital Preservation Coalition has released Intellectual Property Rights for Digital Preservation.

Here's an excerpt:

While a number of legal issues colour contemporary approaches to, and practices of, digital preservation, it is arguable that intellectual property law, represented principally by copyright and its related rights, has been by far the most dominant, and often intractable, influence. It is thus essential for those engaging in digital preservation to understand the letter of the law as it applies to digital preservation, but equally important to be able to identify and implement practical and pragmatic strategies for handling legal risks relating to intellectual property rights in the pursuit of preservation objectives. . . .

This report is aimed primarily at depositors, archivists and researchers/re-users of digital works, but will provide a concise introduction to the subject matter for policymakers and the general public.


Committee Formed to Examine National-Scale Higher Education Digital Projects

The Council on Library and Information Resources and Vanderbilt University have formed the Committee on Coherence at Scale for Higher Education to examine national-scale higher education digital projects.

Here's an excerpt from the press release:

The group, called the Committee on Coherence at Scale for Higher Education, comprises college and university presidents and provosts, deans, university librarians, and association heads. The committee will provide the leadership necessary to ensure that these projects are designed and developed as elements of a larger and encompassing digital environment. . . .

The committee will focus on research and analysis of the large projects and their correlation; initial costs, operating costs and business plans for sustainability; and benefits and transformational aspects. Examples of these projects include the Hathi Trust, the Digital Public Library of America, the Digital Preservation Network, and data curation centers. Results of the committee's work will be publicized regularly.

| Digital Curation Resource Guide | Digital Scholarship |

"A Sample of Research Data Curation and Management Courses"

Andrew T. Creamer et al. have published "A Sample of Research Data Curation and Management Courses" in the latest issue of the Journal of eScience Librarianship.

Here's an excerpt:

This paper identifies a sample of research data curation and management courses available at American Library Association-accredited Library and Information Science (LIS) Programs in North America. . . .

Only 13 (22%) of LIS programs currently offer a course focused on the management and curation of research data. . . .

Although the literature supports LIS professionals adopting new roles and engaging in eScience and data management, most LIS data-related programs do not have a separate course solely focused on research data management. More LIS programs will need to adapt their curricula in order to help students and practicing professionals develop the needed competencies in research data curation and management.


California Digital Library and Partners Launch DataUp Data Management Tool

The California Digital Library and its partners have launched the DataUp data management tool.

Here's an excerpt from the press release:

Researchers struggling to meet new data management requirements from funders, journals and their own institutions now can use the DataUp Web application and a Microsoft Excel add-in to document and archive their tabular data. . . .

The DataUp add-in operates within a program many researchers already use: Microsoft Excel. The Web application allows users to upload tabular data in either Excel format or comma-separated value (CSV) format. Both the add-in and the Web application allow users to:

  • Perform a "best practices check" to ensure data are well-formatted and organized
  • Create standardized metadata, or a description of the data, using a wizard-style template
  • Retrieve a unique identifier for their dataset from their data repository
  • Post their datasets and associated metadata to the repository.

Although hundreds of data repositories are available for archiving, many scientific researchers either are unaware of their existence or do not know how to access them. One of the major outcomes of the DataUp project is the ONEShare repository, created specifically for DataUp, where users can deposit tabular data and metadata directly from the tool.

An added advantage of ONEShare is its connection to the DataONE network of repositories. DataONE links existing data centers and enables users to search for data across participating repositories by using a single search interface. Data deposited into ONEShare will be indexed and made available by any DataONE user, facilitating collaboration and enabling data re-use.


"LOCKSS Boxes in the Cloud"

David S. H. Rosenthal and Daniel L. Vargas have self-archived "LOCKSS Boxes in the Cloud" at the LOCKSS website.

Here's an excerpt:

The 30-year history of raw disk costs shows a drop of at least 30% per year. The history of cloud storage costs from commercial providers shows that they drop at most 3% per year. Until there is a radical change in one or other of these cost curves it is clear that cloud storage is not even close to cost-competitive with local disk storage for long-term preservation purposes in general, and LOCKSS boxes in particular.
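To see how quickly the two cost curves in the excerpt diverge, here is a minimal arithmetic sketch. It is not taken from the paper: it simply compounds the two quoted rates (30% and 3% per year, which the excerpt gives as bounds rather than exact figures) and assumes, purely for illustration, that both kinds of storage start at the same unit price. The function names are hypothetical.

```python
def relative_cost(annual_drop: float, years: int) -> float:
    """Fraction of the starting price left after compounding a yearly drop."""
    return (1.0 - annual_drop) ** years


def compare_curves(years=(1, 5, 10)):
    """Return (year, disk, cloud, cloud/disk) rows for the two quoted rates.

    Assumption: both curves start at the same price, so only the
    ratio between them is meaningful, not the absolute values.
    """
    rows = []
    for y in years:
        disk = relative_cost(0.30, y)   # raw disk: 30% drop per year
        cloud = relative_cost(0.03, y)  # cloud storage: 3% drop per year
        rows.append((y, disk, cloud, cloud / disk))
    return rows


if __name__ == "__main__":
    for y, disk, cloud, ratio in compare_curves():
        print(f"year {y:2d}: disk {disk:.3f}  cloud {cloud:.3f}  "
              f"cloud/disk {ratio:.1f}x")
```

Under these assumptions the gap widens every year, because the ratio between the two prices itself grows by a constant factor (0.97 / 0.70) annually.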

| Digital Curation and Preservation Bibliography 2010 | Digital Scholarship |

"Academic Libraries as Data Quality Hubs"

Michael Joseph Giarlo has self-archived a preprint of "Academic Libraries as Data Quality Hubs" in ScholarSphere.

Here's an excerpt:

This position paper argues that academic libraries have a critical role to play serving as data quality hubs on campus, based on the need for increased data quality for "e-science" and on academic libraries' record of providing digital curation and preservation services. Scientific data are shown to be sufficiently at risk to demonstrate a clear niche for such services to be provided. Data quality measurements are defined, and digital curation processes are explained and mapped to these measurements in order to establish that academic libraries already have sufficient competencies "in-house" to provide data quality services. Opportunities for improvement and challenges are identified as areas that are fruitful for future research and exploration.


"The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation"

Matthew S. Mayernik, G. Sayeed Choudhury, Tim DiLauro, Elliot Metsger, Barbara Pralle, Mike Rippin, and Ruth Duerr have published "The Data Conservancy Instance: Infrastructure and Organizational Services for Research Data Curation" in the latest issue of D-Lib Magazine.

Here's an excerpt:

Digital research data can only be managed and preserved over time through a sustained institutional commitment. Research data curation is a multi-faceted issue, requiring technologies, organizational structures, and human knowledge and skills to come together in complementary ways. This article provides a high-level description of the Data Conservancy Instance, an implementation of infrastructure and organizational services for data collection, storage, preservation, archiving, curation, and sharing. While comparable to institutional repository systems and disciplinary data repositories in some aspects, the DC Instance is distinguished by featuring a data-centric architecture, discipline-agnostic data model, and a data feature extraction framework that facilitates data integration and cross-disciplinary queries. The Data Conservancy Instance is intended to support, and be supported by, a skilled data curation staff, and to facilitate technical, financial, and human sustainability of organizational data curation services. The Johns Hopkins University Data Management Services (JHU DMS) are described as an example of how the Data Conservancy Instance can be deployed.


Digital Preservation: Swatting the Long Tail of Digital Media: A Call for Collaboration

OCLC Research has released Swatting the Long Tail of Digital Media: A Call for Collaboration.

Here's an excerpt:

It is difficult to do much with digital media unless you can read its content and transfer that content to more stable media. Few institutions can be expected to manage all media types. In order to make real progress in preserving and providing access to born-digital content, libraries and archives need to leverage specialized resources and expertise across the community. In this paper I posit the need for SWAT (software and workstations for antiquated technology) sites: organizations or institutions that are willing to put their expertise to use for the benefit of the broader community by providing specialized services to institutions with limited resources.


Digital Curation and the Cloud: Final Report

JISC has released Digital Curation and the Cloud: Final Report. This is a revised version of the draft report that was released earlier this year.

Here's an excerpt:

Digital curation involves a wide range of activities, many of which may be suitable for deployment within a cloud environment. These range from infrequent, resource-intensive tasks which will benefit from the ability to rapidly provision resources, to day-to-day collaborative activities which can be facilitated by networked cloud services. Associated benefits are offset by risks such as loss of data or service level, legal and governance incompatibilities and transfer bottlenecks. There is considerable variability across both risks and benefits according to the service and deployment models being adopted and the context in which activities are performed. Some risks, such as legal liabilities, are mitigated by the use of alternatives, for example, private cloud models, but this is typically at the expense of benefits such as resource elasticity and economies of scale.


Key Digital Preservation Standard Updated: Open Archival Information System (OAIS)

ISO has published ISO 14721:2012: Space Data and Information Transfer Systems—Open Archival Information System (OAIS)—Reference Model. A PDF version with marked changes is available from the Consultative Committee for Space Data Systems.

Here's an excerpt:

This reference model:

  • provides a framework for the understanding and increased awareness of archival concepts needed for Long Term digital information preservation and access;
  • provides the concepts needed by non-archival organizations to be effective participants in the preservation process;
  • provides a framework, including terminology and concepts, for describing and comparing architectures and operations of existing and future Archives;
  • provides a framework for describing and comparing different Long Term Preservation strategies and techniques;
  • provides a basis for comparing the data models of digital information preserved by Archives and for discussing how data models and the underlying information may change over time;
  • provides a framework that may be expanded by other efforts to cover Long Term Preservation of information that is NOT in digital form (e.g., physical media and physical samples);


You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media

OCLC Research has released You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media.

Here's an excerpt from the announcement:

You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media is intended for anyone who doesn't know where to begin in managing born-digital materials. It errs on the side of simplicity and describes what is truly necessary to start managing born-digital content on physical media, and it presents a list of the basic steps without expanding on archival theory or the use of particular software tools. It does not assume that policies are in place or that those performing the tasks are familiar with traditional archival practices, nor does it assume that significant IT support is available.

Read more about it at "Defining 'Born Digital': An Essay by Ricky Erway, OCLC Research."


Best Practices for Citability of Data and Evolving Roles in Scholarly Communication

Opportunities for Data Exchange has released Best Practices for Citability of Data and Evolving Roles in Scholarly Communication.

Here's an excerpt:

This report sets out the current thinking on data citation best practice and presents the results of a survey of librarians asking how new support roles could and should be developed. The findings presented here build on the extensive desk research carried out for the report "Integration of Data and Publication" (Reilly, Schallier, Schrimpf, Smit, & Wilkinson, Sept 2011), which identified that data citation was an area of opportunity for both researchers and libraries. That report also recounted the findings of a workshop held at the LIBER 2011 Conference in Barcelona. . . . This previous work is supported here with further information gathered through extensive desk research, structured interviews and an online survey of LIBER members to explore best practice in data citation and evolving support roles for libraries.


Sharing Research Data: Compilation of Results on Drivers and Barriers and New Opportunities

Opportunities for Data Exchange has released Compilation of Results on Drivers and Barriers and New Opportunities.

Here's an excerpt:

Opportunities for Data Exchange (ODE) is a FP7 Project carried out by members of the Alliance for Permanent Access (APA), which is gathering evidence to support strategic investment in the emerging e-Infrastructure for data sharing, re-use and preservation. The ODE Conceptual Model has been developed within the Project to characterise the process of data sharing and the factors which give rise to variations in data sharing for different parties involved. Within the overall Conceptual Model there can be identified models of process, of context, and of drivers, barriers and enablers. The Conceptual Model has been evolved on the basis of existing knowledge and expertise, and draws on research conducted both outside of the ODE Project and in earlier stages of the Project itself (Sections 1-2).


Digital Preservation: SiteStory Released

Herbert van de Sompel has announced the release of SiteStory.

Here's an excerpt:

I am very pleased to announce the open source release of our SiteStory transactional web archiving solution. The solution is compatible with the Memento "Time Travel for the Web" framework and its current implementation can be used to archive Apache web servers.

Read more about it at Memento: Adding Time to the Web.
