"From Passive to Active, from Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?"

Maria J. Cruz et al. have published "From Passive to Active, from Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?" in the International Journal of Digital Curation.

Here's an excerpt:

Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) has provided a fully operational, cross-institutional, long-term archive since 2010, storing data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form)—a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.

Research Data Curation Bibliography, Version 9 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

Certification for Trustworthy Digital Repositories: "CoreTrustSeal: From Academic Collaboration to Sustainable Services"

Hervé L'Hours et al. have published "CoreTrustSeal: From Academic Collaboration to Sustainable Services" in IASSIST Quarterly.

Here's an excerpt:

National and international digital repositories must design and deliver sustainable services as a foundation for a range of scientific and data management infrastructures while reducing costs and avoiding duplication of effort. The CoreTrustSeal, launched in 2017, defines requirements and offers core level certification for Trustworthy Digital Repositories (TDR) holding data for long-term preservation. This paper traces the journey of the CoreTrustSeal through the Data Seal of Approval (DSA), ICSU World Data System (WDS), Research Data Alliance (RDA) working groups, and community engagement, towards becoming a sustainable service supporting global data infrastructure. We outline the design and delivery of the service, current activities, the benefits of certification to a range of communities, and future plans and challenges. As well as providing a historical narrative and current and future perspectives the CoreTrustSeal experience offers lessons for those developing standards and best practices, or seeking to develop cooperative and community-driven efforts which bridge data curation across academic disciplines and the governmental and private sectors.


"Evaluating Zotero, SHERPA/RoMEO, and Unpaywall in an Institutional Repository Workflow"

Ashley D. R. Sergiadis has self-archived "Evaluating Zotero, SHERPA/RoMEO, and Unpaywall in an Institutional Repository Workflow."

Here's an excerpt:

East Tennessee State University developed a workflow to add journal publications to their institutional repository and faculty profiles using three tools: Zotero for entering metadata, SHERPA/RoMEO for checking copyright permissions, and Unpaywall for locating full-text documents. This study evaluates availability and accuracy of the information and documents provided by Zotero, SHERPA/RoMEO, and Unpaywall for journal publications in four disciplines.


Good News from Flickr about 500 Million CC and Public Domain Images: "Update on Creative Commons Licenses and ‘In Memoriam’ Accounts"

Flickr has released "Update on Creative Commons Licenses and 'In Memoriam' Accounts."

Here's an excerpt:

When we recently announced updates to Flickr Free accounts, we stated that freely licensed public photos (Creative Commons, public domain, U.S. government works, etc.) as of November 1, 2018 in excess of the free account limit would not be deleted. . . .

In this spirit, today we're going further and now protecting all public, freely licensed images on Flickr, regardless of the date they were uploaded. . . .

In conjunction with this announcement, we've disabled bulk license change tools in the Settings, the Camera Roll, and the Organizr for Flickr Free accounts. . . . Any member (Free or Pro) can still change the license of any of their photos on the photo page.

In memoriam accounts will preserve all public content in a deceased member's account, even if their Pro subscription lapses.


"The Ecosystem of Repository Migration"

Juliet L. Hardesty and Nicholas Homenda have published "The Ecosystem of Repository Migration" in Publications.

Here's an excerpt:

Indiana University was an early adopter of the Fedora repository, developing it as a home for heterogeneous digital library content from a variety of collections with unique content models. After joining the Hydra Project, now known as Samvera, in 2012, development progressed on a variety of applications that formed the foundation for digital library services using the Fedora 4 repository. These experiences have shaped migration planning to move from Fedora 3 to Fedora 4 for this large and inclusive set of digital content. Moving to Fedora 4 is not just a repository change; it is an ecosystem shift. End user interfaces for access, management systems for collection managers, and data structures are all impacted. This article shares what Indiana University has learned about migrating to Fedora 4 to help others work through their own migration considerations. This article is also meant to inspire the Fedora repository development community to offer ways to further ease migration work, sustaining Fedora users moving forward, and inviting new Fedora users to try the software and become involved in the community.


"Bringing Citations and Usage Metrics Together to Make Data Count"

Helena Cousijn et al. have published "Bringing Citations and Usage Metrics Together to Make Data Count" in Data Science Journal.

Here's an excerpt:

Over the last years, many organizations have been working on infrastructure to facilitate sharing and reuse of research data. This means that researchers now have ways of making their data available, but not necessarily incentives to do so. Several Research Data Alliance (RDA) working groups have been working on ways to start measuring activities around research data to provide input for new Data Level Metrics (DLMs). These DLMs are a critical step towards providing researchers with credit for their work. In this paper, we describe the outcomes of the work of the Scholarly Link Exchange (Scholix) working group and the Data Usage Metrics working group. The Scholix working group developed a framework that allows organizations to expose and discover links between articles and datasets, thereby providing an indication of data citations. The Data Usage Metrics group works on a standard for the measurement and display of Data Usage Metrics. Here we explain how publishers and data repositories can contribute to and benefit from these initiatives. Together, these contributions feed into several hubs that enable data repositories to start displaying DLMs. Once these DLMs are available, researchers are in a better position to make their data count and be rewarded for their work.


"Quality Issues of CRIS [Current Research Information System] Data: An Exploratory Investigation with Universities from Twelve Countries"

Otmane Azeroual and Joachim Schöpfel have published "Quality Issues of CRIS Data: An Exploratory Investigation with Universities from Twelve Countries" in Publications.

Here's an excerpt:

Collecting, integrating, storing and analyzing data in a database system is nothing new in itself. To introduce a current research information system (CRIS) means that scientific institutions must provide the required information on their research activities and research results at a high quality. A one-time cleanup is not sufficient; data must be continuously curated and maintained. Some data errors (such as missing values, spelling errors, inaccurate data, incorrect formatting, inconsistencies, etc.) can be traced across different data sources and are difficult to find. Small mistakes can make data unusable, and corrupted data can have serious consequences. The sooner quality issues are identified and remedied, the better. For this reason, new techniques and methods of data cleansing and data monitoring are required to ensure data quality and its measurability in the long term. This paper examines data quality issues in current research information systems and introduces new techniques and methods of data cleansing and data monitoring with which organizations can guarantee the quality of their data.
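The error classes named in the excerpt (missing values, incorrect formatting, inconsistencies) can be illustrated with a small rule-based check. The following is only a minimal sketch and is not the method described in the paper: the record fields, the sample data, and the `find_issues` helper are all hypothetical, and the DOI pattern is a simplified approximation.

```python
import re
from collections import Counter

# Hypothetical CRIS-style publication records; field names are illustrative only.
records = [
    {"id": "p1", "title": "Data Quality in CRIS", "year": "2019", "doi": "10.1000/abc123"},
    {"id": "p2", "title": "", "year": "2019", "doi": "10.1000/def456"},
    {"id": "p3", "title": "Repository Workflows", "year": "19", "doi": "doi:10.1000/ghi789"},
    {"id": "p4", "title": "Data Quality in CRIS", "year": "2019", "doi": "10.1000/abc123"},
]

YEAR_RE = re.compile(r"^\d{4}$")            # four-digit year
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")   # simplified DOI shape (assumption)


def find_issues(recs):
    """Return (record id, problem) pairs for a few simple quality rules."""
    issues = []
    # Count each DOI so that repeated identifiers can be flagged.
    doi_counts = Counter(r["doi"] for r in recs if r.get("doi"))
    for r in recs:
        if not r.get("title", "").strip():
            issues.append((r["id"], "missing title"))
        if not YEAR_RE.match(r.get("year", "")):
            issues.append((r["id"], "badly formatted year"))
        doi = r.get("doi", "")
        if not DOI_RE.match(doi):
            issues.append((r["id"], "badly formatted DOI"))
        elif doi_counts[doi] > 1:
            issues.append((r["id"], "duplicate DOI"))
    return issues


issues = find_issues(records)
for rec_id, problem in issues:
    print(rec_id, problem)
```

Running such checks on a schedule, rather than once, matches the excerpt's point that a one-time cleanup is not sufficient.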


"When a Repository Is Not Enough: Redesigning a Digital Ecosystem to Serve Scholarly Communication"

Robin R. Sewell et al. have published "When a Repository Is Not Enough: Redesigning a Digital Ecosystem to Serve Scholarly Communication" in the Journal of Librarianship and Scholarly Communication.

Here's an excerpt:

INTRODUCTION Our library's digital asset management system (DAMS) was no longer meeting digital asset management requirements or expanding scholarly communication needs. We formed a multiunit task force (TF) to (1) survey and identify existing and emerging institutional needs; (2) research available DAMS (open source and proprietary) and assess their potential fit; and (3) deploy software locally for in-depth testing and evaluation.

DESCRIPTION OF PROGRAM We winnowed a field of 25 potential DAMS down to 5 for deployment and evaluation. The process included selection and identification of test collections and the creation of a multipart task based rubric based on library and campus needs assessments. Time constraints and DAMS deployment limitations prompted a move toward a new evaluation iteration: a shorter criteria-based rubric.

LESSONS LEARNED We discovered that no single DAMS was "just right," nor was any single DAMS a static product. Changing and expanding scholarly communication and digital needs could only be met by the more flexible approach offered by a multicomponent digital asset management ecosystem (DAME), described in this study. We encountered obstacles related to testing complex, rapidly evolving software available in a range of configurations and flavors (including tiers of vendor-hosted functionality) and time and capacity constraints curtailed in-depth testing. While we anticipate long-term benefits from "going further together" by including university-wide representation in the task force, there were trade-offs in distributing responsibilities and diffusing priorities.

NEXT STEPS Shifts in scholarly communication at multiple levels—institutional, regional, consortial, national, and international—have already necessitated continual review and adjustment of our digital systems.


"Improving the Discoverability and Web Impact of Open Repositories: Techniques and Evaluation"

George Macgregor has published "Improving the Discoverability and Web Impact of Open Repositories: Techniques and Evaluation" in Code4Lib Journal.

Here's an excerpt:

In this contribution we experiment with a suite of repository adjustments and improvements performed on Strathprints, the University of Strathclyde, Glasgow, institutional repository powered by EPrints 3.3.13. These adjustments were designed to support improved repository web visibility and user engagement, thereby improving usage. Although the experiments were performed on EPrints it is thought that most of the adopted improvements are equally applicable to any other repository platform. Following preliminary results reported elsewhere, and using Strathprints as a case study, this paper outlines the approaches implemented, reports on comparative search traffic data and usage metrics, and delivers conclusions on the efficacy of the techniques implemented. The evaluation provides persuasive evidence that specific enhancements to technical aspects of a repository can result in significant improvements to repository visibility, resulting in a greater web impact and consequent increases in content usage. COUNTER usage grew by 33% and traffic to Strathprints from Google and Google Scholar was found to increase by 63% and 99% respectively. Other insights from the evaluation are also explored. The results are likely to positively inform the work of repository practitioners and open scientists.


"Data Discovery Paradigms: User Requirements and Recommendations for Data Repositories"

Mingfang Wu et al. have published "Data Discovery Paradigms: User Requirements and Recommendations for Data Repositories" in Data Science Journal (CC BY 4.0).

Here's an excerpt:

As data repositories make more data openly available it becomes challenging for researchers to find what they need either from a repository or through web search engines. This study attempts to investigate data users’ requirements and the role that data repositories can play in supporting data discoverability by meeting those requirements. We collected 79 data discovery use cases (or data search scenarios), from which we derived nine functional requirements for data repositories through qualitative analysis. We then applied usability heuristic evaluation and expert review methods to identify best practices that data repositories can implement to meet each functional requirement. We propose the following ten recommendations for data repository operators to consider for improving data discoverability and user’s data search experience:

1. Provide a range of query interfaces to accommodate various data search behaviours.

2. Provide multiple access points to find data.

3. Make it easier for researchers to judge relevance, accessibility and reusability of a data collection from a search summary.

4. Make individual metadata records readable and analysable.

5. Enable sharing and downloading of bibliographic references.

6. Expose data usage statistics.

7. Strive for consistency with other repositories.

8. Identify and aggregate metadata records that describe the same data object.

9. Make metadata records easily indexed and searchable by major web search engines.

10. Follow API search standards and community adopted vocabularies for interoperability.
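Recommendation 9 can be made concrete with one common approach: embedding structured metadata in a dataset landing page so that web crawlers can read it. The sketch below assumes schema.org `Dataset` markup serialized as JSON-LD, which the excerpt itself does not name; the `dataset_jsonld` helper and all example values and URLs are hypothetical.

```python
import json


def dataset_jsonld(name, description, url, license_url):
    """Build a minimal schema.org Dataset description as a Python dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "license": license_url,
    }


# Placeholder values for illustration only.
record = dataset_jsonld(
    name="Example ocean temperature measurements",
    description="Hypothetical dataset used to illustrate landing-page markup.",
    url="https://repository.example.org/datasets/123",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

# Wrap the JSON-LD in a script element for inclusion in the page's HTML.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(record, indent=2)
    + "\n</script>"
)
print(snippet)
```

A repository would typically generate this block from its existing metadata records rather than maintain it by hand.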
