Current State and Future Directions for Open Repositories in Europe


In January 2023, OpenAIRE, LIBER, SPARC Europe, and COAR launched a joint strategy aimed at strengthening the European repository network. As a first step, a survey of the European repository landscape was undertaken in February-March 2023. The survey found that, collectively, European repositories acquire, preserve and provide open access to tens or possibly hundreds of millions of valuable research outputs and represent critical, not-for-profit infrastructure in the European open science landscape. They are used for sharing articles that may be pay-walled in published journals, but also for providing access to a large variety of other types of research outputs including research data, theses/dissertations, conference papers, preprints, code, and so on.

However, in order to ensure the European repository network is fit for purpose and able to support the evolving needs of the research community, the survey also identified three areas in particular that could be strengthened: maintaining up-to-date, highly functioning software platforms; applying consistent and comprehensive good practices in terms of metadata, preservation, and usage statistics; and gaining appropriate visibility in the scholarly ecosystem.

Despite the challenges, the current climate offers exciting opportunities for repositories. Many funders are actively promoting the repository route for articles because of their role in supporting equitable access to content (i.e. no fees to access or deposit). The value proposition for open science is growing and repositories are increasingly recognised as the main mechanism for collecting and providing access to a wide range of other research outputs. Add to this, the nascent, but growing, interest in the publish-review-curate model in which repositories have a central function, and it seems they are well placed to expand their current role in the ecosystem.

https://doi.org/10.5281/zenodo.10255559

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Jupyter Notebooks and Institutional Repositories: A Landscape Analysis of Realities, Opportunities and Paths Forward"


Jupyter Notebooks are important outputs of modern scholarship, though the longevity of these resources within the broader scholarly record is still unclear. Communities and their creators have yet to holistically understand creation, access, sharing and preservation of computational notebooks, and such notebooks have yet to be designated a proper place among institutional repositories or other preservation environments as first class scholarly digital assets. Before this can happen, repository managers and curators need to have the appropriate tools, schemas and best practices to maximize the benefit of notebooks within their repository landscape and environments.

This paper explores the landscape of Jupyter notebooks today, and focuses on the opportunities and challenges related to bringing Jupyter Notebooks into institutional repositories. We explore the extent to which Jupyter Notebooks are currently accessioned into institutional repositories, and how metadata schemas like CodeMeta might facilitate their adoption. We also discuss characteristics of Jupyter Notebooks created by researchers at the National Center for Atmospheric Research, to provide additional insight into how to assess and accession Jupyter Notebooks and related resources into an institutional repository.

https://journal.code4lib.org/articles/17751

"Islandora for Archival Access and Discovery"


This article is a case study describing the implementation of Islandora 2 to create a public online portal for the discovery, access, and use of archives and special collections materials at the University of Nevada, Las Vegas. The authors will explain how the goal of providing users with a unified point of access across diverse data (including finding aids, digital objects, and agents) led to the selection of Islandora 2 and they will discuss the benefits and challenges of using this open source software. They will describe the various steps of implementation, including custom development, migration from CONTENTdm, integration with ArchivesSpace, and developing new skills and workflows to use Islandora most effectively. As hindsight always provides additional perspective, the case study will also offer reflection on lessons learned since the launch, insights on open-source repository sustainability, and priorities for future development.

https://journal.code4lib.org/articles/17929

"Open Access Movement in the Scholarly World: Pathways for Libraries in Developing Countries"


Open access is a scholarly publishing model that has emerged as an alternative to traditional subscription-based journal publishing. This study explores the adoption of the open access movement worldwide and the role that libraries can play in addressing those factors which are slowing its progress within developing countries. The study has drawn upon both qualitative data from a focused literature review and quantitative data from major open access platforms. The results indicate that while the open access movement is steadily gaining acceptance worldwide, the progress in developing countries within geographical areas such as Africa, Asia and Oceania is quite a bit slower. Two significant factors are the cost of publishing fees and the lack of institutional open access mandates and policies to encourage uptake. The study provides suggested strategies for academic libraries to help overcome current challenges.

https://doi.org/10.1177/01655515231202758

"US Repository Network Launches Pilot to Enhance Discoverability of Open Access Content in Repositories"


In November, the US Repository Network (USRN) will launch a pilot project aimed at improving the discoverability of articles in repositories. This pilot project involves the use of services from CORE, a not-for-profit aggregator based at Open University in the UK, to evaluate and improve local repository practices. Additional technical support will be provided by Antleaf Ltd.

As part of the project, CORE will aggregate the metadata and full text of articles from a subset of US repositories, allowing them to be findable through a centralized discovery service with prominent links back to the original full text of the repository. At the same time, the project will assess current practices related to metadata quality, the tracking of Open Access deposits, the use of PIDs, technical support for OAI-PMH, and the adoption of more recent protocols, such as FAIR Signposting. At the level of the centralized aggregation, CORE will enrich the existing US metadata with information from its larger international aggregation. A Dashboard service for participating institutions will be provided, enabling them to assess, validate and monitor their practices.
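As a rough illustration of the OAI-PMH support mentioned above, the following sketch builds a `ListRecords` harvesting request. It is not taken from the USRN or CORE materials: the endpoint URL is hypothetical, and the helper name `list_records_url` is made up for this example. Only the protocol-level parameter names (`verb`, `metadataPrefix`, `set`, `from`) come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    # 'verb' and 'metadataPrefix' are required for ListRecords requests.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # 'set' and 'from' are optional selective-harvesting arguments.
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only for illustration.
print(list_records_url("https://repository.example.edu/oai/request", from_date="2023-01-01"))
```

A harvester would fetch this URL and follow any `resumptionToken` in the response to page through the remaining records; that step is omitted here.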

https://tinyurl.com/2utfpvj3

"FAIR EVA: Bringing Institutional Multidisciplinary Repositories into the FAIR Picture"


The FAIR Principles are a set of good practices to improve the reproducibility and quality of data in an Open Science context. Different sets of indicators have been proposed to evaluate the FAIRness of digital objects, including datasets that are usually stored in repositories or data portals. However, indicators like those proposed by the Research Data Alliance are provided from a high-level perspective that can be interpreted and they are not always realistic to particular environments like multidisciplinary repositories. This paper describes FAIR EVA, a new tool developed within the European Open Science Cloud context that is oriented to particular data management systems like open repositories, which can be customized to a specific case in a scalable and automatic environment. It aims to be adaptive enough to work for different environments, repository software and disciplines, taking into account the flexibility of the FAIR Principles. As an example, we present DIGITAL.CSIC repository as the first target of the tool, gathering the particular needs of a multidisciplinary institution as well as its institutional repository.

https://doi.org/10.1038/s41597-023-02652-8

"Where Is All the Research Software? An Analysis of Software in UK Academic Repositories"


This research examines the prevalence of research software as independent records of output within UK academic institutional repositories (IRs). There has been a steep decline in numbers of research software submissions to the UK’s Research Excellence Framework from 2008 to 2021, but there has been no investigation into whether and how the official academic IRs have affected the low return rates. In what we believe to be the first such census of its kind, we queried the 182 online repositories of 157 UK universities. Our findings show that the prevalence of software within UK Academic IRs is incredibly low. Fewer than 28% contain software as recognised academic output. Of greater concern, we found that over 63% of repositories do not currently record software as a type of research output and that several Universities appeared to have removed software as a defined type from default settings of their repository. We also explored potential correlations, such as being a member of the Russell group, but found no correlation between these metadata and prevalence of records of software. Finally, we discuss the implications of these findings with regards to the lack of recognition of software as a discrete research output in institutions, despite the opposite being mandated by funders, and we make recommendations for changes in policies and operating procedures.

https://doi.org/10.7717/peerj-cs.1546

"DSpace 7 Benefits: Is It Worth Upgrading?"


With the release of DSpace version 7, a natural question that arises is whether the new version offers enough new functionalities to motivate system administrators to upgrade. This paper briefly describes the most important changes, including new features and bug fixes, included in DSpace 7.4 and prior minor versions. The next parts of this paper explore our estimate that there are several thousand DSpace-based systems globally that will likely have to be upgraded in the near future. The main reason for this need is that older versions of DSpace (including 5.x) have reached the end of their developer support period or are reaching it in mid-2023. Based on our own upgrade experience, we propose suggestions and recommendations on migrating from the previous DSpace 6.3-based environment to the new one in a case study that concludes this article.

https://tinyurl.com/32t7ac9m

"The State of Scientific PDF Accessibility in Repositories: A Survey in Switzerland"


This survey analyzes the quality of the portable document format (PDF) documents in online repositories in Switzerland, examining their accessibility for people with visual impairments. Two minimal accessibility features were analysed: the PDFs had to have tags and a hierarchical heading structure. The survey also includes interviews with the managers or heads of multiple Swiss universities’ repositories . . . An analysis of interviewee responses indicates an overall lack of awareness of PDF accessibility, and shows that online repositories currently have no concrete plans to address the issue. This paper concludes by presenting a set of recommendations for online repositories to improve the accessibility of their PDF documents.

https://doi.org/10.1002/leap.1581

Paywall: "Proactive Institutional Repository Collection Development Techniques: Archiving Gold Open Access Articles and Metadata Retrieved with Web Scraping"


This article describes a method for copying open access articles and corresponding descriptive metadata from open repositories for archiving in an institutional repository using Beautiful Soup and Selenium as web scraping tools. This method quickly added hundreds of articles to an IR without relying on faculty participation or consulting publisher policies, increasing repository downloads and usage.
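The article's own code is not reproduced here. As a minimal sketch of the metadata-harvesting step, the example below pulls descriptive metadata out of an article landing page. It makes two assumptions: that the page exposes Highwire-style `citation_*` meta tags (a common but not universal convention), and that the standard library's `html.parser` is an acceptable stand-in for Beautiful Soup so the example has no dependencies. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Keep every value: tags such as citation_author can repeat.
        if name.startswith("citation_") and content:
            self.metadata.setdefault(name, []).append(content)


def extract_citation_metadata(html):
    """Return a dict mapping each citation_* tag name to its list of values."""
    parser = CitationMetaParser()
    parser.feed(html)
    return parser.metadata


# Made-up landing page fragment, for illustration only.
sample = """
<head>
  <meta name="citation_title" content="An Example Article">
  <meta name="citation_author" content="Doe, Jane">
  <meta name="citation_author" content="Roe, Richard">
  <meta name="description" content="Not a citation tag">
</head>
"""
print(extract_citation_metadata(sample))
```

Pages that render their metadata with JavaScript would need a browser-driving tool such as the Selenium setup the article mentions; this sketch only handles static HTML.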

https://doi.org/10.1080/01930826.2023.2240190

"The Medical Institutional Repositories in Libraries (MIRL) Symposium: A Blueprint Designed in Response to a Community of Practice Need"


Background: Health sciences libraries in medical schools, academic health centers, health care networks, and hospitals have established institutional repositories (IRs) to showcase their research achievements, increase visibility, expand the reach of institutional scholarship, and disseminate unique content. Newer roles for IRs include publishing open access journals, tracking researcher productivity, and serving as repositories for data sharing. Many repository managers oversee their IR with limited assistance from others at their institution. Therefore, IR practitioners find it valuable to network and learn from colleagues at other institutions.

Case Presentation: This case report describes the genesis and implementation of a new initiative specifically designed for a health sciences audience: the Medical Institutional Repositories in Libraries (MIRL) Symposium. Six medical librarians from hospitals and academic institutions in the U.S. organized the inaugural symposium held virtually in November 2021. The goal was to fill a perceived gap in conference programming for IR practitioners in health settings. Themes of the 2021 and subsequent 2022 symposium included IR management, increasing readership and engagement, and platform migration. Post-symposium surveys were completed by 73/238 attendees (31%) in 2021 and by 62/180 (34%) in 2022. Feedback was overwhelmingly positive.

Discussion: Participant responses in post-symposium surveys rated MIRL highly. The MIRL planning group intends to continue the symposium and hopes MIRL will steadily evolve, build community among IR practitioners in the health sciences, and expand the conversation around best practices for digital archiving of institutional content. The implementation design of MIRL serves as a blueprint for collaboratively bringing together a professional community of practice.

https://doi.org/10.5195/jmla.2023.1503

"Is There a Case for Accepting Machine Translated Scholarly Content in Repositories?"


Multilingualism is a critical characteristic of a healthy, inclusive, and diverse research communications landscape. However, multilingualism presents a particular challenge for the discovery of research outputs. Although researchers and other information seekers may only be able to read in one or two languages, they may want to know about all the relevant research in their area, regardless of the language in which it is published. Conversely, information seekers may want to discover research outputs in their own language(s) more easily. To facilitate this, COAR Task Force on Supporting Multilingualism and non-English Content in Repositories has been developing and promoting good practices for repositories in managing multilingual and non-English content. In the course of our work, the topic of machine translation (MT) has sparked a heated discussion within the Task Group and we would like to share with you the nature of this discussion.

https://bit.ly/42D1nbF

"What’s Missing? The Role of Community Colleges in Building a More Inclusive Institutional Repository Landscape"


The precise number of community college communities with access to an IR is unknown and certainly higher than ten, but uptake is low. As a result, the rich intellectual outputs generated at these institutions are not openly shared. Repositories provide community college communities with the ability to read content they would not otherwise have access to, but to fulfill the original purposes of open access to "share the learning of the rich with the poor and the poor with the rich," it’s imperative that the faculty and students at community colleges are recognized as contributors to the scholarly communications landscape and empowered to disseminate their works, via repositories, to the larger knowledge ecosystem.

https://doi.org/10.5860/crln.84.4.173

Study on the Readiness of Research Data and Literature Repositories to Facilitate Compliance With the Open Science Horizon Europe MGA Requirement

In this study we analysed 220 repositories and, via a structured methodology, we identified 165 trusted repositories and tested their readiness to facilitate the compliance with the HE MGA Open Science requirements.

We show that it is not straightforward to assess whether a given repository is suitable to facilitate compliance with the HE MGA requirements. This is mainly due to varying interpretations of definitions and requirements, whether information on repository specifications is publicly available, and the high level of technical expertise needed to assess all requirements.

We highlight that repository registries, such as FAIRsharing, re3data or the CoreTrustSeal (CTS) website, are not sufficient on their own to assess the readiness of repositories to facilitate compliance with the HE MGA requirements, as the definition of what constitutes a trusted repository is subtle and varied and needs to be carefully interpreted and applied to repositories. This is also the case for related concepts such as community endorsement or for policy requirements in terms of preservation, curation and security of the repository contents.

https://doi.org/10.5281/zenodo.7728016

Forthcoming: Discoverability in Digital Repositories: Systems, Perspectives, and User Studies


It examines discoverability in digital repositories from both user and system perspectives by exploring how users access content (including their search patterns and habits, need for digital content, effects of outreach, or integration with Wikipedia and other web-based tools) and how systems support or prevent discoverability through the structure or quality of metadata, system interfaces, exposure to search engines or lack thereof, and integration with library discovery tools.

https://bit.ly/3XbbRvT

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

"Ten Recommended Practices for Managing Preprints in Generalist and Institutional Repositories"


Currently, there are numerous gaps in geographic and domain coverage and some authors will choose to deposit their research outputs into another type of repository, such as an institutional or generalist repository. . . . To address these gaps, a COAR-ASAPbio Working Group on Preprints in Repositories identified ten recommended practices for managing preprints across three areas: linking, discovery, and editorial processes. While we acknowledge that many of these practices are not currently in use by institutional and generalist repositories, we hope that these recommendations will encourage repositories around the world that collect preprints to begin to apply them locally.

https://cutt.ly/R0gursT

"Free and Open-Source Automated Open Access Preprint Harvesting"


Universities are attempting to ensure that all of their research is publicly accessible because of funding mandates. Many universities have established campus open access (OA) repositories but are struggling with how to upload millions of manuscripts under numerous license agreements while also linking metadata to make them discoverable. To do this manually requires around 15 minutes per manuscript from an experienced librarian. The time and cost to do this campus-wide is prohibitive. To radically reduce the time and costs of this process and to harvest all past work, this article reports on the development and testing of a free and open source (FOSS) JavaScript-based application, aperta-accessum, which does the following: 1) harvests names and emails from a department’s faculty webpage; 2) identifies scholars’ Open Researcher and Contributor Identifiers (ORCID iDs); 3) obtains digital object identifiers (DOIs) of publications for each scholar; 4) checks for existing copies in an institution’s OA repository; 5) identifies the legal opportunities to provide OA versions of all of the articles not already in the OA repository; 6) sends authors emails requesting a simple upload of author manuscripts; and 7) adds link-harvested metadata from DOIs with uploaded preprints into a bepress repository; the code can be modified for additional repositories. The results of this study show that, in the administrative time needed to make a single document OA manually, aperta-accessum can process approximately five entire departments worth of peer-reviewed articles. Following best practices discussed, it is clear that this open-source OA harvester enables institutional library’s stewardship of OA knowledge on a mass scale for radically reduced costs.
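Steps 3 to 5 of the workflow above depend on comparing a scholar's DOIs with those already held in the repository. The sketch below shows one way that comparison could be done. It is written in Python for illustration, whereas aperta-accessum itself is described as JavaScript-based, and it does not reproduce that tool's code. The function names and the `10.1000/...` DOIs are hypothetical.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi):
    """Reduce a DOI to a bare lowercase form so different spellings compare equal."""
    doi = doi.strip().lower()  # DOIs are case-insensitive
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def missing_from_repository(scholar_dois, repository_dois):
    """Return the scholar's DOIs (normalized, de-duplicated, in order) not yet in the repository."""
    held = {normalize_doi(d) for d in repository_dois}
    seen = set()
    missing = []
    for doi in scholar_dois:
        key = normalize_doi(doi)
        if key not in held and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Hypothetical DOIs, for illustration only.
scholar = ["10.1000/A", "doi:10.1000/b", "https://doi.org/10.1000/a"]
repository = ["10.1000/B"]
print(missing_from_repository(scholar, repository))
```

The resulting list would feed the later steps: checking which versions may legally be shared and emailing authors to request manuscripts.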

https://doi.org/10.31274/jlsc.14421

Paywall: "Expanding Your Institutional Repository: Librarians Working with Faculty"


Since a successful institutional repository will contain a higher percentage of the contributors’ materials, we implemented a system to upload faculty publications more effectively to our academic library’s institutional repository. . . . The success of this method is indicated by the increase in articles that have been uploaded to our institutional repository; as a result of the implementation of this program, the number of publications in our university’s institutional repository by these authors has increased 174%.

https://doi.org/10.1016/j.acalib.2022.102628

Investments in Open: Association of Research Libraries US University Member Expenditures on Services, Collections, Staff, and Infrastructure in Support of Open Scholarship


In total, 46 of the 102 institutions provided full or partial results. Summary results are divided into the following categories: read-and-publish or transitional agreements, article processing charges (APC) or OA funds, non-APC-based OA publishing models, institutional repository services, OA journal hosting and publishing services, and open monographs.

The survey found that the total aggregate spending on open access for all 46 responding libraries was $32 million USD, with an average expenditure per institution of $785,940. This represents an average of 2.26% of the total library budget spent on open, ranging from 0.19% to 11.02% across respondent libraries. As a portion of the total amount of expenses spent on OA infrastructure, the majority of funds are invested in read-and-publish agreements (~$20 million) followed by institutional repository infrastructure with investments of 17% of total OA expenses (~$5 million) across the 46 institutions.

https://cutt.ly/nMuAMbT

"De Gruyter and Ubiquity Join Forces"


Ubiquity was founded by researchers in order to accelerate change towards open access and open science in 2012. Ubiquity publishes gold and diamond open access journals and books through its imprint Ubiquity Press, and supports 33 independent university presses with publishing services. Along with these partners, Ubiquity currently provides over 800 open access journals and more than 2,800 open access books. Ubiquity extended its services in 2021 with the launch of its institutional repositories platform, adding capacity to drive green open access and the dissemination of all research outputs, such as preprints and data. . . .

By acquiring and investing in Ubiquity, De Gruyter will grow its existing open access and service business further and help the Ubiquity team reach their goals as an open research publisher and provider of open publishing services. As part of De Gruyter, Ubiquity will continue pursuing its mission to make quality open access publishing affordable and retain a high degree of independence to do so. The Ubiquity team and CEO and founder Brian Hole will keep working from their London office and remotely to continue their successful journey of researcher-led publishing.

https://cutt.ly/yNyI6sK

"Increasing the Reuse of Data through FAIR-enabling the Certification of Trustworthy Digital Repositories"


To address this gap the FAIRsFAIR project developed a number of tools and resources that facilitate the assessment of FAIR-enabling practices at the repository level as well as the FAIRness of datasets within them. These include the CoreTrustSeal+FAIRenabling Capability Maturity model (CTS+FAIR CapMat), a FAIR-Enabling Trustworthy Digital Repositories-Capability Maturity Self-Assessment template, and F-UJI, a web-based tool designed to assess the FAIRness of research data objects.

https://doi.org/10.2218/ijdc.v17i1.852

"FAIREST: A Framework for Assessing Research Repositories"

"In this article, we introduce the FAIREST principles, a framework inspired by the well-known FAIR principles, but designed to provide a set of metrics for assessing and selecting solutions for creating digital repositories for research artefacts. The goal is to support decision makers in choosing such a solution when planning for a repository, especially at an institutional level. . . . We further describe an assessment of 11 widespread solutions, with the goal to provide an overview of the current landscape of research data repository solutions, identifying gaps and research challenges to be addressed."

https://doi.org/10.1162/dint_a_00159