In January 2023, OpenAIRE, LIBER, SPARC Europe, and COAR launched a joint strategy aimed at strengthening the European repository network. As a first step, a survey of the European repository landscape was undertaken in February-March 2023. The survey found that, collectively, European repositories acquire, preserve and provide open access to tens or possibly hundreds of millions of valuable research outputs and represent critical, not-for-profit infrastructure in the European open science landscape. They are used for sharing articles that may be pay-walled in published journals, but also for providing access to a large variety of other types of research outputs including research data, theses/dissertations, conference papers, preprints, code, and so on.
However, in order to ensure the European repository network is fit for purpose and able to support the evolving needs of the research community, the survey also identified three areas in particular that could be strengthened: maintaining up-to-date, highly functioning software platforms; applying consistent and comprehensive good practices in terms of metadata, preservation, and usage statistics; and gaining appropriate visibility in the scholarly ecosystem.
Despite the challenges, the current climate offers exciting opportunities for repositories. Many funders are actively promoting the repository route for articles because of their role in supporting equitable access to content (i.e. no fees to access or deposit). The value proposition for open science is growing and repositories are increasingly recognised as the main mechanism for collecting and providing access to a wide range of other research outputs. Add to this, the nascent, but growing, interest in the publish-review-curate model in which repositories have a central function, and it seems they are well placed to expand their current role in the ecosystem.
Jupyter Notebooks are important outputs of modern scholarship, though the longevity of these resources within the broader scholarly record is still unclear. Communities and their creators have yet to holistically understand creation, access, sharing and preservation of computational notebooks, and such notebooks have yet to be designated a proper place among institutional repositories or other preservation environments as first class scholarly digital assets. Before this can happen, repository managers and curators need to have the appropriate tools, schemas and best practices to maximize the benefit of notebooks within their repository landscape and environments.
This paper explores the landscape of Jupyter notebooks today, and focuses on the opportunities and challenges related to bringing Jupyter Notebooks into institutional repositories. We explore the extent to which Jupyter Notebooks are currently accessioned into institutional repositories, and how metadata schemas like CodeMeta might facilitate their adoption. We also discuss characteristics of Jupyter Notebooks created by researchers at the National Center for Atmospheric Research, to provide additional insight into how to assess and accession Jupyter Notebooks and related resources into an institutional repository.
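As a rough illustration of how a CodeMeta record might be derived from a notebook, the sketch below reads the language recorded in a notebook's own metadata and emits a minimal CodeMeta-style description. It is not the paper's method: the function name `notebook_to_codemeta`, the choice of fields, and the in-memory example notebook are assumptions made for illustration only.

```python
import json


def notebook_to_codemeta(notebook: dict, name: str) -> dict:
    """Build a minimal CodeMeta-style record from a parsed .ipynb document.

    Hypothetical helper: the field selection is an assumption, not a
    repository requirement.
    """
    meta = notebook.get("metadata", {})
    # Notebooks usually record the language under language_info.name,
    # with kernelspec.language as a common fallback.
    language = (
        meta.get("language_info", {}).get("name")
        or meta.get("kernelspec", {}).get("language")
        or "unknown"
    )
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "programmingLanguage": language,
        "fileFormat": "application/x-ipynb+json",
    }


# A tiny in-memory notebook stands in for a real .ipynb file.
example_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"language_info": {"name": "python"}},
    "cells": [],
}
record = notebook_to_codemeta(example_notebook, "example-analysis")
print(json.dumps(record, indent=2))
```

A real accession workflow would likely add authors, licence, and dependency information, which notebooks do not record consistently.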
This article is a case study describing the implementation of Islandora 2 to create a public online portal for the discovery, access, and use of archives and special collections materials at the University of Nevada, Las Vegas. The authors will explain how the goal of providing users with a unified point of access across diverse data (including finding aids, digital objects, and agents) led to the selection of Islandora 2 and they will discuss the benefits and challenges of using this open source software. They will describe the various steps of implementation, including custom development, migration from CONTENTdm, integration with ArchivesSpace, and developing new skills and workflows to use Islandora most effectively. As hindsight always provides additional perspective, the case study will also offer reflection on lessons learned since the launch, insights on open-source repository sustainability, and priorities for future development.
Open access is a scholarly publishing model that has emerged as an alternative to traditional subscription-based journal publishing. This study explores the adoption of the open access movement worldwide and the role that libraries can play in addressing those factors which are slowing its progress within developing countries. The study has drawn upon both qualitative data from a focused literature review and quantitative data from major open access platforms. The results indicate that while the open access movement is steadily gaining acceptance worldwide, the progress in developing countries within geographical areas such as Africa, Asia and Oceania is quite a bit slower. Two significant factors are the cost of publishing fees and the lack of institutional open access mandates and policies to encourage uptake. The study provides suggested strategies for academic libraries to help overcome current challenges.
In November, the US Repository Network (USRN) will launch a pilot project aimed at improving the discoverability of articles in repositories. This pilot project involves the use of services from CORE, a not-for-profit aggregator based at Open University in the UK, to evaluate and improve local repository practices. Additional technical support will be provided by Antleaf Ltd.
As part of the project, CORE will aggregate the metadata and full text of articles from a subset of US repositories, allowing them to be findable through a centralized discovery service with prominent links back to the original full text of the repository. At the same time, the project will assess current practices related to metadata quality, the tracking of Open Access deposits, the use of PIDs, technical support for OAI-PMH, and the adoption of more recent protocols, such as FAIR Signposting. At the level of the centralized aggregation, CORE will enrich the existing US metadata with information from its larger international aggregation. A Dashboard service for participating institutions will be provided, enabling them to assess, validate and monitor their practices.
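FAIR Signposting exposes typed links (for example `cite-as`, `item`, and `describedby`) in HTTP `Link` headers, so one way to check whether a repository landing page supports it is to parse that header. The following is a minimal sketch under stated assumptions: the header value, URLs, and DOI are made-up placeholders, the parser is a simplified regular-expression approach rather than a full implementation of the Link header grammar, and it is not a description of how CORE performs its assessment.

```python
import re

# One link: <target> followed by zero or more ;key=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value: str) -> dict[str, list[str]]:
    """Group link targets by relation type (simplified parser)."""
    links: dict[str, list[str]] = {}
    for target, params in LINK_RE.findall(value):
        for key, quoted, bare in PARAM_RE.findall(params):
            if key.lower() == "rel":
                # A rel parameter may list several space-separated relations.
                for rel in (quoted or bare).split():
                    links.setdefault(rel.lower(), []).append(target)
    return links


# Placeholder header, as a landing page might return it.
header = (
    '<https://doi.org/10.0000/example>; rel="cite-as", '
    '<https://repo.example.org/bitstream/1/article.pdf>; rel="item"; type="application/pdf", '
    '<https://repo.example.org/oai/record/1>; rel="describedby"; type="application/xml"'
)
signposts = parse_link_header(header)
for relation, targets in signposts.items():
    print(relation, targets)
```

In practice the header would come from an HTTP HEAD request against the landing page; the string is inlined here so the sketch runs without network access.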
The FAIR Principles are a set of good practices to improve the reproducibility and quality of data in an Open Science context. Different sets of indicators have been proposed to evaluate the FAIRness of digital objects, including datasets that are usually stored in repositories or data portals. However, indicators such as those proposed by the Research Data Alliance are defined at a high level, leave room for interpretation, and are not always realistic for particular environments such as multidisciplinary repositories. This paper describes FAIR EVA, a new tool developed within the European Open Science Cloud context that is oriented to particular data management systems like open repositories, which can be customized to a specific case in a scalable and automatic environment. It aims to be adaptive enough to work for different environments, repository software and disciplines, taking into account the flexibility of the FAIR Principles. As an example, we present the DIGITAL.CSIC repository as the first target of the tool, gathering the particular needs of a multidisciplinary institution as well as its institutional repository.
This research examines the prevalence of research software as independent records of output within UK academic institutional repositories (IRs). There has been a steep decline in numbers of research software submissions to the UK’s Research Excellence Framework from 2008 to 2021, but there has been no investigation into whether and how the official academic IRs have affected the low return rates. In what we believe to be the first such census of its kind, we queried the 182 online repositories of 157 UK universities. Our findings show that the prevalence of software within UK academic IRs is incredibly low. Fewer than 28% contain software as recognised academic output. Of greater concern, we found that over 63% of repositories do not currently record software as a type of research output and that several universities appeared to have removed software as a defined type from default settings of their repository. We also explored potential correlations, such as being a member of the Russell Group, but found no correlation between these metadata and prevalence of records of software. Finally, we discuss the implications of these findings with regard to the lack of recognition of software as a discrete research output in institutions, despite the opposite being mandated by funders, and we make recommendations for changes in policies and operating procedures.
Since 2016, the [MSUL] digital repository has been using Faceted Application of Subject Terminology (FAST) subject headings as its primary subject vocabulary. . . The MSUL FAST use case presents some challenges that are not addressed by existing MARC-focused FAST tools. This paper will outline the MSUL digital repository team’s justification for including FAST headings in the digital repository as well as workflows for adding FAST headings to Metadata Object Description Schema (MODS) metadata, their maintenance, and utilization for discovery.
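To make the idea of adding a FAST heading to MODS concrete, here is a minimal sketch that appends a `subject` element with `authority="fast"` to a MODS record. It is not the MSUL workflow: the helper name `add_fast_subject`, the label, and the identifier `0000000` are hypothetical placeholders, and the use of a `topic` child assumes a topical heading (other FAST facets would map to other MODS subelements).

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def add_fast_subject(mods_root: ET.Element, label: str, fast_id: str) -> ET.Element:
    """Append a topical FAST subject to a MODS record (hypothetical helper)."""
    subject = ET.SubElement(
        mods_root,
        f"{{{MODS_NS}}}subject",
        {
            "authority": "fast",
            # Placeholder identifier; a real record would carry the FAST ID.
            "valueURI": f"http://id.worldcat.org/fast/{fast_id}",
        },
    )
    topic = ET.SubElement(subject, f"{{{MODS_NS}}}topic")
    topic.text = label
    return subject


mods = ET.Element(f"{{{MODS_NS}}}mods")
fast_subject = add_fast_subject(mods, "Example topic", "0000000")
mods_xml = ET.tostring(mods, encoding="unicode")
print(mods_xml)
```

Storing the URI alongside the label is one way to support later maintenance, since a changed heading can be matched by identifier rather than by string.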
This paper discusses the challenges of ensuring discoverability of Open Educational Resources (OER) in the absence of clear standards for sharing them. Despite the efforts of librarians and instructors to create a wealth of OER, discoverability remains limited and often relegated to a list of links on a LibGuide. The authors address this challenge by highlighting technical and descriptive barriers to OER discoverability. The authors then describe the development of a hybrid metadata standard for OER and its deployment through the institutional repository. Although provisional, this approach ensures that OER records can be adapted to future metadata standards and exported to third-party repositories. This paper underscores the importance of developing an effective metadata standard for OER to ensure their discoverability for learners and educators.
With the release of DSpace version 7, a natural question that arises is whether the new version offers enough new functionalities to motivate system administrators to upgrade. This paper briefly describes the most important changes, including new features and bug fixes, included in DSpace 7.4 and prior minor versions. The next parts of this paper explore our estimate that there are several thousand DSpace-based systems globally that will likely have to be upgraded in the near future. The main reason for this need is that older versions of DSpace (including 5.x) have reached the end of their developer support period or are reaching it in mid-2023. Based on our own upgrade experience, we propose suggestions and recommendations on migrating from the previous DSpace 6.3-based environment to the new one in a case study that concludes this article.
This survey analyzes the quality of the portable document format (PDF) documents in online repositories in Switzerland, examining their accessibility for people with visual impairments. Two minimal accessibility features were analysed: the PDFs had to have tags and a hierarchical heading structure. The survey also includes interviews with the managers or heads of multiple Swiss universities’ repositories . . . An analysis of interviewee responses indicates an overall lack of awareness of PDF accessibility, and shows that online repositories currently have no concrete plans to address the issue. This paper concludes by presenting a set of recommendations for online repositories to improve the accessibility of their PDF documents.
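The two minimal features named above can be sketched as a simple check. The code below assumes that the tagged/untagged flag and the sequence of heading levels have already been extracted from a PDF by some other tool, and it interprets "hierarchical heading structure" as "starts at H1 and never skips a level", which is an assumption rather than the survey's stated rule. The function names are hypothetical.

```python
def heading_hierarchy_issues(levels: list[int]) -> list[str]:
    """Report headings that break a strict H1 > H2 > H3 nesting."""
    issues = []
    previous = 0  # 0 means "no heading seen yet"
    for position, level in enumerate(levels, start=1):
        if previous == 0 and level != 1:
            issues.append(f"heading {position}: first heading is H{level}, not H1")
        elif previous and level > previous + 1:
            issues.append(f"heading {position}: H{level} skips a level after H{previous}")
        previous = level
    return issues


def passes_minimal_check(is_tagged: bool, levels: list[int]) -> bool:
    """Assumed pass rule: tagged, at least one heading, no hierarchy issues."""
    return is_tagged and bool(levels) and not heading_hierarchy_issues(levels)


# Heading levels in document order, e.g. H1, H2, H4, H2.
sample_levels = [1, 2, 4, 2]
sample_issues = heading_hierarchy_issues(sample_levels)
print(sample_issues)
print(passes_minimal_check(True, [1, 2, 3, 2, 3]))
```

Moving back up the hierarchy (for example from H3 to H2) is treated as valid; only downward jumps of more than one level are flagged.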
This article describes a method for copying open access articles and corresponding descriptive metadata from open repositories for archiving in an institutional repository using Beautiful Soup and Selenium as web scraping tools. This method quickly added hundreds of articles to an IR without relying on faculty participation or consulting publisher policies, increasing repository downloads and usage.
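The metadata-harvesting half of such a method can be sketched as follows. The article uses Beautiful Soup and Selenium; this sketch deliberately swaps in Python's standard-library `html.parser` so that it runs without third-party packages, and it parses an inline HTML string instead of fetching a live page. The assumption that the source repository exposes `citation_*` meta tags, the class name, and the sample page are all illustrative, not taken from the article.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> values from a page."""

    def __init__(self):
        super().__init__()
        self.metadata: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.startswith("citation_") and content:
            # Repeated tags (e.g. several authors) are kept in page order.
            self.metadata.setdefault(name, []).append(content)


# Placeholder landing page standing in for a downloaded repository record.
sample_page = """
<html><head>
<meta name="citation_title" content="An Example Open Access Article">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_author" content="Roe, Richard">
<meta name="citation_pdf_url" content="https://repo.example.org/article.pdf">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

parser = CitationMetaParser()
parser.feed(sample_page)
print(parser.metadata)
```

The collected dictionary could then be mapped onto the institutional repository's own deposit fields, with the `citation_pdf_url` value pointing to the file to copy.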
Background: Health sciences libraries in medical schools, academic health centers, health care networks, and hospitals have established institutional repositories (IRs) to showcase their research achievements, increase visibility, expand the reach of institutional scholarship, and disseminate unique content. Newer roles for IRs include publishing open access journals, tracking researcher productivity, and serving as repositories for data sharing. Many repository managers oversee their IR with limited assistance from others at their institution. Therefore, IR practitioners find it valuable to network and learn from colleagues at other institutions.
Case Presentation: This case report describes the genesis and implementation of a new initiative specifically designed for a health sciences audience: the Medical Institutional Repositories in Libraries (MIRL) Symposium. Six medical librarians from hospitals and academic institutions in the U.S. organized the inaugural symposium held virtually in November 2021. The goal was to fill a perceived gap in conference programming for IR practitioners in health settings. Themes of the 2021 and subsequent 2022 symposium included IR management, increasing readership and engagement, and platform migration. Post-symposium surveys were completed by 73/238 attendees (31%) in 2021 and by 62/180 (34%) in 2022. Feedback was overwhelmingly positive.
Discussion: Participant responses in post-symposium surveys rated MIRL highly. The MIRL planning group intends to continue the symposium and hopes MIRL will steadily evolve, build community among IR practitioners in the health sciences, and expand the conversation around best practices for digital archiving of institutional content. The implementation design of MIRL serves as a blueprint for collaboratively bringing together a professional community of practice.
Multilingualism is a critical characteristic of a healthy, inclusive, and diverse research communications landscape. However, multilingualism presents a particular challenge for the discovery of research outputs. Although researchers and other information seekers may only be able to read in one or two languages, they may want to know about all the relevant research in their area, regardless of the language in which it is published. Conversely, information seekers may want to discover research outputs in their own language(s) more easily. To facilitate this, the COAR Task Force on Supporting Multilingualism and non-English Content in Repositories has been developing and promoting good practices for repositories in managing multilingual and non-English content. In the course of our work, the topic of machine translation (MT) has sparked a heated discussion within the Task Force and we would like to share with you the nature of this discussion.
The precise number of community college communities with access to an IR is unknown and certainly higher than ten, but uptake is low. As a result, the rich intellectual outputs generated at these institutions are not openly shared. Repositories provide community college communities with the ability to read content they would not otherwise have access to, but to fulfill the original purposes of open access to "share the learning of the rich with the poor and the poor with the rich," it’s imperative that the faculty and students at community colleges are recognized as contributors to the scholarly communications landscape and empowered to disseminate their works, via repositories, to the larger knowledge ecosystem.
This spring, Digital Scholarship’s bibliographies in the HTML format were reformatted as single page files with internal navigation. This included all bibliographies that were in HTML format only as well as the HTML versions of paperback books. These new PDFs are in a 12 point font and are designed for printing; however, they also have live links for immediate access. There were no content changes. For a list of all Digital Scholarship publications, see the site map.
In this study we analysed 220 repositories and, via a structured methodology, we identified 165 trusted repositories and tested their readiness to facilitate the compliance with the HE MGA Open Science requirements.
We show that it is not straightforward to assess whether a given repository is suitable to facilitate compliance with the HE MGA requirements. This is mainly due to varying interpretations of definitions and requirements, whether information on repository specifications is publicly available, and the high level of technical expertise needed to assess all requirements.
We highlight that repository registries, such as FAIRsharing, re3data or the CoreTrustSeal (CTS) website, are not sufficient on their own to assess the readiness of repositories to facilitate compliance with the HE MGA requirements, as the definition of what constitutes a trusted repository is subtle and varied and needs to be carefully interpreted and applied to repositories. This is also the case for related concepts such as community endorsement or for policy requirements in terms of preservation, curation and security of the repository contents.
It examines discoverability in digital repositories from both user and system perspectives by exploring how users access content (including their search patterns and habits, need for digital content, effects of outreach, or integration with Wikipedia and other web-based tools) and how systems support or prevent discoverability through the structure or quality of metadata, system interfaces, exposure to search engines or lack thereof, and integration with library discovery tools.
Currently, there are numerous gaps in geographic and domain coverage and some authors will choose to deposit their research outputs into another type of repository, such as an institutional or generalist repository. . . . To address these gaps, a COAR-ASAPbio Working Group on Preprints in Repositories identified ten recommended practices for managing preprints across three areas: linking, discovery, and editorial processes. While we acknowledge that many of these practices are not currently in use by institutional and generalist repositories, we hope that these recommendations will encourage repositories around the world that collect preprints to begin to apply them locally.
Since a successful institutional repository will contain a higher percentage of the contributors’ materials, we implemented a system to upload faculty publications more effectively to our academic library’s institutional repository. . . . The success of this method is indicated by the increase in articles that have been uploaded to our institutional repository; as a result of the implementation of this program, the number of publications in our university’s institutional repository by these authors has increased 174%.
In total, 46 of the 102 institutions provided full or partial results. Summary results are divided into the following categories: read-and-publish or transitional agreements, article processing charges (APC) or OA funds, non-APC-based OA publishing models, institutional repository services, OA journal hosting and publishing services, and open monographs.
The survey found that the total aggregate spending on open access for all 46 responding libraries was $32 million USD, with an average expenditure per institution of $785,940. This represents an average of 2.26% of the total library budget spent on open, ranging from 0.19% to 11.02% across respondent libraries. As a portion of the total amount of expenses spent on OA infrastructure, the majority of funds are invested in read-and-publish agreements (~$20 million) followed by institutional repository infrastructure with investments of 17% of total OA expenses (~$5 million) across the 46 institutions.
Ubiquity was founded by researchers in order to accelerate change towards open access and open science in 2012. Ubiquity publishes gold and diamond open access journals and books through its imprint Ubiquity Press, and supports 33 independent university presses with publishing services. Along with these partners, Ubiquity currently provides over 800 open access journals and more than 2,800 open access books. Ubiquity extended its services in 2021 with the launch of its institutional repositories platform, adding capacity to drive green open access and the dissemination of all research outputs, such as preprints and data. . . .
By acquiring and investing in Ubiquity, De Gruyter will grow its existing open access and service business further and help the Ubiquity team reach their goals as an open research publisher and provider of open publishing services. As part of De Gruyter, Ubiquity will continue pursuing its mission to make quality open access publishing affordable and retain a high degree of independence to do so. The Ubiquity team and CEO and founder Brian Hole will keep working from their London office and remotely to continue their successful journey of researcher-led publishing.
To address this gap the FAIRsFAIR project developed a number of tools and resources that facilitate the assessment of FAIR-enabling practices at the repository level as well as the FAIRness of datasets within them. These include the CoreTrustSeal+FAIRenabling Capability Maturity model (CTS+FAIR CapMat), a FAIR-Enabling Trustworthy Digital Repositories-Capability Maturity Self-Assessment template, and F-UJI, a web-based tool designed to assess the FAIRness of research data objects.
"In this article, we introduce the FAIREST principles, a framework inspired by the well-known FAIR principles, but designed to provide a set of metrics for assessing and selecting solutions for creating digital repositories for research artefacts. The goal is to support decision makers in choosing such a solution when planning for a repository, especially at an institutional level.. . . We further describe an assessment of 11 widespread solutions, with the goal to provide an overview of the current landscape of research data repository solutions, identifying gaps and research challenges to be addressed."