"A Scoping Review on the Use and Acceptability of Preprints"


Preprints are open and accessible scientific manuscripts or reports that have not been submitted to a peer-reviewed journal. The value and importance of preprints have grown since their contribution during the public health emergency of the COVID-19 pandemic. Funders and publishers are establishing their positions on the use of preprints in grant applications and publishing models. However, the evidence supporting the use and acceptability of preprints varies across funders, publishers, and researchers. The purpose of this scoping review was to explore the current evidence on the use and acceptability of preprints by publishers, funders, and the research community throughout the research lifecycle.

https://doi.org/10.31235/osf.io/nug4p

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Open Access at a Crossroads: Library Publishing and Bibliodiversity"


The open access movement has gained momentum since the Budapest Open Access Initiative (BOAI) first launched twenty years ago. Notably, there has been a drastic increase in the number of open access articles. Concerns have been raised about equality and diversity issues, however, for researchers without an affiliation (e.g. independent, unemployed and retired researchers) and researchers on the "scientific periphery" who are excluded from the gold open access model. This article argues that the gold open access model is destructive to the knowledge production ecosystem by addressing the importance of bibliodiversity and the ways in which library publishing can contribute to sustainable and equitable knowledge production.

https://doi.org/10.1629/uksg.613


Academic Publishing and Open Access. What Does Economics Teach Us?


While the gold regime seems the most natural way to achieve open access, a generalized switch to open access may also have undesired consequences: projections indeed suggest that a massive move towards the gold regime would generate an explosion in the amount of APC unless there are controls to limit market power. Beside the sharp increase in APC, the shift to gold open access may create conflicts of interest for publishers given that their income comes from authors and may alter the quality of publications. The green regime, by introducing competition between the journal’s version of an article and a free public version, seems an efficient way to reduce market power while expanding access.

https://shs.hal.science/halshs-04080573


"To Preprint or Not to Preprint: Experience and Attitudes of Researchers Worldwide"


The pandemic has underlined the significance of open science and spurred further growth of preprinting. Nevertheless, preprinting has been adopted at varying rates across different countries/regions. To investigate researchers’ experience with and attitudes toward preprinting, we conducted a survey of authors of research papers published in 2021 or 2022. We find that respondents in the US and Europe had a higher level of familiarity with and adoption of preprinting than those in China and the rest of the world. Respondents in China were most worried about the lack of recognition for preprinting and the risk of getting scooped. US respondents were very concerned about premature media coverage of preprints, the reliability and credibility of preprints, and public sharing of information before peer review. Respondents identified integration of preprinting in journal submission processes as the most important way to promote preprinting.

https://doi.org/10.55835/6442f782b2b5580ba561406b


"Do Open Access Mandates Work? A Systematized Review of the Literature on Open Access Publishing Rates"


To encourage the sharing of research, various entities—including public and private funders, universities, and academic journals—have enacted open access (OA) mandates or data sharing policies. It is unclear, however, whether these OA mandates and policies increase the rate of OA publishing and data sharing within the research communities impacted by them. A team of librarians conducted a systematized review of the literature to answer this question. A comprehensive search of several scholarly databases and grey literature sources resulted in 4,689 unique citations. However, only five articles met the inclusion criteria and were deemed as having an acceptable risk of bias. This sample showed that although the majority of the mandates described in the literature were correlated with a subsequent increase in OA publishing or data sharing, the presence of various confounders and the differing methods of collecting and analyzing the data used by the studies’ authors made it impossible to establish a causative relationship.

https://doi.org/10.31274/jlsc.15444


"What’s Missing? The Role of Community Colleges in Building a More Inclusive Institutional Repository Landscape"


The precise number of community college communities with access to an IR is unknown and certainly higher than ten, but uptake is low. As a result, the rich intellectual outputs generated at these institutions are not openly shared. Repositories provide community college communities with the ability to read content they would not otherwise have access to, but to fulfill the original purposes of open access to "share the learning of the rich with the poor and the poor with the rich," it’s imperative that the faculty and students at community colleges are recognized as contributors to the scholarly communications landscape and empowered to disseminate their works, via repositories, to the larger knowledge ecosystem.

https://doi.org/10.5860/crln.84.4.173


"The Transformation of the Green Road to Open Access"


(1) Background: The 2002 Budapest Open Access Initiative recommended self-archiving of scientific articles in open repositories as the "green road" to open access. Twenty years later, only some researchers deposit their publications in open repositories; moreover, part of the repositories’ content is not based on self-archived deposits but on mediated nonfaculty contributions. The purpose of the paper is to provide more empirical evidence on this situation and to assess the impact on the future of the green road. (2) Methods: We analyzed the contributions on the French national HAL repository from more than 1,000 laboratories affiliated with the ten most important French research universities, with a focus on 2020, representing 14,023 contributor accounts and 166,939 deposits. (3) Results: We identified seven different types of contributor accounts, including deposits from nonfaculty staff and import flows from other platforms. Mediated nonfaculty contribution accounts for at least 48% of the deposits. We also identified differences between institutions and disciplines. (4) Conclusions: Our empirical results reveal a transformation of open repositories from self-archiving and direct scientific communication towards research information management. Repositories like HAL are somewhere in the middle of the process. The paper describes data quality as the main issue and major challenge of this transformation.

https://doi.org/10.20944/preprints202302.0268.v1


Only 10% Fully Understand "Preprint": "Framing COVID-19 Preprint Research as Uncertain: A Mixed-Method Study of Public Reactions"


Unlike hedging, preprint disclosure had no impact on audience message evaluations, nor vaccine attitudes and intentions. In one sense, this is a positive finding in that transparency about preprint status is unlikely to produce negative public reactions. Yet a likely explanation for the null effects is that most participants lacked the knowledge to differentiate between preprints and peer-reviewed research and did not understand this disclosure as an indicator of preliminary science. The qualitative data supported this explanation. When asked how they interpret the term "preprint" when they see it in a scientific news article, participants’ responses indicated that most had a limited understanding of the concept, even among those who received the preprint disclosure message with a brief explanation of the term. In total, only 10% of participants provided definitions of preprint that aligned with those accepted by the scholarly community. Only 15% described the term as an indicator of uncertain or preliminary evidence.

https://doi.org/10.1080/10410236.2023.2164954


"PreprintMatch: A Tool for Preprint to Publication Detection Shows Global Inequities in Scientific Publication"


Preprints, versions of scientific manuscripts that precede peer review, are growing in popularity. They offer an opportunity to democratize and accelerate research, as they involve neither publication costs nor a lengthy peer review process. Preprints are often later published in peer-reviewed venues, but these publications and the original preprints are frequently not linked in any way. To this end, we developed a tool, PreprintMatch, to find matches between preprints and their corresponding published papers, if they exist. This tool outperforms existing techniques to match preprints and papers, both on matching performance and speed. PreprintMatch was applied to search for matches between preprints (from bioRxiv and medRxiv), and PubMed. The preliminary nature of preprints offers a unique perspective into scientific projects at a relatively early stage, and with better matching between preprint and paper, we explored questions related to research inequity. We found that preprints from low income countries are published as peer-reviewed papers at a lower rate than high income countries (39.6% and 61.1%, respectively), and our data is consistent with previous work that cites a lack of resources, lack of stability, and policy choices to explain this discrepancy. Preprints from low income countries were also found to be published quicker (178 vs 203 days) and with less title, abstract, and author similarity to the published version compared to high income countries. Low income countries add more authors from the preprint to the published version than high income countries (0.42 authors vs 0.32, respectively), a practice that is significantly more frequent in China compared to similar countries. Finally, we find that some publishers publish work with authors from lower income countries more frequently than others.

https://doi.org/10.1371/journal.pone.0281659


"Lack of Sustainability Plans for Preprint Services Risks Their Potential to Improve Science"


Despite successfully building a revenue model that shares the burden between Cornell University, the Simons Foundation and several members and supporters, arXiv’s “funding is still outpaced by [their] growth” – the server hosts over 2 million preprints already and is growing by 10% each year. And while arXiv has been supporting more and more scholars to share and discover preprints, the team behind it has been through significant changes in leadership and is dealing with the urgent need to modernize their 30-year-old technology. As a former Executive Director of arXiv noted, “[arXiv’s success] may not last forever”. Similarly, the recent news that Chan Zuckerberg Initiative has renewed its financial support for the leading preprint servers in biology and medicine, bioRxiv and medRxiv is welcome relief, but this support is temporary, and the team must find a way to continue in the long run.

bit.ly/3y745Ji


2.6 Billion Total Downloads: arXiv Annual Report 2022


Our critical priorities during 2022 were to secure additional funding, hire technical and program directors, and ramp up our efforts to modernize arXiv’s software by moving it to the cloud, which will provide better stability, scalability and maintainability. I’m pleased to report that we were able to make significant progress on all of these fronts. arXiv brought in more funding than expected in the form of grants, memberships, and donations, and we hired Stephanie Orphan as program director and Charles Frankston as technical director. Both bring strong and complementary expertise to the team. Moving the technical operations of arXiv, a service with a 30-year history, off of Cornell’s on-premises servers is a major, complicated task. The move to the cloud is currently in progress and on track.

bit.ly/41exRsX

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

Clarivate: "The Preprint Citation Index: Linking Preprints to the Trusted Web of Science Ecosystem"


After many months of planning, we are launching the Preprint Citation Index, a multidisciplinary collection of preprints from leading repositories that helps researchers stay current with the newest research while maintaining confidence in the resources they rely on. . . . The Preprint Citation Index currently provides nearly two million preprints from arXiv, bioRxiv, chemRxiv, medRxiv and Preprints.org. We plan to add preprints from a dozen additional repositories as well as display open peer reviews on Preprint Citation Index throughout 2023.

bit.ly/3YxPcuw


"Outside the Library: Early Career Researchers and Use of Alternative Information Sources in Pandemic Times"


Presents findings from a study into the attitudes and practices of pandemic-era early career researchers (ECRs) in regard to obtaining access to the formally published scholarly literature, which focused on alternative providers, notably ResearchGate and Sci-Hub. . . . Findings show that alternative providers, as represented by ResearchGate and Sci-Hub, have become established and appear to be gaining ground. However, there are considerable country- and discipline-associated differences.

https://doi.org/10.1002/leap.1522


"One Size Does Not Fit All: Self-Archiving Personas Based on Federally Funded Researchers at a Mid-Sized Private Institution"


Introduction: This mixed-method study analyzes the self-archiving behaviors and underlying motivations of researchers at an institution very recently recategorized by the Carnegie Classification system from "Doctoral– High Research Activity (R2)" to "Doctoral–Very High Research Activity (R1)." Methods: A quantitative analysis of data provided by CHORUS, a multi-institutional open access (OA) infrastructure project designed to minimize the administrative costs of complying with federal public access mandates, was followed by semi-structured qualitative interviews with researchers to determine the underlying motivations for self-archiving research papers resulting from federal grant support. Results: Fifty-one authors with federal research funding published 71 journal articles; 139 OA versions of these 71 articles were intentionally made available by researchers across nine types of platforms, including and in addition to those provided by publishers. Interviews with 11 investigators revealed motivators such as a dedication to public access to knowledge, learned behaviors in specific disciplines, and enlightened self-interest. Challenges included concern regarding confidentiality, confusion about intellectual property and funder requirements, administrative overhead, and integrity of the scholarly record. Discussion: Despite concerns and a lack of an OA mandate and other drivers more commonly present at larger, more research-intensive universities, several researchers interviewed actively engaged in self-archiving article versions, not always with clear motivations. These findings have implications for both scholarly communications and collection development services. Conclusion: These quantitative and qualitative data informed the creation of three distinct personas intended to help librarians at similar universities design services in a manner that aligns with investigator motivations.

https://doi.org/10.31274/jlsc.13886


"arXiv Announces New Policy on ChatGPT and Similar Tools"

In view of this, we

  1. continue to require authors to report in their work any significant use of sophisticated tools, such as instruments and software; we now include in particular text-to-text generative AI among those that should be reported consistent with subject standards for methodology.
  2. remind all colleagues that by signing their name as an author of a paper, they each individually take full responsibility for all its contents, irrespective of how the contents were generated. If generative AI language tools generate inappropriate language, plagiarized content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s).
  3. generative AI language tools should not be listed as an author; instead authors should refer to (1).

bit.ly/3wKlx5J


"Model(s) of the Future? Overlay Journals as an Overlooked and Emerging Trend in Scholarly Communication"


Overlay journals, a potentially overlooked model of scholarly communication, have seen a resurgence due to the increasing number of preprint repositories and preprints on coronavirus disease 2019 (COVID-19) related topics. Overlay journals at various stages of maturity were examined for unique characteristics, including whether the authors submitted their article to the journal, whether the peer reviews of the article were published by the overlay journal, and whether the overlay journals took advantage of opportunities for increased discovery. As librarians and researchers seek new, futuristic models for publishing, overlay journals are emerging as an important contribution to scholarly communication.

https://doi.org/10.5206/cjils-rcsib.v45i2.14730


"A Framework for Improving the Accessibility of Research Papers on arXiv.org"


The research content hosted by arXiv is not fully accessible to everyone due to disabilities and other barriers. This matters because a significant proportion of people have reading and visual disabilities, it is important to our community that arXiv is as open as possible, and if science is to advance, we need wide and diverse participation. In addition, we have mandates to become accessible, and accessible content benefits everyone. In this paper, we will describe the accessibility problems with research, review current mitigations (and explain why they aren’t sufficient), and share the results of our user research with scientists and accessibility experts. Finally, we will present arXiv’s proposed next step towards more open science: offering HTML alongside existing PDF and TeX formats. An accessible HTML version of this paper is also available at https://info.arxiv.org/about/accessibility_research_report.html

https://arxiv.org/abs/2212.07286


"Phase 1 of the NIH Preprint Pilot: Testing the Viability of Making Preprints Discoverable in PubMed Central and PubMed"


Introduction: The National Library of Medicine (NLM) launched a pilot in June 2020 to 1) explore the feasibility and utility of adding preprints to PubMed Central (PMC) and making them discoverable in PubMed and 2) to support accelerated discoverability of NIH-supported research without compromising user trust in NLM’s widely used literature services. Methods: The first phase of the Pilot focused on archiving preprints reporting NIH-supported SARS-CoV-2 virus and COVID-19 research. To launch Phase 1, NLM identified eligible preprint servers and developed processes for identifying NIH-supported preprints within scope in these servers. Processes were also developed for the ingest and conversion of preprints in PMC and to send corresponding records to PubMed. User interfaces were modified for display of preprint records. NLM collected data on the preprints ingested and discovery of preprint records in PMC and PubMed and engaged users through focus groups and a survey to obtain direct feedback on the Pilot and perceptions of preprints. Results: Between June 2020 and June 2022, NLM added more than 3,300 preprint records to PMC and PubMed, which were viewed 4 million times and 3 million times, respectively. Nearly a quarter of preprints in the Pilot were not associated with a peer-reviewed published journal article. User feedback revealed that the inclusion of preprints did not have a notable impact on trust in PMC or PubMed. Discussion: NIH-supported preprints can be identified and added to PMC and PubMed without disrupting existing operations processes. Additionally, inclusion of preprints in PMC and PubMed accelerates discovery of NIH research without reducing trust in NLM literature services. Phase 1 of the Pilot provided a useful testbed for studying NIH investigator preprint posting practices, as well as knowledge gaps among user groups, during the COVID-19 public health emergency, an unusual time with heightened interest in immediate access to research results.

https://doi.org/10.1101/2022.12.12.520156


"Ten Recommended Practices for Managing Preprints in Generalist and Institutional Repositories"


Currently, there are numerous gaps in geographic and domain coverage and some authors will choose to deposit their research outputs into another type of repository, such as an institutional or generalist repository. . . . To address these gaps, a COAR-ASAPbio Working Group on Preprint in Repositories identified ten recommended practices for managing preprints across three areas: linking, discovery, and editorial processes. While we acknowledge that many of these practices are not currently in use by institutional and generalist repositories, we hope that these recommendations will encourage repositories around the world that collect preprints to begin to apply them locally.

https://cutt.ly/R0gursT



"Comparison of Clinical Study Results Reported in medRxiv Preprints vs Peer-reviewed Journal Articles"


Most clinical studies posted as preprints on medRxiv and subsequently published in peer-reviewed journals had concordant study characteristics, results, and final interpretations. With more than three-fourths of preprints published in journals within 24 months, these results may suggest that many preprints report findings that are consistent with the final peer-reviewed publications.

https://cutt.ly/k0gyQOv


"Evaluation of Publication of COVID-19–Related Articles Initially Presented as Preprints"


In this study, we identified 3343 COVID-19–related preprints posted on medRxiv in 2020. Our March 2022 search indicated that 1712 of those preprints (51.2%) were subsequently published in the peer-reviewed literature; this number increased to 1742 (52.1%) when we repeated the search in October 2022. Not considering January 2020, in which only 1 article on COVID-19 was posted, the rate of subsequent publication in a scientific journal ranged from 43.5% (94 of 216 preprints; observed in March 2020) to 60.6% (177 of 292 preprints posted in August 2020). The Table shows the top 25 of 579 peer-reviewed journals in which these preprints were published; 827 preprints (47.5%) were subsequently published in quartile 1 journals (Figure).

bit.ly/3HprhIq
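
The percentages quoted in this excerpt can be re-derived from the counts it reports. The short Python check below is only a reader's sketch, not part of the study: the helper name `pct` is made up for illustration, and using the October 2022 count (1742) as the denominator for the quartile 1 share is an assumption inferred from the arithmetic, since the excerpt does not state it.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the excerpt above; labels are descriptive only.
reported = [
    ("published by March 2022 search", 1712, 3343, 51.2),
    ("published by October 2022 search", 1742, 3343, 52.1),
    ("lowest monthly rate (March 2020)", 94, 216, 43.5),
    ("highest monthly rate (August 2020)", 177, 292, 60.6),
    # Assumption: the denominator here is the October count of published preprints.
    ("published in quartile 1 journals", 827, 1742, 47.5),
]

for label, part, whole, stated in reported:
    computed = pct(part, whole)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {computed}% {status} the stated {stated}%")
```

Each computed value agrees with the figure given in the excerpt when rounded to one decimal place.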


"Free and Open-Source Automated Open Access Preprint Harvesting"


Universities are attempting to ensure that all of their research is publicly accessible because of funding mandates. Many universities have established campus open access (OA) repositories but are struggling with how to upload millions of manuscripts under numerous license agreements while also linking metadata to make them discoverable. To do this manually requires around 15 minutes per manuscript from an experienced librarian. The time and cost to do this campus-wide is prohibitive. To radically reduce the time and costs of this process and to harvest all past work, this article reports on the development and testing of a free and open source (FOSS) JavaScript-based application, aperta-accessum, which does the following: 1) harvests names and emails from a department’s faculty webpage; 2) identifies scholars’ Open Researcher and Contributor Identifiers (ORCID iDs); 3) obtains digital object identifiers (DOIs) of publications for each scholar; 4) checks for existing copies in an institution’s OA repository; 5) identifies the legal opportunities to provide OA versions of all of the articles not already in the OA repository; 6) sends authors emails requesting a simple upload of author manuscripts; and 7) adds link-harvested metadata from DOIs with uploaded preprints into a bepress repository; the code can be modified for additional repositories. The results of this study show that, in the administrative time needed to make a single document OA manually, aperta-accessum can process approximately five entire departments worth of peer-reviewed articles. Following best practices discussed, it is clear that this open-source OA harvester enables institutional library’s stewardship of OA knowledge on a mass scale for radically reduced costs.

https://doi.org/10.31274/jlsc.14421
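
Steps 4 through 6 of the workflow described in the abstract (check the repository, keep only what may legally be deposited, then ask authors for uploads) can be pictured with a small sketch. This is not the aperta-accessum code and does not use its language or API: the `Scholar` class, the function names, and the `may_deposit` callback are all hypothetical, and the example is written in Python purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Scholar:
    """Hypothetical record for one faculty member and their publication DOIs."""
    name: str
    email: str
    dois: list[str] = field(default_factory=list)


def missing_from_repository(scholar: Scholar, repository_dois: set[str]) -> list[str]:
    """Step 4: DOIs not yet held in the repository (DOIs compare case-insensitively)."""
    held = {doi.lower() for doi in repository_dois}
    return [doi for doi in scholar.dois if doi.lower() not in held]


def build_upload_requests(
    scholars: list[Scholar],
    repository_dois: set[str],
    may_deposit: Callable[[str], bool],
) -> dict[str, list[str]]:
    """Steps 4-6: map each author email to the DOIs they would be asked to upload."""
    requests: dict[str, list[str]] = {}
    for scholar in scholars:
        # Step 5 is delegated to may_deposit, a stand-in for a real policy lookup.
        todo = [d for d in missing_from_repository(scholar, repository_dois) if may_deposit(d)]
        if todo:
            requests[scholar.email] = todo
    return requests


# Usage with made-up data; the DOIs and addresses below are placeholders.
example = build_upload_requests(
    [Scholar("A. Author", "a@example.edu", ["10.1000/one", "10.1000/TWO", "10.1000/three"])],
    repository_dois={"10.1000/two"},
    may_deposit=lambda doi: doi != "10.1000/three",
)
print(example)
```

In a real deployment the `may_deposit` check would consult publisher policy data and the resulting mapping would drive the email step; both are left abstract here.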


"eLife Reviewed Preprints: Interview with Fiona Hutton"


How is the new publishing model similar to or different from older publishing models based on preprints combined with peer review (e.g. Copernicus, F1000)? There are three main differences. 1) Peer review and assessment at eLife continues to be organised by an editorial team made up of academic experts and led by an Editor-in-Chief, Deputy Editors, Senior Editors, and a Board of Reviewing Editors via a consultative peer-review model already known as one of the most constructive for authors in the industry. 2) The addition of an eLife assessment is a further crucial part of our model, distinctive from what others are doing—it is a key addition to our public peer reviews and it enables readers to understand the context of the work, the significance of the research and the strength of the evidence. 3) We are no longer making accept/reject decisions based on peer review—authors will choose if and when to produce a Version of Record at any point following the review process.

https://cutt.ly/gMPfonI


Paywall: "Expanding Your Institutional Repository: Librarians Working with Faculty"


Since a successful institutional repository will contain a higher percentage of the contributors’ materials, we implemented a system to upload faculty publications more effectively to our academic library’s institutional repository. . . . The success of this method is indicated by the increase in articles that have been uploaded to our institutional repository; as a result of the implementation of this program, the number of publications in our university’s institutional repository by these authors has increased 174%.

https://doi.org/10.1016/j.acalib.2022.102628
