"Unreviewed Science in the News: The Evolution of Preprint Media Coverage from 2014-2021"


It has been argued that preprint coverage during the COVID-19 pandemic constituted a paradigm shift in journalism norms and practices. This study examines whether, in what ways, and to what extent this is the case using a sample of 11,538 preprints posted on four preprint servers—bioRxiv, medRxiv, arXiv, and SSRN—that received coverage in 94 English-language media outlets between 2014-2021. We compared mentions of these preprints with mentions of a comparison sample of 397,446 peer reviewed research articles indexed in the Web of Science to identify changes in the share of media coverage that mentioned preprints before and during the pandemic. We found that preprint media coverage increased at a slow but steady rate pre-pandemic, then spiked dramatically. This increase applied only to COVID-19-related preprints, with minimal or no change in coverage of preprints on other topics. In addition, the rise in preprint coverage was most pronounced among health and medicine-focused media outlets, which barely covered preprints before the pandemic but mentioned more COVID-19 preprints than outlets focused on any other topic. These results suggest that the growth in coverage of preprints seen during the pandemic period may imply a shift in journalistic norms, including a changing outlook on reporting preliminary, unvetted research.

https://doi.org/10.1101/2023.07.10.548392

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: The Strategic Marketing of Science, Technology, and Medical Journals: A Business History of a Dynamic Marketplace, 2000–2020


This book analyzes the various economic and marketing strategies utilized by the five major STM commercial scholarly journal publishers since 2000. This period has witnessed tremendous economic, marketing, and technological growth including the migration from a print only to a hybrid publishing format. With this growth, the industry has also seen the rise of open access publishing, copyright challenges by websites such as Sci-Hub, the emergence of sharing platforms such as ResearchGate and Academia.edu, as well as the impact of Plan S on publishers, universities, and authors. . . . Scrutinizing the different managerial, marketing, technology, and economic-financial strategies crafted by scholarly journal publishers between 2000-2020, this book offers a comprehensive assessment of the industry’s attempts to identify, understand, cope with, and minimize or defeat the herculean threats to its business model.

https://tinyurl.com/5n6rd8xy


"The Status of Open Access Repositories in the Field of Technology: Insights from OpenDOAR"


The study found that 125 nations contributed a total of 4,045 repositories in the field of research, with the USA leading the list with the most repositories. Maximum repositories were operated by institutions having multidisciplinary approaches. The DSpace and Eprints were the preferred software types for repositories. The preferred upload content by contributors was "research articles" and "electronic thesis and dissertations."

https://doi.org/10.1108/IDD-11-2022-0119


Japanese Preprint Server: "Guest Post — A Year of Jxiv — Warming the Preprints Stone"


However, this anomaly was corrected with the launch in March 2022 of Jxiv — the first fully-fledged Japanese-born preprint server — by the Japan Science and Technology Agency (JST), one of the largest public funders of research in the country that sits under the administrative and policy behemoth, the Ministry of Education, Culture, Sports, Science and Technology (MEXT). . . . JST also manages J-STAGE, the national online platform for Japanese journals launched in 1999, which hosts more than 3,500 journals containing almost 5.38 million articles, as well as J-STAGE Data launched in 2020.

https://tinyurl.com/388vd3y3


"A Scoping Review on the Use and Acceptability of Preprints"


Preprints are open and accessible scientific manuscript or report that has not been submitted to a peer reviewed journal. The value and importance of preprints has grown since its contribution during the public health emergency of the COVID-19 pandemic. Funders and publishers are establishing their position on the use of preprints, in grant applications and publishing models. However, the evidence supporting the use and acceptability of preprints varies across funders, publishers, and researchers. The purpose of this scoping review was to explore the current evidence on the use and acceptability of preprints by publishers, funders, and the research community throughout the research lifecycle.

https://doi.org/10.31235/osf.io/nug4p


"Open Access at a Crossroads: Library Publishing and Bibliodiversity"


The open access movement has gained momentum since the Budapest Open Access Initiative (BOAI) first launched twenty years ago. Notably, there has been a drastic increase in the number of open access articles. Concerns have been raised about equality and diversity issues, however, for researchers without an affiliation (e.g. independent, unemployed and retired researchers) and researchers on the "scientific periphery" who are excluded from the gold open access model. This article argues that the gold open access model is destructive to the knowledge production ecosystem by addressing the importance of bibliodiversity and the ways in which library publishing can contribute to sustainable and equitable knowledge production.

https://doi.org/10.1629/uksg.613


Academic Publishing and Open Access. What Does Economics Teach Us?


While the gold regime seems the most natural way to achieve open access, a generalized switch to open access may also have undesired consequences: projections indeed suggest that a massive move towards the gold regime would generate an explosion in the amount of APC unless there are controls to limit market power. Beside the sharp increase in APC, the shift to gold open access may create conflicts of interest for publishers given that their income comes from authors and may alter the quality of publications. The green regime, by introducing competition between the journal’s version of an article and a free public version, seems an efficient way to reduce market power while expanding access.

https://shs.hal.science/halshs-04080573


"To Preprint or Not to Preprint: Experience and Attitudes of Researchers Worldwide"


The pandemic has underlined the significance of open science and spurred further growth of preprinting. Nevertheless, preprinting has been adopted at varying rates across different countries/regions. To investigate researchers’ experience with and attitudes toward preprinting, we conducted a survey of authors of research papers published in 2021 or 2022. We find that respondents in the US and Europe had a higher level of familiarity with and adoption of preprinting than those in China and the rest of the world. Respondents in China were most worried about the lack of recognition for preprinting and the risk of getting scooped. US respondents were very concerned about premature media coverage of preprints, the reliability and credibility of preprints, and public sharing of information before peer review. Respondents identified integration of preprinting in journal submission processes as the most important way to promote preprinting.

https://doi.org/10.55835/6442f782b2b5580ba561406b


"Do Open Access Mandates Work? A Systematized Review of the Literature on Open Access Publishing Rates"


To encourage the sharing of research, various entities—including public and private funders, universities, and academic journals—have enacted open access (OA) mandates or data sharing policies. It is unclear, however, whether these OA mandates and policies increase the rate of OA publishing and data sharing within the research communities impacted by them. A team of librarians conducted a systematized review of the literature to answer this question. A comprehensive search of several scholarly databases and grey literature sources resulted in 4,689 unique citations. However, only five articles met the inclusion criteria and were deemed as having an acceptable risk of bias. This sample showed that although the majority of the mandates described in the literature were correlated with a subsequent increase in OA publishing or data sharing, the presence of various confounders and the differing methods of collecting and analyzing the data used by the studies’ authors made it impossible to establish a causative relationship.

https://doi.org/10.31274/jlsc.15444


"The Transformation of the Green Road to Open Access"


(1) Background: The 2002 Budapest Open Access Initiative recommended on self-archiving of scientific articles in open repositories as the "green road" to open access. Twenty years later, only one part of the researchers deposits their publications in open repositories; moreover, one part of the repositories’ content is not based on self-archived deposits but on mediated nonfaculty contributions. The purpose of the paper is to provide more empirical evidence on this situation and to assess the impact on the future of the green road. (2) Methods: We analyzed the contributions on the French national HAL repository from more than 1,000 laboratories affiliated to the ten most important French research universities, with a focus on 2020, representing 14,023 contributor accounts and 166,939 deposits. (3) Results: We identified seven different types of contributor accounts, including deposits from nonfaculty staff and import flows from other platforms. Mediated nonfaculty contribution accounts for at least 48% of the deposits. We also identified difference between institutions and disciplines. (4) Conclusions: Our empirical results reveal a transformation of open repositories from self-archiving and direct scientific communication towards research information management. Repositories like HAL are somewhere in the middle of the process. The paper describes data quality as the main issue and major challenge of this transformation.

https://doi.org/10.20944/preprints202302.0268.v1


Only 10% Fully Understand "Preprint": "Framing COVID-19 Preprint Research as Uncertain: A Mixed-Method Study of Public Reactions"


Unlike hedging, preprint disclosure had no impact on audience message evaluations, nor vaccine attitudes and intentions. In one sense, this is a positive finding in that transparency about preprint status is unlikely to produce negative public reactions. Yet a likely explanation for the null effects is that most participants lacked the knowledge to differentiate between preprints and peer-reviewed research and did not understand this disclosure as an indicator of preliminary science. The qualitative data supported this explanation. When asked how they interpret the term "preprint" when they see it in a scientific news article, participants’ responses indicated that most had a limited understanding of the concept, even among those who received the preprint disclosure message with a brief explanation of the term. In total, only 10% of participants provided definitions of preprint that aligned with those accepted by the scholarly community. Only 15% described the term as an indicator of uncertain or preliminary evidence.

https://doi.org/10.1080/10410236.2023.2164954


"PreprintMatch: A Tool for Preprint to Publication Detection Shows Global Inequities in Scientific Publication"


Preprints, versions of scientific manuscripts that precede peer review, are growing in popularity. They offer an opportunity to democratize and accelerate research, as they have no publication costs or a lengthy peer review process. Preprints are often later published in peer-reviewed venues, but these publications and the original preprints are frequently not linked in any way. To this end, we developed a tool, PreprintMatch, to find matches between preprints and their corresponding published papers, if they exist. This tool outperforms existing techniques to match preprints and papers, both on matching performance and speed. PreprintMatch was applied to search for matches between preprints (from bioRxiv and medRxiv), and PubMed. The preliminary nature of preprints offers a unique perspective into scientific projects at a relatively early stage, and with better matching between preprint and paper, we explored questions related to research inequity. We found that preprints from low income countries are published as peer-reviewed papers at a lower rate than high income countries (39.6% and 61.1%, respectively), and our data is consistent with previous work that cite a lack of resources, lack of stability, and policy choices to explain this discrepancy. Preprints from low income countries were also found to be published quicker (178 vs 203 days) and with less title, abstract, and author similarity to the published version compared to high income countries. Low income countries add more authors from the preprint to the published version than high income countries (0.42 authors vs 0.32, respectively), a practice that is significantly more frequent in China compared to similar countries. Finally, we find that some publishers publish work with authors from lower income countries more frequently than others.

https://doi.org/10.1371/journal.pone.0281659


"Lack of Sustainability Plans for Preprint Services Risks Their Potential to Improve Science"


Despite successfully building a revenue model that shares the burden between Cornell University, the Simons Foundation and several members and supporters, arXiv’s “funding is still outpaced by [their] growth” – the server hosts over 2 million preprints already and is growing by 10% each year. And while arXiv has been supporting more and more scholars to share and discover preprints, the team behind it has been through significant changes in leadership and is dealing with the urgent need to modernize their 30-year-old technology. As a former Executive Director of arXiv noted, “[arXiv’s success] may not last forever”. Similarly, the recent news that Chan Zuckerberg Initiative has renewed its financial support for the leading preprint servers in biology and medicine, bioRxiv and medRxiv is welcome relief, but this support is temporary, and the team must find a way to continue in the long run.

bit.ly/3y745Ji


2.6 Billion Total Downloads: arXiv Annual Report 2022


Our critical priorities during 2022 were to secure additional funding, hire technical and program directors, and ramp up our efforts to modernize arXiv’s software by moving it to the cloud, which will provide better stability, scalability and maintainability. I’m pleased to report that we were able to make significant progress on all of these fronts. arXiv brought in more funding than expected in the form of grants, memberships, and donations, and we hired Stephanie Orphan as program director and Charles Frankston as technical director. Both bring strong and complementary expertise to the team. Moving the technical operations of arXiv—a service with a 30 year history—off of Cornell’s on-premises servers is a major, complicated task. The move to the cloud is currently in progress and on track

bit.ly/41exRsX

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

Clarivate: "The Preprint Citation Index: Linking Preprints to the Trusted Web of Science Ecosystem"


After many months of planning, we are launching the Preprint Citation Index, a multidisciplinary collection of preprints from leading repositories that helps researchers stay current with the newest research while maintaining confidence in the resources they rely on. . . . The Preprint Citation Index currently provides nearly two million preprints from arXiv, bioRxiv, chemRxiv, medRxiv and Preprints.org. We plan to add preprints from a dozen additional repositories as well as display open peer reviews on Preprint Citation Index throughout 2023.

bit.ly/3YxPcuw


"Outside the Library: Early Career Researchers and Use of Alternative Information Sources in Pandemic Times"


Presents findings from a study into the attitudes and practices of pandemic-era early career researchers (ECRs) in regard to obtaining access to the formally published scholarly literature, which focused on alternative providers, notably ResearchGate and Sci-Hub. . . . Findings show that alternative providers, as represented by ResearchGate and Sci-Hub, have become established and appear to be gaining ground. However, there are considerable country- and discipline-associated differences.

https://doi.org/10.1002/leap.1522


"One Size Does Not Fit All: Self-Archiving Personas Based on Federally Funded Researchers at a Mid-Sized Private Institution"


Introduction: This mixed-method study analyzes the self-archiving behaviors and underlying motivations of researchers at an institution very recently recategorized by the Carnegie Classification system from "Doctoral– High Research Activity (R2)" to "Doctoral–Very High Research Activity (R1)." Methods: A quantitative analysis of data provided by CHORUS, a multi-institutional open access (OA) infrastructure project designed to minimize the administrative costs of complying with federal public access mandates, was followed by semi-structured qualitative interviews with researchers to determine the underlying motivations for self-archiving research papers resulting from federal grant support. Results: Fifty-one authors with federal research funding published 71 journal articles; 139 OA versions of these 71 articles were intentionally made available by researchers across nine types of platforms, including and in addition to those provided by publishers. Interviews with 11 investigators revealed motivators such as a dedication to public access to knowledge, learned behaviors in specific disciplines, and enlightened self-interest. Challenges included concern regarding confidentiality, confusion about intellectual property and funder requirements, administrative overhead, and integrity of the scholarly record. Discussion: Despite concerns and a lack of an OA mandate and other drivers more commonly present at larger, more research-intensive universities, several researchers interviewed actively engaged in self-archiving article versions, not always with clear motivations. These findings have implications for both scholarly communications and collection development services. Conclusion: These quantitative and qualitative data informed the creation of three distinct personas intended to help librarians at similar universities design services in a manner that aligns with investigator motivations.

https://doi.org/10.31274/jlsc.13886


"arXiv Announces New Policy on ChatGPT and Similar Tools"

In view of this, we

  1. continue to require authors to report in their work any significant use of sophisticated tools, such as instruments and software; we now include in particular text-to-text generative AI among those that should be reported consistent with subject standards for methodology.
  2. remind all colleagues that by signing their name as an author of a paper, they each individually take full responsibility for all its contents, irrespective of how the contents were generated. If generative AI language tools generate inappropriate language, plagiarized content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s).
  3. generative AI language tools should not be listed as an author; instead authors should refer to (1).

bit.ly/3wKlx5J


"Model(s) of the Future? Overlay Journals as an Overlooked and Emerging Trend in Scholarly Communication"


Overlay journals, a potentially overlooked model of scholarly communication, have seen a resurgence due to the increasing number of preprint repositories and preprints on coronavirus disease 2019 (COVID-19) related topics. Overlay journals at various stages of maturity were examined for unique characteristics, including whether the authors submitted their article to the journal, whether the peer reviews of the article were published by the overlay journal, and whether the overlay journals took advantage of opportunities for increased discovery. As librarians and researchers seek new, futuristic models for publishing, overlay journals are emerging as an important contribution to scholarly communication.

https://doi.org/10.5206/cjils-rcsib.v45i2.14730


"A Framework for Improving the Accessibility of Research Papers on arXiv.org"


The research content hosted by arXiv is not fully accessible to everyone due to disabilities and other barriers. This matters because a significant proportion of people have reading and visual disabilities, it is important to our community that arXiv is as open as possible, and if science is to advance, we need wide and diverse participation. In addition, we have mandates to become accessible, and accessible content benefits everyone. In this paper, we will describe the accessibility problems with research, review current mitigations (and explain why they aren’t sufficient), and share the results of our user research with scientists and accessibility experts. Finally, we will present arXiv’s proposed next step towards more open science: offering HTML alongside existing PDF and TeX formats. An accessible HTML version of this paper is also available at https://info.arxiv.org/about/accessibility_research_report.html

https://arxiv.org/abs/2212.07286


"Phase 1 of the NIH Preprint Pilot: Testing the Viability of Making Preprints Discoverable in PubMed Central and PubMed"


Introduction: The National Library of Medicine (NLM) launched a pilot in June 2020 to 1) explore the feasibility and utility of adding preprints to PubMed Central (PMC) and making them discoverable in PubMed and 2) to support accelerated discoverability of NIH-supported research without compromising user trust in NLM’s widely used literature services. Methods: The first phase of the Pilot focused on archiving preprints reporting NIH-supported SARS-CoV-2 virus and COVID-19 research. To launch Phase 1, NLM identified eligible preprint servers and developed processes for identifying NIH-supported preprints within scope in these servers. Processes were also developed for the ingest and conversion of preprints in PMC and to send corresponding records to PubMed. User interfaces were modified for display of preprint records. NLM collected data on the preprints ingested and discovery of preprint records in PMC and PubMed and engaged users through focus groups and a survey to obtain direct feedback on the Pilot and perceptions of preprints. Results: Between June 2020 and June 2022, NLM added more than 3,300 preprint records to PMC and PubMed, which were viewed 4 million times and 3 million times, respectively. Nearly a quarter of preprints in the Pilot were not associated with a peer-reviewed published journal article. User feedback revealed that the inclusion of preprints did not have a notable impact on trust in PMC or PubMed. Discussion: NIH-supported preprints can be identified and added to PMC and PubMed without disrupting existing operations processes. Additionally, inclusion of preprints in PMC and PubMed accelerates discovery of NIH research without reducing trust in NLM literature services. Phase 1 of the Pilot provided a useful testbed for studying NIH investigator preprint posting practices, as well as knowledge gaps among user groups, during the COVID-19 public health emergency, an unusual time with heightened interest in immediate access to research results.

https://doi.org/10.1101/2022.12.12.520156


"Ten Recommended Practices for Managing Preprints in Generalist and Institutional Repositories"


Currently, there are numerous gaps in geographic and domain coverage and some authors will choose to deposit their research outputs into another type of repository, such as an institutional or generalist repository. . . . To address these gaps, a COAR-ASAPbio Working Group on Preprint in Repositories identified ten recommended practices for managing preprints across three areas: linking, discovery, and editorial processes. While we acknowledge that many of these practices are not currently in use by institutional and generalist repositories, we hope that these recommendations will encourage repositories around the world that collect preprints to begin to apply them locally.

https://cutt.ly/R0gursT



"Comparison of Clinical Study Results Reported in medRxiv Preprints vs Peer-reviewed Journal Articles"


Most clinical studies posted as preprints on medRxiv and subsequently published in peer-reviewed journals had concordant study characteristics, results, and final interpretations. With more than three-fourths of preprints published in journals within 24 months, these results may suggest that many preprints report findings that are consistent with the final peer-reviewed publications.

https://cutt.ly/k0gyQOv


"Evaluation of Publication of COVID-19–Related Articles Initially Presented as Preprints"


In this study, we identified 3343 COVID-19–related preprints posted on medRxiv in 2020. Our March 2022 search indicated that 1712 of those preprints (51.2%) were subsequently published in the peer-reviewed literature; this number increased to 1742 (52.1%) when we repeated the search in October 2022. Not considering January 2020, in which only 1 article on COVID-19 was posted, the rate of subsequent publication in a scientific journal ranged from 43.5% (94 of 216 preprints; observed in March 2020) to 60.6% (177 of 292 preprints posted in August 2020). The Table shows the top 25 of 579 peer-reviewed journals in which these preprints were published; 827 preprints (47.5%) were subsequently published in quartile 1 journals (Figure).

bit.ly/3HprhIq
