"Evaluating the Ability of Open-Source Artificial Intelligence to Predict Accepting-Journal Impact Factor and Eigenfactor Score Using Academic Article Abstracts: Cross-sectional Machine Learning Analysis"


Objective:

We sought to evaluate the performance of open-source artificial intelligence to predict the impact factor or Eigenfactor score tertile using academic article abstracts.

Methods:

PubMed-indexed articles published between 2016 and 2021 were identified with the Medical Subject Headings (MeSH) terms "ophthalmology," "radiology," and "neurology." Journals, titles, abstracts, author lists, and MeSH terms were collected. Journal impact factor and Eigenfactor scores were sourced from the 2020 Clarivate Journal Citation Report. The journals included in the study were allocated percentile ranks based on impact factor and Eigenfactor scores, compared with other journals that released publications in the same year. All abstracts were preprocessed, which included the removal of the abstract structure, and combined with titles, authors, and MeSH terms as a single input. The input data underwent preprocessing with the inbuilt ktrain Bidirectional Encoder Representations from Transformers (BERT) preprocessing library before analysis with BERT. Before use for logistic regression and XGBoost models, the input data underwent punctuation removal, negation detection, stemming, and conversion into a term frequency-inverse document frequency array. Following this preprocessing, data were randomly split into training and testing data sets with a 3:1 train:test ratio. Models were developed to predict whether a given article would be published in a first, second, or third tertile journal (0-33rd centile, 34th-66th centile, or 67th-100th centile), as ranked either by impact factor or Eigenfactor score. BERT, XGBoost, and logistic regression models were developed on the training data set before evaluation on the hold-out test data set. The primary outcome was overall classification accuracy for the best-performing model in the prediction of accepting journal impact factor tertile.
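The tertile labelling and 3:1 train:test split described in the methods can be illustrated with a minimal Python sketch. This is not the study's code: the impact factors are made up, the helper names (percentile_rank, tertile, split_train_test) are hypothetical, and both the percentile-rank definition and the cut-offs at 34 and 67 are only one plausible reading of the abstract.

```python
import random


def percentile_rank(value, population):
    """Percentage of journals whose score is at or below `value` (assumed definition)."""
    at_or_below = sum(1 for v in population if v <= value)
    return 100.0 * at_or_below / len(population)


def tertile(percentile):
    """Map a percentile rank to tertile 1, 2, or 3 (cut-offs at 34 and 67 are assumptions)."""
    if percentile < 34:
        return 1
    if percentile < 67:
        return 2
    return 3


def split_train_test(items, train_fraction=0.75, seed=0):
    """Randomly split items so that train:test is roughly 3:1."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical impact factors for six journals (illustrative values only).
impact_factors = [0.5, 1.1, 2.1, 2.6, 4.0, 9.3]
labels = [tertile(percentile_rank(v, impact_factors)) for v in impact_factors]
# labels is [1, 1, 2, 2, 3, 3]

train, test = split_train_test(range(8))
# 6 items land in train and 2 in test
```

In the study itself these labels would be the prediction targets for the BERT, XGBoost, and logistic regression models; the sketch stops at label construction and the split.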

Results:

There were 10,813 articles from 382 unique journals. The median impact factor and Eigenfactor score were 2.117 (IQR 1.102-2.622) and 0.00247 (IQR 0.00105-0.03), respectively. The BERT model achieved the highest impact factor tertile classification accuracy of 75.0%, followed by an accuracy of 71.6% for XGBoost and 65.4% for logistic regression. Similarly, BERT achieved the highest Eigenfactor score tertile classification accuracy of 73.6%, followed by an accuracy of 71.8% for XGBoost and 65.3% for logistic regression.

Conclusions:

Open-source artificial intelligence can predict the impact factor and Eigenfactor score of accepting peer-reviewed journals. Further studies are required to examine the effect on publication success and the time-to-publication of such recommender systems.

https://doi.org/10.2196/42789

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"PreprintMatch: A Tool for Preprint to Publication Detection Shows Global Inequities in Scientific Publication"


Preprints, versions of scientific manuscripts that precede peer review, are growing in popularity. They offer an opportunity to democratize and accelerate research, as they have no publication costs or a lengthy peer review process. Preprints are often later published in peer-reviewed venues, but these publications and the original preprints are frequently not linked in any way. To this end, we developed a tool, PreprintMatch, to find matches between preprints and their corresponding published papers, if they exist. This tool outperforms existing techniques to match preprints and papers, both on matching performance and speed. PreprintMatch was applied to search for matches between preprints (from bioRxiv and medRxiv), and PubMed. The preliminary nature of preprints offers a unique perspective into scientific projects at a relatively early stage, and with better matching between preprint and paper, we explored questions related to research inequity. We found that preprints from low income countries are published as peer-reviewed papers at a lower rate than high income countries (39.6% and 61.1%, respectively), and our data is consistent with previous work that cites a lack of resources, lack of stability, and policy choices to explain this discrepancy. Preprints from low income countries were also found to be published quicker (178 vs 203 days) and with less title, abstract, and author similarity to the published version compared to high income countries. Low income countries add more authors from the preprint to the published version than high income countries (0.42 authors vs 0.32, respectively), a practice that is significantly more frequent in China compared to similar countries. Finally, we find that some publishers publish work with authors from lower income countries more frequently than others.
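To make the idea of preprint-to-paper matching concrete, here is a minimal Python sketch that pairs a preprint with the most similar published title. It is only an illustration, not PreprintMatch's actual algorithm, which the abstract does not describe: the use of difflib, the 0.8 threshold, the function names, and the example titles are all assumptions.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def title_similarity(a, b):
    """Character-level similarity between two titles, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(preprint_title, paper_titles, threshold=0.8):
    """Return the index of the most similar paper title, or None if none reaches the threshold."""
    best_index, best_score = None, threshold
    for i, title in enumerate(paper_titles):
        score = title_similarity(preprint_title, title)
        if score >= best_score:
            best_index, best_score = i, score
    return best_index


# Hypothetical titles, for illustration only.
papers = [
    "Deep learning for retinal imaging",
    "A tool for matching preprints to published papers",
]
match = best_match("A tool for matching preprints to publications", papers)
no_match = best_match("Quantum gravity", papers)
```

A real matcher would also need to compare abstracts and author lists, since the abstract reports that titles, abstracts, and authors all change between preprint and published version.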

https://doi.org/10.1371/journal.pone.0281659

"Is Writing a Book Chapter Still a Waste of Time?"


How has digital open access transformed academic communication for the better? LSE Press’s Editor in Chief, Patrick Dunleavy, explores the impact of chapters in edited books. Once the Cinderella of academic publishing, doomed to obscurity under paywall books’ formal and de facto access restrictions, chapters in books are, thanks to digital open access, once again rivalling journal articles in their visibility to academic communities, their usefulness as teaching resources, and in their ability to tackle innovative and state-of-the-art topics.

bit.ly/3KYRMq6

"Revisiting Methodology for Identifying Open Access Advantages"


This study revisited the methodology for identifying the effects of open access and revealed the causes for contradictory conclusions using four indices for journals that transitioned from subscription to open access. . . . Although the aggregated data of the eight journals indicated that open access had a positive effect, the effect varied across journals. A few journals produced different results between the two citation scores as well as between citation scores and number of citations or articles. Furthermore, a publisher’s choice of which journal to shift to open access influenced their performance after the shift.

https://doi.org/10.1007/s12109-023-09946-0

Paywall: "Open Data and the 2023 NIH Data Management and Sharing Policy"


As the largest public funder of biomedical research in the world, the National Institutes of Health’s (NIH) new Data Management and Sharing (DMS) Policy is a large step toward shifting the culture of medical research toward a broader sharing of scientific data. . . . This article will serve as a primer on open data, data sharing, the NIH’s DMS Policy and its implications, and how librarians can support researchers in this landscape.

https://doi.org/10.1080/02763869.2023.2168103

"Lack of Sustainability Plans for Preprint Services Risks Their Potential to Improve Science"


Despite successfully building a revenue model that shares the burden between Cornell University, the Simons Foundation and several members and supporters, arXiv’s “funding is still outpaced by [their] growth” – the server hosts over 2 million preprints already and is growing by 10% each year. And while arXiv has been supporting more and more scholars to share and discover preprints, the team behind it has been through significant changes in leadership and is dealing with the urgent need to modernize their 30-year-old technology. As a former Executive Director of arXiv noted, “[arXiv’s success] may not last forever”. Similarly, the recent news that Chan Zuckerberg Initiative has renewed its financial support for the leading preprint servers in biology and medicine, bioRxiv and medRxiv is welcome relief, but this support is temporary, and the team must find a way to continue in the long run.

bit.ly/3y745Ji

Bye, Bye Big Deal: "Indispensable or Unnecessary?: A Data-Driven Appraisal of Post-cancellation Access Rights"


When breaking out of ‘big deals’, some libraries and consortia have found that they can save money by negotiating away post-cancellation access (PCA) to subscribed resources after the subscription concludes. Using subscription data regarding major publisher contracts at several US research libraries, this article reviews options around PCA for libraries and presents a model for assigning a value to PCA content when negotiating a renewal contract.

https://doi.org/10.1629/uksg.601

"With New Model Language, Library E-book Bills Are Back"


The revised language, developed with support from nascent library advocacy group Library Futures, takes a "regulate" rather than "mandate" approach. In other words, unlike Maryland’s law, which would have required publishers to offer license agreements to libraries "on reasonable terms" for digital books that were available to consumers, the new legislative language instead focuses on regulating the terms of agreements. Key to the revised bill’s effectiveness is language that would render unenforceable any license term that "precludes, limits, or restricts" libraries from performing their traditional, core mission.

bit.ly/3y42wfh

"The Importance of Copyright and Shared Norms for Credit in Open Educational Resources"


Open Educational Resources (OER) are reducing barriers to education while allowing creators the opportunity to share their work with the world and continue owning copyright of their work. To support new authors and adaptors in the OER space, we provide an overview of common considerations that creators and adaptors of OER should make with respect to issues related to copyright in the context of OER. Further, and importantly, a challenge in the OER space is ensuring that original creators receive appropriate credit for their work, while also respecting the credit of those who have adapted work. Thus, in addition to providing important considerations when it comes to the creation of open access works, we propose shared norms for ensuring appropriate attribution and credit for creators and adaptors of OER.

https://doi.org/10.3389/feduc.2022.1069388

"China and Open Access"


In December 2022, the International Association of STM Publishers and the China Association for Science and Technology (CAST) released a report: Open Access Publishing in China. The report is openly available in both English and Chinese. This interview with Mark Robertson, consultant to the STM Association on the project, highlights the findings of the report and their implications for the scholarly publishing industry as well as providing background on the STM/CAST collaboration.

bit.ly/3kHUuW5

Paywall: "An Investigation of Gold Open Access Publications of STEM Faculty at a Public University in the United States"


This study investigated Gold Open Access journal publication by science and engineering faculty at the authors’ university from 2013 to 2022. Specifically, did Gold Open Access (OA) by these faculty increase, and did the publication rate vary between disciplines? The authors found that Gold OA publication increased by 176% over the past 10 years, and that an important factor was the Libraries’ creation of an Open Access Publishing Fund in 2017.

https://doi.org/10.1080/0194262X.2023.2175103

"The Future of the Monograph in the Arts, Humanities and Social Sciences: Publisher Perspectives on a Transitioning Format"


A web-based survey of academic publishers was undertaken in 2021 by a team at Oxford International Centre for Publishing into the state of monograph publication in the arts, humanities, and social sciences. 25 publishing organisations responded, including many of the larger presses, representing approximately 75% of monograph output. Responses to the survey showed that the Covid 19 pandemic has accelerated the existing trend from print to digital dissemination and that Open Access (OA) titles receive substantially greater levels of usage than those published traditionally. Responses also showed that for most publishers OA publication stands at under 25% of output and that fewer than 10% of authors enquire about OA publication options. Continuing problem areas highlighted by respondents were the clearing of rights for OA publication and the standardisation of title and usage metadata. All responding organisations confirmed that they expect to be publishing monographs in ten years’ time, but that they anticipate the format and/or the model will be different, with open access expected to play a key part in the future, perhaps in the context of a mixed economy of OA and ‘toll access’ publication.

https://doi.org/10.1007/s12109-023-09937-1

"Changes in the Absolute Numbers and Proportions of Open Access Articles from 2000 to 2021 Based on the Web of Science Core Collection: A Bibliometric Study"


Purpose:

The ultimate goal of current open access (OA) initiatives is for library services to use OA resources. This study aimed to assess the infrastructure for OA scholarly information services by tabulating the number and proportion of OA articles in a literature database.

Method:

We measured the absolute numbers and proportions of OA articles at different time points across various disciplines based on the Web of Science (WoS) database.

Results:

The number (proportion) of available OA articles between 2000 and 2021 in the WoS database was 12 million (32.4%). The number (proportion) of indexed OA articles in 1 year was 0.15 million (14.6%) in 2000 and 1.5 million (48.0%) in 2021. The proportion of OA by subject categories in the cumulative data was the highest in the multidisciplinary category (2000–2021, 79%; 2021, 89%), high in natural sciences (2000–2021, 21%–46%; 2021, 41%–62%) and health and medicine (2000–2021, 37%–40%; 2021, 52%–60%), and low in social sciences and others (2000–2021, 23%–32%; 2021, 36%–44%), engineering (2000–2021, 17%–33%; 2021, 31%–39%) and humanities and arts (2000–2021, 11%–22%; 2021, 28%–38%).

Conclusion:

Our study confirmed that increasingly many OA research papers have been published in the last 20 years, and the recent data show considerable promise for better services in the future. The proportions of OA articles differed among scholarly disciplines, and designing library services necessitates several considerations with regard to the customers’ demands, available OA resources, and strategic approaches to encourage the use of scholarly OA articles.

https://doi.org/10.6087/kcse.296

"Librarians and Academic Libraries’ Role in Promoting Open Access: What Needs to Change?"


Profound changes due to Open-Access (OA) publications lead to organizational changes in universities and libraries. This study examined Israeli librarians’ perceptions regarding their role and the academic library’s role in promoting OA-publication, including the barriers, challenges, needs and requirements necessary to promote OA publishing. Lack of a budget for OA-agreements and cooperation with university management, and researchers’ unawareness of OA were among the most prominent barriers. Librarians see great importance in their role of advising researchers regarding OA. However, they insisted on a regulated OA-policy at the national and institutional levels, which would strengthen their status as change-leaders of the OA-movement.

https://doi.org/10.31235/osf.io/shqnv

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

"Benefits of Open Access (OA) to Researchers from Lower-Income Countries: Tracing Evidence through an Analysis of Reference Patterns"


Making scientific literature freely available to everyone is a main objective of the open access (OA) movement. This may be of particular importance to researchers in lower-income countries, where access to literature is often hindered by high subscription costs. This study addresses this issue by analyzing reference lists of the world’s output of scientific publications over time. The core issues addressed include whether researchers from lower-income countries refer to fewer previous publications when they publish and how this pattern develops over time. Moreover, whether researchers from lower-income countries rely more on literature that is openly available through different OA routes than other researchers is explored. The study shows that the proportion of OA references increases over time for all publications and country groups. However, the main finding is that publications from lower-income countries have a higher growth rate of OA references. This suggests that an increase in OA publishing has been particularly beneficial to researchers in lower-income countries.

https://doi.org/10.31235/osf.io/ecgzh

"How and Why Do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles"


Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers’ motivations for referencing data. Like references to academic literature, data references perform various rhetorical functions, such as paying homage, signaling disagreement, or drawing comparisons. This paper studies how and why researchers reference social science data in their academic writing. We develop a typology to model relationships between the entities that anchor data references, along with their features (access, actions, locations, styles, types) and functions (critique, describe, illustrate, interact, legitimize). We illustrate the use of the typology by coding multidisciplinary research articles (n=30) referencing social science data archived at the Inter-university Consortium for Political and Social Research (ICPSR). We show how our typology captures researchers’ interactions with data and purposes for referencing data. Our typology provides a systematic way to document and analyze researchers’ narratives about data use, extending our ability to give credit to data that support research.

https://arxiv.org/abs/2302.08477

"Library Futures Releases Policy Paper: Digital Ownership for Libraries and the Public"


In response, Library Futures recommends policymakers adopt an approach of digital ownership that extends the current paradigm for print works and allow libraries to both maintain the benefits of print collections and innovate even further toward providing new methods of access, preservation, and education by creating new lending models, equitizing access for underserved communities, and contributing to a more democratic balance. To that end, we have outlined some approaches to solving this issue through structural, community-based, and technical means:

  • Legal reform: This can include judicial remedies through the courts, legislative action on the part of Congress, or regulatory intervention by an authority such as the Federal Trade Commission.
  • Collective action: Community intervention can be a powerful way to act concertedly to stand against entities that are prohibiting libraries from exercising their rights, such as boycotts and grassroots action, state legislative initiatives, and the collective use of incentives and accountability measures for publishers.
  • Library-owned infrastructure: The library community can build its own infrastructure to ensure that it is oriented towards the needs of their users and provides libraries with the choice to own their digital content. This is not without its challenges (practical and resource-wise), but sustainable infrastructure can put control of digital content back into the hands of libraries and users.

Policy Paper

https://www.libraryfutures.net/post/digital-ownership-for-libraries-and-the-public

"Mastodon over Mammon — Towards Publicly Owned Scholarly Knowledge"


Twitter is in turmoil and the scholarly community on the platform is once again starting to migrate. As with the early internet, scholarly organizations are at the forefront of developing and implementing a decentralized alternative to Twitter, Mastodon. Both historically and conceptually, this is not a new situation for the scholarly community. Historically, scholars were forced to leave social media platform FriendFeed after it was bought by Facebook in 2006. Conceptually, the problems associated with public scholarly discourse subjected to the whims of corporate owners are not unlike those of scholarly journals owned by monopolistic corporations: in both cases the perils associated with a public good in private hands are palpable. For both short form (Twitter/Mastodon) and longer form (journals) scholarly discourse, decentralized solutions exist, some of which are already enjoying some institutional support. Here we argue that scholarly organizations, in particular learned societies, are now facing a golden opportunity to rethink their hesitations towards such alternatives and support the migration of the scholarly community from Twitter to Mastodon by hosting Mastodon instances. Demonstrating that the scholarly community is capable of creating a truly public square for scholarly discourse, impervious to private takeover, might renew confidence and inspire the community to focus on analogous solutions for the remaining scholarly record, encompassing text, data and code, to safeguard all publicly owned scholarly knowledge.

https://doi.org/10.5281/zenodo.7643817

Penn State: "University Libraries Expands Open Access Support via 3 New BTAA [Big 10] Agreements"


The agreements with Wiley, Institute of Physics (IOP) and Microbiology Society cover OA publishing charges for Penn State corresponding authors publishing in these publishers’ journals. Those qualified articles will be immediately open access on the publisher’s platform. These publishers will offer a choice of open access licenses to Penn State authors publishing in their journals. Authors retain copyright in their articles.

The agreements run for three years from Jan. 1, 2023, to Dec. 31, 2025. In general, articles will need to be accepted during the agreements’ timeframe. The agreements also cover subscriptions and read access to Wiley, Institute of Physics (IOP) and Microbiology Society journals. Unlimited open access publishing is included with no additional cost to individual Penn State authors.

bit.ly/3Sjx9Xa

"Can an Artificial Intelligence Chatbot Be the Author of a Scholarly Article?"


At the end of 2022, the appearance of ChatGPT, an artificial intelligence (AI) chatbot with amazing writing ability, caused a great sensation in academia. The chatbot turned out to be very capable, but also capable of deception, and the news broke that several researchers had listed the chatbot (including its earlier version) as co-authors of their academic papers. In response, Nature and Science expressed their position that this chatbot cannot be listed as an author in the papers they publish. Since an AI chatbot is not a human being, in the current legal system, the text automatically generated by an AI chatbot cannot be a copyrighted work; thus, an AI chatbot cannot be an author of a copyrighted work. Current AI chatbots such as ChatGPT are much more advanced than search engines in that they produce original text, but they still remain at the level of a search engine in that they cannot take responsibility for their writing. For this reason, they also cannot be authors from the perspective of research ethics.

https://doi.org/10.6087/kcse.292

"How Open Access Diamond Journals Comply with Industry Standards Exemplified by Plan S Technical Requirements"


Purpose:

This study investigated how well current open access (OA) diamond journals in the Directory of Open Access Journals (DOAJ) and a survey conform to Plan S requirements, including licenses, peer review, author copyright, unique article identifiers, digital archiving, and machine-readable licenses.

Method:

Data obtained from DOAJ journals and surveyed journals from mid-June to mid-July 2020 were analyzed for a variety of Plan S requirements. The results were presented using descriptive statistics.

Results:

Out of 1,465 journals that answered, 1,137 (77.0%) reported compliance with the Committee on Publication Ethics (COPE) principles. The peer review types used by OA diamond journals were double-blind (6,339), blind (2,070), peer review (not otherwise specified, 1,879), open peer review (42), and editorial review (118) out of 10,449 DOAJ journals. An author copyright retention policy was adopted by 5,090 out of 10,448 OA diamond journals (48.7%) in DOAJ. Of the unique article identifiers, 5,702 (54.6%) were digital object identifiers, 58 (0.6%) were handles, and 14 (0.1%) were uniform resource names, while 4,675 (44.7%) used none. Out of 1,619 surveyed journals, the archiving solutions were national libraries (n=170, 10.5%), Portico (n=67, 4.1%), PubMed Central (n=15, 0.9%), PKP PN (n=91, 5.6%), LOCKSS (n=136, 8.4%), CLOCKSS (n=87, 5.4%), the National Computing Center for Higher Education (n=6, 0.3%), others (n=69, 4.3%), no policy (n=855, 52.8%), and no reply (n=123, 7.6%). Article-level metadata deposition was done by 8,145 out of 10,449 OA diamond journals (78.0%) in DOAJ.

Conclusion:

OA diamond journals’ compliance with industry standards exemplified by the Plan S technical requirements was insufficient, except for the peer review type.

https://doi.org/10.6087/kcse.295

"Impact and Perceived Value of the Revolutionary Advent of Artificial Intelligence in Research and Publishing among Researchers: A Survey-Based Descriptive Study"


Purpose:

This study was conducted to understand the perceptions and awareness of artificial intelligence (AI) in the academic publishing landscape.

Method:

We conducted a global survey entitled "Role and impact of AI on the future of academic publishing" to understand the impact of the AI wave in the scholarly publishing domain. This English-language survey was open to all researchers, authors, editors, publishers, and other stakeholders in the scholarly community. Conducted between August and October 2021, the survey received responses from around 212 universities across 54 countries.

Results:

Out of 365 respondents, about 93% belonged to the age groups of 18–34 and 35–54 years. While 50% of the respondents selected plagiarism detection as the most widely known AI-based application, image recognition (42%), data analytics (40%), and language enhancement (39%) were some other known applications of AI. The respondents also expressed the opinion that the academic publishing landscape will significantly benefit from AI. However, the major challenges restraining the large-scale adoption of AI, as expressed by 93% of the respondents, were limited knowledge and expertise, as well as difficulties in integrating AI-based solutions into existing IT infrastructure.

Conclusion:

The survey responses reflected the necessity of AI in research and publishing. This study suggests possible ways to support a smooth transition. This can be best achieved by educating and creating awareness to ease possible fears and hesitation, and to actualize the promising benefits of AI.

https://doi.org/10.6087/kcse.294
