"Can ChatGPT Be Used to Predict Citation Counts, Readership, and Social Media Interaction? An Exploration among 2222 Scientific Abstracts"


This study explores the potential of ChatGPT, a large language model, in scientometrics by assessing its ability to predict citation counts, Mendeley readers, and social media engagement. In this study, 2222 abstracts from PLOS ONE articles published during the initial months of 2022 were analyzed using ChatGPT-4, which used a set of 60 criteria to assess each abstract. Using a principal component analysis, three components were identified: Quality and Reliability, Accessibility and Understandability, and Novelty and Engagement. The Accessibility and Understandability of the abstracts correlated with higher Mendeley readership, while Novelty and Engagement and Accessibility and Understandability were linked to citation counts (Dimensions, Scopus, Google Scholar) and social media attention. Quality and Reliability showed minimal correlation with citation and altmetrics outcomes. Finally, it was found that the predictive correlations of ChatGPT-based assessments surpassed traditional readability metrics. The findings highlight the potential of large language models in scientometrics and possibly pave the way for AI-assisted peer review.

https://doi.org/10.1007/s11192-024-04939-y

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Google Scholar is Manipulatable"


Citations are widely considered in scientists’ evaluation. As such, scientists may be incentivized to inflate their citation counts. While previous literature has examined self-citations and citation cartels, it remains unclear whether scientists can purchase citations. Here, we compile a dataset of ~1.6 million profiles on Google Scholar to examine instances of citation fraud on the platform. We survey faculty at highly-ranked universities, and confirm that Google Scholar is widely used when evaluating scientists. Intrigued by a citation-boosting service that we unravelled during our investigation, we contacted the service while undercover as a fictional author, and managed to purchase 50 citations. These findings provide conclusive evidence that citations can be bought in bulk, and highlight the need to look beyond citation counts.

https://arxiv.org/abs/2402.04607

"Is Gold Open Access Helpful for Academic Purification? A Causal Inference Analysis Based on Retracted Articles in Biochemistry"


The results showed that compared to non-OA, Gold OA is advantageous in reducing the retraction time of flawed articles, but does not demonstrate a significant advantage in reducing citations after retraction. This indicates that Gold OA may help expedite the detection and retraction of flawed articles, ultimately promoting the practice of responsible research.

https://doi.org/10.1016/j.ipm.2023.103640

"Open-Access Papers Draw More Citations from a Broader Readership"


Now, after years of little conclusive evidence to support these assertions, researchers report that open-access papers have a greater reach than paywalled ones in two key ways: They attract more total citations, and those citations come from scholars in a wider range of locations, institutions, and fields of research. The study also reports a "citation diversity advantage" for a controversial type of open-access article, those deposited in "green" public repositories.

http://tinyurl.com/27p6pfje

"Promotion of Scientific Publications on ArXiv and X Is on the Rise and Impacts Citations"


Here, based on a large dataset of computer science publications, we study trends in the use of early preprint publications and revisions on ArXiv and the use of X (formerly Twitter) for promotion of such papers in the last 10 years. We find that early submission to ArXiv and promotion on X have soared in recent years. Estimating the effect that the use of each of these modern affordances has on the number of citations of scientific publications, we find that in the first 5 years from an initial publication peer-reviewed conference papers submitted early to ArXiv gain on average 21.1±17.4 more citations, revised on ArXiv gain 18.4±17.6 more citations, and promoted on X gain 44.4±8 more citations. Our results show that promoting one’s work on ArXiv or X has a large impact on the number of citations, as well as the number of influential citations computed by Semantic Scholar, and thereby on the career of researchers. We discuss the far-reaching implications of these findings for future scientific publishing systems and measures of scientific impact.

https://arxiv.org/abs/2401.11116

"Self-Archiving Adoption in Legal Scholarly Communication: A Literature Review"


This article explores the current Library and Information Science (LIS) literature on open access and self-archiving and related studies. . . It further investigates the open access and self-archiving practices in disciplinary . . . Finally, it examines self-archiving in law and concludes that the research gap and lack of literature on self-archiving in the discipline of law makes this study worthwhile.

https://doi.org/10.1080/13614576.2023.2279760

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

Paywall: "Analyzing the Relationship between Citation-Based Impact Metrics and Electronic Journal Usage: A Case Study"


We focus on the impact of major JIFs on local e-journal usage and propose an alternative approach to conventional methods for collection selectors. By treating journal usage patterns as panel data and employing fixed-effects regression models, we find that journal popularity has the greatest influence on local e-journal usage and the effects of impact factors on academic article usage can vary across different disciplines.

https://doi.org/10.1080/01462679.2023.2230166

"The Impacts of Changes in Journal Data Policies: A Cross-disciplinary Survey"


This discipline-specific survey of journal DSP and SMP highlighted the increasing adoption rates and rankings of DSP over time. Furthermore, the findings suggest that DSP adoption may have a notable impact on the increase in JIF. The adoption of DSP by journals may be associated with the increased attention and credibility of the articles.

https://doi.org/10.1002/pra2.924

Paywall: "Trends in Research Impact Librarianship: Developing a New Program and Services"


Research impact librarianship is an area within the profession that continues to grow out of need for dedicated expertise of bibliometrics and other various assessment measures. . . . The Libraries at the University of Houston is in the midst of creating a research visibility and impact program born out of an initiative to elevate the university’s level of prestige and impact by developing personnel, programs, and practices to support research visibility and impact across the institution. This article discusses the University of Houston Libraries’ process and progress toward formalizing research impact services.

https://doi.org/10.1080/01930826.2023.2262364

"Measured in a Context: Making Sense of Open Access Book Data"


Open access (OA) book platforms, such as JSTOR, OAPEN Library or Google Books, have been available for over a decade. Each platform shows usage data, but this results in confusion about how well an individual book is performing overall. Even within one platform, there are considerable usage differences between subjects and languages. Some context is therefore necessary to make sense of OA books usage data. A possible solution is a new metric — the Transparent Open Access Normalized Index (TOANI) score. It is designed to provide a simple answer to the question of how well an individual open access book or chapter is performing. The transparency is based on clear rules, and by making all of the data used visible. The data is normalized, using a common scale for the complete collection of an open access book platform and, to keep the level of complexity as low as possible, the score is based on a simple metric. As a proof of the concept, the usage of over 18,000 open access books and chapters in the OAPEN Library has been analysed, to determine whether each individual title has performed as well as can be expected compared to similar titles.

https://doi.org/10.1629/uksg.627

Using Altmetric Data Responsibly: A Guide to Interpretation and Good Practice

This guide focuses specifically on data from the data provider and company, Altmetric, but other types of altmetrics are mentioned and occasionally used as a comparison in this guide, such as the Open Syllabus database to find the educational engagement with scholarly outputs. This guide opens with an introduction followed by an overview of Altmetric and the Altmetric Attention Score, Altmetrics and Responsible Research Assessment, Output Types Tracked by Altmetric, and the Altmetric Sources of Attention, which include: News and Mainstream Media, Social Media (X (formerly Twitter), Facebook, Reddit, and historical data from Google+, Pinterest, LinkedIn, and Sina Weibo); Patents, Peer Review, Syllabi (historical data only), Multimedia, Public Policy Documents, Wikipedia, Research Highlights, Reference Managers, and Blogs; finally, there is a conclusion, a list of related resources and readings, two appendices, and references. This guide is intended for use by librarians, practitioners, funders, and other users of Altmetric data or those who are interested in incorporating altmetrics into their bibliometric practice and/or research analytics. It can also help researchers who are going up for annual evaluations and promotion and tenure reviews, who can use the data in informed and practical applications. It can also be a useful reference guide for research managers and university administrators who want to understand the broader online engagement with research publications beyond traditional scholarly citations, also known as bibliometrics, but who also want to avoid misusing, misinterpreting, or abusing Altmetric data when making decisions, creating policies, and evaluating faculty members and researchers at their institutions.

http://hdl.handle.net/10919/116448

"What Happens When a Journal Converts to Open Access? A Bibliometric Analysis"


In recent years, increased stakeholder pressure to transition research to Open Access has led to many journals converting, or "flipping," from a closed access (CA) to an open access (OA) publishing model. Changing the publishing model can influence the decision of authors to submit their papers to a journal, and increased article accessibility may influence citation behaviour. In this paper we aimed to understand how flipping a journal to an OA model influences the journal’s future publication volumes and citation impact. We analysed two independent sets of journals that had flipped to an OA model, one from the Directory of Open Access Journals (DOAJ) and one from the Open Access Directory (OAD), and compared their development with two respective control groups of similar journals. For bibliometric analyses, journals were matched to the Scopus database. We assessed changes in the number of articles published over time, as well as two citation metrics at the journal and article level: the normalised impact factor (IF) and the average relative citations (ARC), respectively. Our results show that overall, journals that flipped to an OA model increased their publication output compared to journals that remained closed. Mean normalised IF and ARC also generally increased following the flip to an OA model, at a greater rate than was observed in the control groups. However, the changes appear to vary largely by scientific discipline. Overall, these results indicate that flipping to an OA publishing model can bring positive changes to a journal.

https://doi.org/10.1007/s11192-021-03972-5

"The Quantification of Open Scholarship — A Mapping Review"


This mapping review addresses scientometric indicators that quantify open scholarship. The goal is to determine what open scholarship metrics are currently being applied and which are discussed, e.g. in policy papers. The paper contributes to a better understanding on how open scholarship is quantitatively recorded in research assessment and where gaps can be identified. The review is based on a search in four databases, each with 22 queries. Out of 3385 hits, we coded 248 documents chosen according to the research questions. The review discusses the open scholarship metrics of the documents as well as the topics addressed in the publications, the disciplines the publications come from and the journals they were published. The results indicate that research and teaching practices are unequally represented regarding open scholarship metrics. Open research material is a central and exhausted topic in publications. Open teaching practices, on the other hand, play a role in the discussion and strategy papers of the review, but open teaching material is not recorded using concrete scientometric indicators. Here, we see a research gap and discuss potentials for further research and investigation.

https://doi.org/10.1162/qss_a_00266

"You Do Not Receive Enough Recognition for Your Influential Science"


During career advancement and funding allocation decisions in biomedicine, reviewers have traditionally depended on journal-level measures of scientific influence like the impact factor. Prestigious journals are thought to pursue a reputation of exclusivity by rejecting large quantities of papers, many of which may be meritorious. It is possible that this process could create a system whereby some influential articles are prospectively identified and recognized by journal brands but most influential articles are overlooked. Here, we measure the degree to which journal prestige hierarchies capture or overlook influential science. We quantify the fraction of scientists’ articles that would receive recognition because (a) they are published in journals above a chosen impact factor threshold, or (b) are at least as well-cited as articles appearing in such journals. We find that the number of papers cited at least as well as those appearing in high-impact factor journals vastly exceeds the number of papers published in such venues. At the investigator level, this phenomenon extends across gender, racial, and career stage groupings of scientists. We also find that approximately half of researchers never publish in a venue with an impact factor above 15, which under journal-level evaluation regimes may exclude them from consideration for opportunities. Many of these researchers publish equally influential work, however, raising the possibility that the traditionally chosen journal-level measures that are routinely considered under decision-making norms, policy, or law, may recognize as little as 10-20% of the work that warrants recognition.

https://doi.org/10.1101/2023.09.07.556750
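
The two recognition routes described in the abstract above, (a) publication in a journal above an impact factor threshold and (b) being at least as well-cited as articles in such journals, can be illustrated with a minimal sketch. The abstract does not state how "at least as well-cited" is computed, so the benchmark used here (the median citation count of the above-threshold papers) is an assumption for illustration only, as are the function name `recognition_fractions` and the record fields `journal_if` and `citations`.

```python
from statistics import median


def recognition_fractions(papers, if_threshold):
    """Return (fraction recognized by venue, fraction recognized by citations).

    `papers` is a list of dicts with hypothetical keys "journal_if" and
    "citations". The citation benchmark (median citations of papers in
    above-threshold journals) is an assumption, not the study's method.
    """
    # Route (a): papers published in journals above the chosen threshold.
    venue = [p for p in papers if p["journal_if"] > if_threshold]
    if not venue:
        return 0.0, 0.0

    # Assumed benchmark for "as well-cited as articles in such journals".
    benchmark = median(p["citations"] for p in venue)

    # Route (b): any paper, from any journal, that meets the benchmark.
    by_citations = sum(1 for p in papers if p["citations"] >= benchmark)

    return len(venue) / len(papers), by_citations / len(papers)


papers = [
    {"journal_if": 20, "citations": 50},
    {"journal_if": 30, "citations": 30},
    {"journal_if": 3, "citations": 45},
    {"journal_if": 2, "citations": 5},
    {"journal_if": 4, "citations": 60},
]
by_venue, by_cites = recognition_fractions(papers, if_threshold=15)
print(by_venue, by_cites)
```

In this toy data, fewer papers qualify by venue than by citations, which is the direction of the gap the abstract reports; the numbers themselves are invented and carry no empirical meaning.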

"Scopus Introduces the Author Position Metric — A New Researcher Signal"


Most publications are created with the input from multiple co-authors. Traditional citation metrics give each co-author the same citation impact, even though the actual contribution of each researcher will not have been even. . . .

We have now added a new feature to capture the following authorship positions or types:

  • First author: The first author mentioned in the publication
  • Last author: The last author mentioned in the publication
  • Corresponding author: An author is marked as the corresponding author in the publication. Since June 2020, newly released documents in Scopus can contain more than one corresponding author. . .
  • Co-author: For documents with more than one author, co-authors are any author that is not a first, last or corresponding author
  • Single author: An author is the only author of a publication

https://tinyurl.com/45ynjmr7
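
The authorship positions quoted above can be read as a small classification rule. The sketch below is one possible reading, not Scopus's implementation: the function name `author_positions` is hypothetical, author names are assumed to be unique within a document, and a sole author is assumed to be labeled "single" rather than "first" or "last".

```python
from typing import Dict, List, Set


def author_positions(authors: List[str], corresponding: Set[str]) -> Dict[str, Set[str]]:
    """Assign each author the position labels described in the quoted list.

    Assumptions (not confirmed by the source): names are unique, and a
    sole author gets "single" instead of "first"/"last".
    """
    positions: Dict[str, Set[str]] = {}
    last_index = len(authors) - 1
    for i, name in enumerate(authors):
        labels: Set[str] = set()
        if len(authors) == 1:
            labels.add("single")
        else:
            if i == 0:
                labels.add("first")
            if i == last_index:
                labels.add("last")
        # A document may mark more than one corresponding author.
        if name in corresponding:
            labels.add("corresponding")
        # Co-author: not first, last, or corresponding (multi-author only).
        if not labels:
            labels.add("co-author")
        positions[name] = labels
    return positions


pos = author_positions(["A", "B", "C", "D"], corresponding={"A", "C"})
for name, labels in pos.items():
    print(name, sorted(labels))
```

Returning a set of labels per author reflects that the quoted categories can overlap, for example a first author who is also a corresponding author.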

"Tracing Data: A Survey Investigating Disciplinary Differences in Data Citation"


Data citations, or citations in reference lists to data, are increasingly seen as an important means to trace data reuse and incentivize data sharing. Although disciplinary differences in data citation practices have been well documented via scientometric approaches, we do not yet know how representative these practices are within disciplines. Nor do we yet have insight into researchers’ motivations for citing — or not citing — data in their academic work. Here, we present the results of the largest known survey (n = 2,492) to explicitly investigate data citation practices, preferences, and motivations, using a representative sample of academic authors by discipline, as represented in the Web of Science (WoS). We present findings about researchers’ current practices and motivations for reusing and citing data and also examine their preferences for how they would like their own data to be cited. We conclude by discussing disciplinary patterns in two broad clusters, focusing on patterns in the social sciences and humanities, and consider the implications of our results for tracing and rewarding data sharing and reuse.

https://doi.org/10.1162/qss_a_00264

"Citation Beneficiaries of Discipline-Specific Mega-Journals: Who and How Much"


The emergence of mega-journals (MJs) has influenced scholarly communication. One concrete manifestation of this impact is that more citations have been generated. Citations are the foundation of many evaluation metrics to assess the scientific impact of journals, disciplines, and regions. We focused on searching for citation beneficiaries and quantifying the relative benefit at the journal, discipline and region levels. More specifically, we examined the distribution and contribution to citation-based metrics of citations generated by the five discipline-specific mega-journals (DSMJs) categorized as Environmental Sciences (ES) on Web of Science (WoS) from Clarivate Analytics in 2021: Sustainability, International Journal of Environmental Research and Public Health, Environmental Science and Pollution Research, Journal of Cleaner Production and Science of the Total Environment. Analysis of the distribution of citing data of the five DSMJs shows a pattern with wide coverage but skewness by region and the WoS category; that is, papers in the five DSMJs contributed 26.66% of their citations in 2021 to Mainland China and 22.48% to the ES. Moreover, 15 journals within the ES had their JIFs boosted by more than 20%, benefitting from the high citing rates of the five DSMJs. More importantly, the analysis provides clear evidence that DSMJs can contribute to JIF scores throughout a discipline through their volume of references. Overall, DSMJs can widely impact scholarly evaluation because they contribute citation benefits and improve the evaluation index performance of different scientific entities at different levels. Considering the important application of citation indicators in the academic evaluation system and the increase in citations, it is important to reconsider the real research impact that citations can reflect.

https://doi.org/10.1057/s41599-023-02050-w

Paywall: "Identification and Portraits of Open Access Journals Based on Open Impact Metrics Extracted from Social Activities"


This study finds that open access journals strengthen international academic communication and cooperation, build cross-border and cross-regional knowledge-sharing projects, realize the knowledge of interdisciplinary sharing and exchange, and, most importantly, provide a one-stop service for readers. This research indicates that through the use of open impact metrics, it is possible to identify the portraits of open access journals, thus providing a new method to construct and reform open access journal evaluation systems.

https://tinyurl.com/3hvs2y8v

"Who Re-Uses Data? A Bibliometric Analysis of Dataset Citations"


Open data is receiving increased attention and support in academic environments, with one justification being that shared data may be re-used in further research. But what evidence exists for such re-use, and what is the relationship between the producers of shared datasets and researchers who use them? Using a sample of data citations from OpenAlex, this study investigates the relationship between creators and citers of datasets at the individual, institutional, and national levels. We find that the vast majority of datasets have no recorded citations, and that most cited datasets only have a single citation. Rates of self-citation by individuals and institutions tend towards the low end of previous findings and vary widely across disciplines. At the country level, the United States is by far the most prominent exporter of re-used datasets, while importation is more evenly distributed. Understanding where and how the sharing of data between researchers, institutions, and countries takes place is essential to developing open research practices.

https://arxiv.org/abs/2308.04379

An Index, a Publisher and an Unequal Global Research Economy


This is the story of how a publisher and a citation index turned the science communication system into a highly profitable global industry. Over the course of seventy years, academic journal articles have become commodities, and their meta-data a further source of revenue. . . . During the 1950s, two men — Robert Maxwell and Eugene Garfield — begin to experiment with their blueprint for the research economy. Maxwell created an ‘international’ publisher — Pergamon Press — charming the editors of elite, not-for-profit society journals into signing commercial contracts. Garfield invented the science citation index to help librarians manage this growing flow of knowledge. . . . Sixty years later, the global science system has become a citation economy, with academic credibility mediated by the currency produced by the two dominant commercial citation indexes: Elsevier’s Scopus and Clarivate’s Web of Science. The reach of these citation indexes and their data analytics is amplified by digitisation, computing power and financial investment. . . . Non-Anglophone journals are disproportionately excluded from these indexes, reinforcing the stratification of academic credibility geographies and endangering long established knowledge ecosystems.

https://tinyurl.com/3x7try9p

"Bibliometrics Methods in Detecting Citations to Questionable Journals"


This paper intends to analyse whether journals that had been removed from the Directory of Open Access Journals (DOAJ) in 2018 due to suspected misconduct were cited within journals indexed in the Scopus database. Our analysis showed that Scopus contained over 15 thousand references to the removed journals identified.

https://doi.org/10.1016/j.acalib.2023.102749

Requires Registration: "Clarivate Report Urges Shift from Single Metrics to Visual Research Profiles"


The report focuses on four key areas:

  • Individuals and their publications: The issue of excessive self-citation in research publications is addressed, with identification of outliers following examination of the distinctive patterns of self-citation observed among Highly Cited Researchers, while considering variations in citation rates between fields.
  • Future research trends: Research Fronts identifies current areas of research attention by analyzing frequently cited, recent papers that cluster together, providing valuable insights for research planning, resource management and policy decisions.
  • Journals and their characteristics: The profile and value of a journal in the Web of Science is more than its Journal Impact Factor. We explore how the indicator of national orientation (INO) offers new perspectives on journals, helping researchers choose the best venues for their papers.
  • Influence of international collaboration: Simple metrics mask the influence of well-cited, internationally co-authored papers, so cannot be properly used to assess them. Collaborative Citation Impact (Collab-CNCI) allows deconstruction of impact, enabling better evaluation of domestic and international activity.

https://tinyurl.com/4vetr6px

"Factors Affecting Publication Impact and Citation Trends Over Time"


Based on the results, researchers should seek out grant funding and generously incorporate literature into their co-authored publications to increase their publications’ potential for future impact. These factors may influence article quality, resulting in more citations over time. Further research is needed to better understand their influence and the influence of other factors.

https://doi.org/10.18438/eblip30206

"Evaluating the Efficacy of ChatGPT-4 in Providing Scientific References across Diverse Disciplines"


This work conducts a comprehensive exploration into the proficiency of OpenAI’s ChatGPT-4 in sourcing scientific references within an array of research disciplines. Our in-depth analysis encompasses a wide scope of fields including Computer Science (CS), Mechanical Engineering (ME), Electrical Engineering (EE), Biomedical Engineering (BME), and Medicine, as well as their more specialized sub-domains. Our empirical findings indicate a significant variance in ChatGPT-4’s performance across these disciplines. Notably, the validity rate of suggested articles in CS, BME, and Medicine surpasses 65%, whereas in the realms of ME and EE, the model fails to verify any article as valid. Further, in the context of retrieving articles pertinent to niche research topics, ChatGPT-4 tends to yield references that align with the broader thematic areas as opposed to the narrowly defined topics of interest. This observed disparity underscores the pronounced variability in accuracy across diverse research fields, indicating the potential requirement for model refinement to enhance its functionality in academic research. Our investigation offers valuable insights into the current capacities and limitations of AI-powered tools in scholarly research, thereby emphasizing the indispensable role of human oversight and rigorous validation in leveraging such models for academic pursuits.

https://arxiv.org/abs/2306.09914v1
