"Unleashing the Power of AI. A Systematic Review of Cutting-Edge Techniques in AI-Enhanced Scientometrics, Webometrics, and Bibliometrics"


Findings: (i) Regarding scientometrics, the application of AI yields distinct advantages, supporting analysis of publications and citations, research impact prediction, collaboration analysis, research trend analysis, and knowledge mapping within a more objective and reliable framework. (ii) In terms of webometrics, AI algorithms can enhance web crawling and data collection, web link analysis, web content analysis, social media analysis, web impact analysis, and recommender systems. (iii) Moreover, automated data collection, citation analysis, author disambiguation, co-authorship network analysis, research impact assessment, text mining, and recommender systems are considered potential applications of AI integration in the field of bibliometrics.
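
To ground one item on this list, co-authorship network analysis, here is a minimal Python sketch using networkx; the author lists are invented for illustration, and the approach is a generic example rather than a technique described in the review.

```python
# Minimal co-authorship network sketch (generic illustration; author lists invented).
import itertools
import networkx as nx

papers = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Dave"],
    ["Bob", "Carol", "Eve"],
]

G = nx.Graph()
for authors in papers:
    # Connect every pair of co-authors; repeated collaborations increase edge weight.
    for a, b in itertools.combinations(authors, 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Degree centrality as a simple indicator of collaboration breadth.
for author, score in sorted(nx.degree_centrality(G).items(), key=lambda x: -x[1]):
    print(f"{author}: {score:.2f}")
```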

https://arxiv.org/abs/2403.18838

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Now You Can Use ChatGPT without an Account"


OpenAI will no longer require an account to use ChatGPT, the company’s free AI platform. However, this only applies to ChatGPT; other OpenAI products, like DALL-E 3, cost money and will still require an account for access. . . .

OpenAI said it introduced "additional content safeguards for this experience," including blocking prompts in a wider range of categories, but did not expound more on what these categories are. The option to opt out of model training will still be available, even to those without accounts.

https://tinyurl.com/582ehjhm

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "Developing a Foundation for the Informational Needs of Generative AI Users through the Means of Established Interdisciplinary Relationships"


University faculty immediately had many questions and concerns in response to the public proliferation of generative artificial intelligence programs that leverage large language models to generate complex text responses to simple prompts. Librarians at the University of South Florida (USF) pooled their skills and existing relationships with faculty and professional staff across campus to provide information that answered common questions raised by faculty about generative artificial intelligence usage within research-related topics. Faculty concerns about plagiarism, how to instruct students to use the new tools, and how to discern the reliability of information generated by artificial intelligence tools were placed at the forefront.

https://doi.org/10.1016/j.acalib.2024.102876

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Generative AI for Trustworthy, Open, and Equitable Scholarship"


We focus on the potential of GenAI to address known problems for the alignment of science practice and its underlying core values. As institutions culturally charged with the curation and preservation of the world’s knowledge and cultural heritage, libraries are deeply invested in promoting a durable, trustworthy, and sustainable scholarly knowledge commons. With public trust in academia and in research waning [reference] and in the face of recent high-profile instances of research misconduct [reference], the scholarly community must act swiftly to develop policies, frameworks, and tools for leveraging the power of GenAI in ways that enhance, rather than erode, the trustworthiness of scientific communications, the breadth of scientific impact, and the public’s trust in science, academia, and research.

https://doi.org/10.21428/e4baedd9.567bfd15

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Evolving AI Strategies in Libraries: Insights from Two Polls of ARL Member Representatives over Nine Months—Report Published"


To effectively chart this [AI] transition, two quick polls were conducted among members of the Association of Research Libraries (ARL) to capture changing perspectives on the potential impact of AI, assess the extent of AI exploration and implementation within libraries, and identify AI applications relevant to the current library environment.

Today, ARL has released the results of the two polls—analyzing and juxtaposing the outcomes of these two surveys to better understand how library leaders are managing the complexities of integrating AI into their operations and services. The report also includes recommendations for ARL research libraries.

https://tinyurl.com/2t9nywcv

Report

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"TDM & AI Rights Reserved? Fair Use & Evolving Publisher Copyright Statements"


Earlier this year, we noticed that some academic publishers have revised the copyright notices on their websites to state they reserve rights to text and data mining (TDM) and AI training (for example, see the website footers for Elsevier and Wiley). . . . SPARC asked Kyle K. Courtney, Director of Copyright and Information Policy for Harvard Library, to address key questions regarding these revised copyright statements and the continuing viability of fair use justifications for TDM.

https://tinyurl.com/4prkfbb3

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Use ‘Jan’ to Chat with AI without the Privacy Concerns"


Jan is a free and open source application that makes it easy to download multiple large language models and start chatting with them. There are simple installers for Windows, macOS, and Linux. Now, this isn’t perfect. The models aren’t necessarily as good as the latest ones from OpenAI or Google, and depending on how powerful your computer is, the results might take a while to come in.
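
Jan can also expose a local, OpenAI-compatible API server, so downloaded models can be scripted as well as chatted with, and nothing leaves your machine. The sketch below is a minimal illustration under two assumptions: the local server is running on the port Jan commonly uses (1337), and a model with the shown id has already been downloaded; check both against your own installation.

```python
# Minimal sketch of calling a locally downloaded model through Jan's
# OpenAI-compatible local API server. The port (1337) and the model id are
# assumptions; check Jan's settings for the values on your machine.
import requests

response = requests.post(
    "http://localhost:1337/v1/chat/completions",
    json={
        "model": "mistral-ins-7b-q4",  # hypothetical model id; use one you have downloaded
        "messages": [
            {"role": "user", "content": "Summarize the idea of open peer review in two sentences."}
        ],
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```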

https://tinyurl.com/4m8p4b82

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Human-Centered Explainable Artificial Intelligence: An Annual Review of Information Science and Technology (Arist) Paper"


Explainability is central to trust and accountability in artificial intelligence (AI) applications. The field of human-centered explainable AI (HCXAI) arose as a response to mainstream explainable AI (XAI) which was focused on algorithmic perspectives and technical challenges, and less on the needs and contexts of the non-expert, lay user. HCXAI is characterized by putting humans at the center of AI explainability. . . . This review identifies the foundational ideas of HCXAI, how those concepts are operationalized in system design, how legislation and regulations might normalize its objectives, and the challenges that HCXAI must address as it matures as a field.

https://doi.org/10.1002/asi.24889

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Exploring the Potential of Large Language Models and Generative Artificial Intelligence (GPT): Applications in Library and Information Science"


The presented study offers a systematic overview of the potential application of large language models (LLMs) and generative artificial intelligence tools, notably the GPT model and the ChatGPT interface, within the realm of library and information science (LIS). The paper supplements and extends the outcomes of a comprehensive information survey on the subject with the author’s own experiences and illustrative examples of possible applications. This study does not involve testing available LLMs or selecting the most suitable tool; instead, it targets information professionals, specialists, librarians, and scientists, aiming to inspire them in various ways.

https://doi.org/10.1177/09610006241241066

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"The Latest ‘Crisis’ — Is the Research Literature Overrun with ChatGPT- and LLM-generated Articles?"


Elsevier has been under the spotlight this month for publishing a paper that contains a clearly ChatGPT-written portion of its introduction. The first sentence of the paper’s Introduction reads, "Certainly, here is a possible introduction for your topic:. . . ." To date, the article remains unchanged, and unretracted. A second paper, containing the phrase "I’m very sorry, but I don’t have access to real-time information or patient-specific data, as I am an AI language model" was subsequently found, and similarly remains unchanged. This has led to a spate of amateur bibliometricians scanning the literature for similar common AI-generated phrases, with some alarming results.
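
The scanning described above amounts to simple string matching. The sketch below is our own illustration, not the method used by the bibliometricians mentioned in the piece; the phrase list echoes the quotes above, and the sample texts are invented. A real scan would also have to filter false positives, such as papers that quote these phrases while studying them.

```python
# Rough illustration of scanning texts for boilerplate chatbot phrases.
# The phrase list and sample texts are illustrative only.
TELLTALE_PHRASES = [
    "certainly, here is a possible introduction for your topic",
    "as an ai language model",
    "i don't have access to real-time information",
]

def flag_suspect_texts(texts):
    """Return (index, matched phrase) pairs for texts containing a telltale phrase."""
    hits = []
    for i, text in enumerate(texts):
        lowered = text.lower()
        for phrase in TELLTALE_PHRASES:
            if phrase in lowered:
                hits.append((i, phrase))
    return hits

sample = [
    "Certainly, here is a possible introduction for your topic: ...",  # invented example
    "We measured citation counts across three decades of journal output.",
]
print(flag_suspect_texts(sample))  # -> [(0, 'certainly, here is a possible introduction for your topic')]
```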

https://tinyurl.com/4a8bjmzy

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Fair Use Rights to Conduct Text and Data Mining and Use Artificial Intelligence Tools Are Essential for UC Research and Teaching"


The UC Libraries invest more than $60 million each year licensing systemwide electronic content needed by scholars for these and other studies. (Indeed, the $60 million figure represents license agreements made at the UC systemwide and multi-campus levels. But each individual campus also licenses electronic resources, adding millions more in total expenditures.) Our libraries secure campus access to a broad range of digital resources including books, scientific journals, databases, multimedia resources, and other materials. In doing so, the UC Libraries must negotiate licensing terms that ensure scholars can make both lawful and comprehensive use of the materials the libraries have procured. Increasingly, however, publishers and vendors are presenting libraries with content license agreements that attempt to preclude, or charge additional and unsupportable fees for, fair uses like training AI tools in the course of conducting TDM. . . .

If the UC Libraries are unable to protect these fair uses, UC scholars will be at the mercy of publishers aggregating and controlling what may be done with the scholarly record. Further, UC scholars’ pursuit of knowledge will be disproportionately stymied relative to academic colleagues in other global regions, given that a large proportion of other countries preclude contractual override of research exceptions.

Indeed, in more than forty countries—including all those within the European Union (EU)—publishers are prohibited from using contracts to abrogate exceptions to copyright in non-profit scholarly and educational contexts. Article 3 of the EU’s Directive on Copyright in the Digital Single Market preserves the right for scholars within research organizations and cultural heritage institutions (like those researchers at UC) to conduct TDM for scientific research, and further proscribes publishers from invalidating this exception by license agreements (see Article 7). Moreover, under AI regulations recently adopted by the European Parliament, copyright owners may not opt out of having their works used in conjunction with artificial intelligence tools in TDM research—meaning copyrighted works must remain available for scientific research that is reliant on AI training, and publishers cannot override these AI training rights through contract. Publishers are thus obligated to—and do—preserve fair use-equivalent research exceptions for TDM and AI within the EU, and can do so in the United States, too. . . .

In all events, adaptable licensing language can address publishers’ concerns by reiterating that the licensed products may be used with AI tools only to the extent that doing so would not: i. create a competing or commercial product or service for use by third parties; ii. unreasonably disrupt the functionality of the subscribed products; or iii. reproduce or redistribute the subscribed products for third parties. In addition, license agreements can require commercially reasonable security measures (as also required in the EU) to extinguish the risk of content dissemination beyond permitted uses. In sum, these licensing terms can replicate the research rights that are unequivocally reserved for scholars elsewhere.

https://tinyurl.com/4fvpdz35

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Microsoft Is Developing Tech That Would Let Users Write with Their Eyes, a Huge Win for Accessibility"


Microsoft published a new patent for a device called the Eye-Gaze, which would allow users to communicate and interact with electronic devices without the use of hands and fingers for typing. . . .

The only other peripheral that comes to mind that’s remotely similar to the Eye-Gaze is the Apple Vision Pro, but that’s in a mixed reality setting which still requires some hand movements.

https://tinyurl.com/2s443y86

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Responsible Artificial Intelligence: A Structured Literature Review"


Our research endeavors to advance the concept of responsible artificial intelligence (AI), a topic of increasing importance within EU policy discussions. The EU has recently issued several publications emphasizing the necessity of trust in AI, underscoring the dual nature of AI as both a beneficial tool and a potential weapon. This dichotomy highlights the urgent need for international regulation. Concurrently, there is a need for frameworks that guide companies in AI development, ensuring compliance with such regulations. Our research aims to assist lawmakers and machine learning practitioners in navigating the evolving landscape of AI regulation, identifying focal areas for future attention. This paper introduces a comprehensive and, to our knowledge, the first unified definition of responsible AI. Through a structured literature review, we elucidate the current understanding of responsible AI. Drawing from this analysis, we propose an approach for developing a future framework centered around this concept. Our findings advocate for a human-centric approach to Responsible AI. This approach encompasses the implementation of AI methods with a strong emphasis on ethics, model explainability, and the pillars of privacy, security, and trust.

https://arxiv.org/abs/2403.06910

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"An OpenAI Spinoff Has Built an AI Model That Helps Robots Learn Tasks Like Humans"


Now three of OpenAI’s early research scientists say the startup they spun off in 2017, called Covariant, has solved that problem and unveiled a system that combines the reasoning skills of large language models with the physical dexterity of an advanced robot. . . .

This represents a leap forward, Chen told me, in robots that can adapt to their environment using training data rather than the complex, task-specific code that powered the previous generation of industrial robots. It’s also a step toward worksites where managers can issue instructions in human language without concern for the limitations of human labor. ("Pack 600 meal-prep kits for red pepper pasta using the following recipe. Take no breaks!")

https://tinyurl.com/3nek7xx2

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "The Obscene Energy Demands of A.I."


It’s been estimated that ChatGPT is responding to something like two hundred million requests per day, and, in so doing, is consuming more than half a million kilowatt-hours of electricity. (For comparison’s sake, the average U.S. household consumes twenty-nine kilowatt-hours a day.)
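
Taking the quoted figures at face value, a quick back-of-the-envelope calculation shows what they imply per request and in household terms:

```python
# Back-of-the-envelope arithmetic using only the figures quoted above.
requests_per_day = 200_000_000    # ~200 million ChatGPT requests per day
chatgpt_kwh_per_day = 500_000     # more than half a million kWh per day
household_kwh_per_day = 29        # average U.S. household

wh_per_request = chatgpt_kwh_per_day * 1000 / requests_per_day
households_equivalent = chatgpt_kwh_per_day / household_kwh_per_day

print(f"~{wh_per_request:.1f} Wh per request")            # ~2.5 Wh
print(f"~{households_equivalent:,.0f} U.S. households")   # ~17,241 households
```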

https://tinyurl.com/ynrd4k4p

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Generative AI in Higher Education: The Product Landscape


Since last fall, Ithaka S+R has been partnering with 19 colleges and universities from the US and Canada to assess GAI’s impact on higher education and make evidence-based, proactive decisions about how to manage the far-ranging effects of GAI. As part of this project, Ithaka S+R has been cataloging GAI applications geared towards teaching, learning, and research in the higher education context. Today, we are excited to make our Product Tracking tool (https://sr.ithaka.org/our-work/generative-ai-product-tracker/) publicly available. . . .

This issue brief is designed to enrich the descriptive data captured in the Product Tracker. In the brief’s first section, we provide a typology of existing products and value propositions. In the second, we offer observations about what the product landscape suggests about the future of teaching, learning, and research practices, and speculations on the near-term future of the academic GAI market.

https://doi.org/10.18665/sr.320394

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "I Used Generative AI to Turn My Story into a Comic—and You Can Too"


After more than a year in development, Lore Machine is now available to the public for the first time. For $10 a month, you can upload 100,000 words of text (up to 30,000 words at a time) and generate 80 images for short stories, scripts, podcast transcripts, and more. There are price points for power users too, including an enterprise plan costing $160 a month that covers 2.24 million words and 1,792 images. The illustrations come in a range of preset styles, from manga to watercolor to pulp ’80s TV show.
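
For a rough sense of how the tiers compare, here is the per-unit arithmetic on the quoted prices (the plan labels are ours):

```python
# Per-unit arithmetic on the Lore Machine price points quoted above.
plans = {
    "basic":      {"price": 10,  "words": 100_000,   "images": 80},
    "enterprise": {"price": 160, "words": 2_240_000, "images": 1_792},
}

for name, p in plans.items():
    per_million_words = p["price"] / p["words"] * 1_000_000
    per_image = p["price"] / p["images"]
    print(f"{name}: ${per_million_words:.2f} per million words, ${per_image:.3f} per image")
# basic:      $100.00 per million words, $0.125 per image
# enterprise: $71.43 per million words, $0.089 per image
```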

https://tinyurl.com/54mj6t77

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

OCUL [Ontario Council of University Libraries] Machine Learning/Artificial Intelligence Report and Strategy: Interim Report


This report describes use cases for machine learning relevant to the OCUL consortium and recommends projects utilizing machine learning technologies. It also considers key contextual issues such as ethical concerns, technical capacity, available expertise, and infrastructure needs. All sections are drafts, with some more fully developed than others.

https://tinyurl.com/38cjdn9p

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Responsible AI at the Vanderbilt Television News Archive: A Case Study"


We provide an overview of the use of machine-learning and artificial intelligence at the Vanderbilt Television News Archive (VTNA). After surveying our major initiatives to date, which include the full transcription of the collection using a custom language model deployed on Amazon Web Services (AWS), we address some ethical considerations we encountered, including the possibility of staff downsizing and misidentification of individuals in news recordings.
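
The case study itself does not include code, but for orientation, a transcription job that uses a custom language model on AWS generally looks like the boto3 sketch below. This is a generic Amazon Transcribe example rather than VTNA's actual pipeline, and the job, bucket, and model names are placeholders.

```python
# Generic sketch of an Amazon Transcribe job with a custom language model.
# Not VTNA's pipeline; job, bucket, and model names are placeholders.
import boto3

transcribe = boto3.client("transcribe", region_name="us-east-1")

transcribe.start_transcription_job(
    TranscriptionJobName="example-newscast-job",                       # placeholder
    Media={"MediaFileUri": "s3://example-bucket/newscast.mp4"},         # placeholder
    MediaFormat="mp4",
    LanguageCode="en-US",
    ModelSettings={"LanguageModelName": "example-broadcast-news-clm"},  # custom model, placeholder
    OutputBucketName="example-transcripts-bucket",                      # placeholder
)
```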

https://doi.org/10.7191/jeslib.805

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Using AI/Machine Learning to Extract Data from Japanese American Confinement Records"


Purpose: This paper examines the use of Artificial Intelligence/Machine Learning to extract a more comprehensive data set from a structured “standardized” form used to document Japanese American incarcerees during World War II.

Setting/Participants/Resources: The Bancroft Library partnered with Densho, a community memory organization, and Doxie.AI to complete this work.

Brief Description: The project digitized the complete set of Form WRA-26 "individual record" forms for more than 110,000 Japanese Americans incarcerated in War Relocation Authority camps during WWII. The library utilized AI/machine learning to automate text extraction from over 220,000 images of a structured "standardized" form; our goal was to improve upon and collect information not previously recorded in the Japanese American Internee Data file held by the National Archives and Records Administration. The project team worked with technical, academic, legal, and community partners to address ethical and logistical issues raised by the data extraction process, and to assess appropriate access options for the dataset(s) and digitized records.
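
The extraction itself was done with a commercial partner (Doxie.AI), but the basic shape of the task, pulling text out of scanned images of a fixed form, can be sketched with open-source OCR. The snippet below uses Tesseract via pytesseract on a hypothetical field crop; it is a much simpler stand-in for the machine-learning pipeline the project actually used, and the file path and crop box are invented.

```python
# Generic OCR sketch of extracting one field from a scanned, standardized form.
# NOT the project's Doxie.AI pipeline; file path and crop box are hypothetical.
from PIL import Image
import pytesseract

def read_form_field(image_path, box):
    """OCR a single field from a scanned form, given its (left, top, right, bottom) box."""
    page = Image.open(image_path)
    field = page.crop(box)
    return pytesseract.image_to_string(field).strip()

# Hypothetical example: the name field occupies a fixed region on every scan.
# print(read_form_field("wra26_scan_0001.png", (120, 200, 900, 260)))
```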

https://doi.org/10.7191/jeslib.850

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"The Implementation of Keenious at Carnegie Mellon University"


In the fall of 2022, the Carnegie Mellon University (CMU) Libraries began investigating Keenious—an artificial intelligence (AI)-based article recommender tool—for a possible trial implementation to improve pathways to resource discovery and assist researchers in more effectively searching for relevant research. This process led to numerous discussions within the library regarding the unique nature of AI-based tools when compared with traditional library resources, including ethical questions surrounding data privacy, algorithmic transparency, and the impact on the research process. This case study explores these topics and how they were negotiated up to and immediately following CMU’s implementation of Keenious in January 2023, and highlights the need for more frameworks for evaluating AI-based tools in academic settings.

https://doi.org/10.7191/jeslib.800

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "Evaluating the Performance of ChatGPT and Perplexity AI in Business Reference"


The Thomas Mahaffey Jr. Business Library conducted a study to assess the performance of two competing generative AI products, ChatGPT and Perplexity AI, in answering business reference questions. The study used a data set consisting of a sample of anonymized reference questions submitted through the library’s ServiceNow ticketing system between January 2018 and May 2022. The questions were input as prompts to each competing AI. . . . Results showed similar and underwhelming performance between each AI at the composite level. Analysis of scores in each individual scoring dimension showed greater variance in the score distributions between the competing AI. Through the evaluation process, key strengths, weaknesses, and trends emerged between each AI.
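
The abstract does not reproduce the rubric, but the composite-versus-dimension comparison it describes reduces to simple averaging and spread statistics, as in the purely hypothetical sketch below (dimension names, scale, and scores are all invented).

```python
# Hypothetical illustration of composite vs. per-dimension scoring;
# dimensions, scale, and scores are invented, not the study's data.
from statistics import mean, pstdev

scores = {
    "ChatGPT":       {"accuracy": [2, 1, 3, 2], "sourcing": [1, 0, 1, 2], "completeness": [2, 2, 1, 2]},
    "Perplexity AI": {"accuracy": [2, 2, 2, 1], "sourcing": [3, 2, 3, 2], "completeness": [1, 2, 2, 1]},
}

for tool, dims in scores.items():
    composite = mean(mean(v) for v in dims.values())            # composite-level score
    spread = {d: round(pstdev(v), 2) for d, v in dims.items()}  # per-dimension variability
    print(f"{tool}: composite={composite:.2f}, per-dimension spread={spread}")
```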

https://doi.org/10.1080/08963568.2024.2317534

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "Leveraging ChatGPT and Bard for Academic Librarians and Information Professionals: A Case Study of Developing Pedagogical Strategies Using Generative AI Models"


This study focuses on improving pedagogical strategies by integrating artificial intelligence (AI) chatbots and library databases. Examples from ChatGPT and Bard were used to demonstrate the quality of information. A cross-examination using a research validation template was conducted; it revealed that no artificial hallucinations were produced. However, the information provided by both AI chatbots was slightly outdated based on organizational changes and did not provide an in-depth analysis of the company.

https://doi.org/10.1080/08963568.2024.2321729

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Anthropic Says Its Latest AI Bot Can Beat Gemini and ChatGPT"


Anthropic, the AI company started by several former OpenAI employees, says the new Claude 3 family of AI models performs as well as or better than leading models from Google and OpenAI. Unlike earlier versions, Claude 3 is also multimodal, able to understand text and photo inputs.

Anthropic says Claude 3 will answer more questions, understand longer instructions, and be more accurate. Claude 3 can understand more context, meaning it can process more information.

https://tinyurl.com/yb3dw8u7

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Artificial Intelligence News 03/04/24

"Adobe Is Testing a New AI Tool That Can Create Music From Text Prompts"

"The Best Generative AI Courses Money Can Buy"

"Generative AI Is Challenging a 234-Year-Old Law"

"How to Picture A.I."

"Is ChatGPT Making Scientists Hyper-Productive? The Highs and Lows of Using AI"

"StarCoder Is a Code-Generating AI That Runs on Most GPUs"

"‘Up to 1,000X Faster’: AI Startup Wants to Make GPU Training Obsolete with an Extraordinary Piece of Tech — Meet the Tseltin Machine Which May Come To a Device near You Sooner than You Think"

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |