"Writing with ChatGPT: An Illustration of Its Capacity, Limitations & Implications for Academic Writers"


Rather than being alarmed or anxious, writers need to understand ChatGPT’s strengths and weaknesses. It is better at structure than it is at content. It is a good brainstorming tool (think titles, outlines, counter-arguments), but you must double check everything it tells you, especially if you’re outside your domain of expertise. It can provide summaries of complex ideas, and connect them with other ideas, but only if you have put a lot of thought into the incremental prompting needed to shift it from its generic default and train it to focus on what you care about. Its access to information is limited to what it was originally trained on, therefore your own training phase is essential to identify gaps and inaccuracies. It can be used for labor, such as reformatting abstracts or reducing the length of sections, but it can’t replace the thinking a writer does to determine why some paragraphs or ideas deserve more words and others can be cut back. It can be inaccurate: in fact, rather stubbornly so, persisting with inaccuracies even after they are pointed out, while at the same time presenting its next attempt as corrected.

https://doi.org/10.5334/pme.1072

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"CORE-GPT: Combining Open Access Research and Large Language Models for Credible, Trustworthy Question Answering"


In this paper, we present CORE-GPT, a novel question-answering platform that combines GPT-based language models and more than 32 million full-text open access scientific articles from CORE. We first demonstrate that GPT3.5 and GPT4 cannot be relied upon to provide references or citations for generated text. We then introduce CORE-GPT which delivers evidence-based answers to questions, along with citations and links to the cited papers, greatly increasing the trustworthiness of the answers and reducing the risk of hallucinations. CORE-GPT’s performance was evaluated on a dataset of 100 questions covering the top 20 scientific domains in CORE, resulting in 100 answers and links to 500 relevant articles. The quality of the provided answers and the relevance of the links were assessed by two annotators. Our results demonstrate that CORE-GPT can produce comprehensive and trustworthy answers across the majority of scientific domains, complete with links to genuine, relevant scientific articles.

https://arxiv.org/abs/2307.04683

"Claude 2: ChatGPT Rival Launches Chatbot That Can Summarise a Novel"


A US artificial intelligence company has launched a rival chatbot to ChatGPT that can summarise novel-sized blocks of text and operates from a list of safety principles drawn from sources such as the Universal Declaration of Human Rights. . . .

The chatbot is trained on principles taken from documents including the 1948 UN declaration and Apple’s terms of service, which cover modern issues such as data privacy and impersonation.

https://tinyurl.com/ms44eccd

10 AI Researchers on How AI Can Either Improve the World or Destroy It

Steve Rose of The Guardian interviews the experts.

Five Ways AI Could Improve the World: ‘We Can Cure All Diseases, Stabilise Our Climate, Halt Poverty’

Five Ways AI Might Destroy the World: ‘Everyone on Earth Could Fall over Dead in the Same Second’

"SSP Conference Debate: AI and the Integrity of Scholarly Publishing"


At the annual meeting of the Society for Scholarly Publishing held in Portland, Oregon last month, the closing plenary session was a formal debate on the proposition "Resolved: Artificial intelligence will fatally undermine the integrity of scholarly publishing." Arguing in favor of the proposition was Tim Vines, founder of DataSeer and a Scholarly Kitchen Chef. Arguing against was Jessica Miles, Vice President for Strategy and Investments at Holtzbrinck Publishing Group.

https://tinyurl.com/ururdfvw

AI Is Training AI: "Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks"


Large language models (LLMs) are remarkable data annotators. They can be used to generate high-fidelity supervised training data, as well as survey and experimental data. With the widespread adoption of LLMs, human gold-standard annotations are key to understanding the capabilities of LLMs and the validity of their results. However, crowdsourcing, an important, inexpensive way to obtain human annotations, may itself be impacted by LLMs, as crowd workers have financial incentives to use LLMs to increase their productivity and income. To investigate this concern, we conducted a case study on the prevalence of LLM usage by crowd workers. We reran an abstract summarization task from the literature on Amazon Mechanical Turk and, through a combination of keystroke detection and synthetic text classification, estimate that 33-46% of crowd workers used LLMs when completing the task. Although generalization to other, less LLM-friendly tasks is unclear, our results call for platforms, researchers, and crowd workers to find new ways to ensure that human data remain human, perhaps using the methodology proposed here as a stepping stone.

https://arxiv.org/abs/2306.07899

"OCLC Introduces AI-generated Book Recommendations in WorldCat.org and WorldCat Find beta"


OCLC is beta testing book recommendations generated by artificial intelligence (AI) in WorldCat.org, the website that allows users to explore the collections of thousands of libraries through a single search. Searchers can now obtain AI-enabled book recommendations for print and e-books and then look for those items in libraries near them. The AI-generated book recommendations beta is now available in WorldCat.org and WorldCat Find, the mobile app extension for WorldCat.org.

https://tinyurl.com/44j4ascr

"Evaluating the Efficacy of ChatGPT-4 in Providing Scientific References across Diverse Disciplines"


This work conducts a comprehensive exploration into the proficiency of OpenAI’s ChatGPT-4 in sourcing scientific references within an array of research disciplines. Our in-depth analysis encompasses a wide scope of fields including Computer Science (CS), Mechanical Engineering (ME), Electrical Engineering (EE), Biomedical Engineering (BME), and Medicine, as well as their more specialized sub-domains. Our empirical findings indicate a significant variance in ChatGPT-4’s performance across these disciplines. Notably, the validity rate of suggested articles in CS, BME, and Medicine surpasses 65%, whereas in the realms of ME and EE, the model fails to verify any article as valid. Further, in the context of retrieving articles pertinent to niche research topics, ChatGPT-4 tends to yield references that align with the broader thematic areas as opposed to the narrowly defined topics of interest. This observed disparity underscores the pronounced variability in accuracy across diverse research fields, indicating the potential requirement for model refinement to enhance its functionality in academic research. Our investigation offers valuable insights into the current capacities and limitations of AI-powered tools in scholarly research, thereby emphasizing the indispensable role of human oversight and rigorous validation in leveraging such models for academic pursuits.

https://arxiv.org/abs/2306.09914v1

"European Lawmakers Vote to Adopt EU AI Act"


European Union lawmakers have passed the EU AI Act that will govern use and deployment of artificial intelligence technology within the EU. . . . Changes introduced by MEPs to the original commission draft act include some top-level regulation of general-purpose AI tools such as ChatGPT. These foundation models will require mandatory labelling for AI-generated content and the forced disclosure of training data covered by copyright. . . . . Other changes include a fine-tuned list of prohibited practices, extended to include subliminal techniques, biometric categorisation, predictive policing, and internet-scraped facial recognition databases.

https://tinyurl.com/nhet5ckd

Congressional Research Service: Generative Artificial Intelligence: Overview, Issues, and Questions for Congress


The recent public release of many GenAI tools, and the race by companies to develop ever-more powerful models, have generated widespread discussion of their capabilities, potential concerns with their use, and debates about their governance and regulation. This CRS InFocus describes the development and uses of GenAI, concerns raised by the use of GenAI tools, and considerations for Congress. For additional considerations related to data privacy, see CRS Report R47569, Generative Artificial Intelligence and Data Privacy: A Primer, by Kristen E. Busch.

https://tinyurl.com/bdrpkzcj

"New ChatGPT Course at ASU Gives Students a Competitive Edge"


A new Arizona State University course will provide students with those skills, expertise that is becoming increasingly sought after.

Basic Prompt Engineering with ChatGPT: An Introduction is open this summer to students in any major and, despite the name, is not really about engineering.

https://tinyurl.com/rcfnt39d

"Guest Post — Accessibility Powered by AI: How Artificial Intelligence Can Help Universalize Access to Digital Content"


More than 1 billion people around the world have some type of disability (including visual, hearing, cognitive, learning, mobility, and other disabilities) that affects how they access digital content. No wonder we spend so much time talking about accessibility tools!

Digital transformation can revolutionize the world, turning it into an inclusive place for people with and without disabilities, with accessibility powered by artificial intelligence. This post provides an overview of how AI can improve accessibility in different ways, illustrated with real-world applications and examples.

https://tinyurl.com/3s64tvm7

"AI Is about to Turn Book Publishing Upside-down"


The latest generation of AI is a game changer. Not incremental change—something gentle, something gradual: this AI changes everything, fast. Scary fast.

I believe that every function in trade book publishing today can be automated with the help of generative AI. And, if this is true, then the trade book publishing industry as we know it will soon be obsolete. We will need to move on.

https://tinyurl.com/2p9z6pr6

"Top AI Researchers and CEOs Warn against ‘Risk of Extinction’ in 22-Word Statement"


The 22-word statement, trimmed short to make it as broadly acceptable as possible, reads as follows: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."

https://cutt.ly/EwqXnHn9

Pew Research Center: "A Majority of Americans Have Heard of ChatGPT, but Few Have Tried It Themselves"


However, few U.S. adults have themselves used ChatGPT for any purpose. Just 14% of all U.S. adults say they have used it for entertainment, to learn something new, or for their work. This lack of uptake is in line with a Pew Research Center survey from 2021 that found that Americans were more likely to express concerns than excitement about increased use of artificial intelligence in daily life.

https://cutt.ly/Ywqld1X7

"EU’s New AI Law Targets Big Tech Companies but Is Probably Only Going to Harm the Smallest Ones"


In a bold stroke, the EU’s amended AI Act would ban American companies such as OpenAI, Amazon, Google, and IBM from providing API access to generative AI models. The amended act, voted out of committee on Thursday, would sanction American open-source developers and software distributors, such as GitHub, if unlicensed generative models became available in Europe. While the act includes open source exceptions for traditional machine learning models, it expressly forbids safe-harbor provisions for open source generative systems.

Any model made available in the EU, without first passing extensive, and expensive, licensing, would subject companies to massive fines of the greater of €20,000,000 or 4% of worldwide revenue.

(Quote from Technomancers.ai.)

https://bit.ly/3ociZwo

Paywall: "Microsoft Says New A.I. Shows Signs of Human Reasoning"


When computer scientists at Microsoft started to experiment with a new artificial intelligence system last year, they asked it to solve a puzzle that should have required an intuitive understanding of the physical world. . . . The clever suggestion [by the AI] made the researchers wonder whether they were witnessing a new kind of intelligence. In March, they published a 155-page research paper arguing that the system was a step toward artificial general intelligence, or A.G.I., which is shorthand for a machine that can do anything the human brain can do.

https://bit.ly/42FkLp1

"The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces"


Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades. The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices. This paper explores the question "Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces — even for legacy PDFs?" We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers. Through this project, we’ve developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars. We’ve also released a production reading interface for research papers that will incorporate the best features as they mature. We structure this paper around challenges scholars and the public face when reading research papers — Discovery, Efficiency, Comprehension, Synthesis, and Accessibility — and present an overview of our progress and remaining open challenges.

https://arxiv.org/abs/2303.14334

Paywall: "Google is Changing the Way We Search with AI. It Could Upend the Web."


At the same time, the talk of replacing search results with AI-generated answers has roiled the world of people who make their living writing content and building websites. If a chatbot takes over the role of helping people find useful information, what incentive would there be for anyone to write how-to guides, travel blogs or recipes?

https://cutt.ly/s6kmQpF

"Is There a Case for Accepting Machine Translated Scholarly Content in Repositories?"


Multilingualism is a critical characteristic of a healthy, inclusive, and diverse research communications landscape. However, multilingualism presents a particular challenge for the discovery of research outputs. Although researchers and other information seekers may only be able to read in one or two languages, they may want to know about all the relevant research in their area, regardless of the language in which it is published. Conversely, information seekers may want to discover research outputs in their own language(s) more easily. To facilitate this, COAR Task Force on Supporting Multilingualism and non-English Content in Repositories has been developing and promoting good practices for repositories in managing multilingual and non-English content. In the course of our work, the topic of machine translation (MT) has sparked a heated discussion within the Task Group and we would like to share with you the nature of this discussion.

https://bit.ly/42D1nbF

"Quick Poll Results: ARL Member Representatives on Generative AI in Libraries"


We conducted a quick poll of Association of Research Libraries (ARL) member representatives in April 2023 to gather insights into their current perspectives on generative AI adoption, its potential implications, and the role of libraries in AI-driven environments. In this blog post, we summarize, synthesize, and provide recommendations based on the survey responses, aiming to offer valuable insights for senior library directors navigating the AI landscape.

https://bit.ly/3M9yVc2

2023 EDUCAUSE Horizon Report: Teaching and Learning Edition


This report profiles key trends and emerging technologies and practices shaping the future of teaching and learning, and envisions a number of scenarios and implications for that future. . . .

Artificial intelligence (AI) has taken the world by storm, with new AI-powered tools such as ChatGPT opening up new opportunities in higher education for content creation, communication, and learning, while also raising new concerns about the misuses and overreach of technology. Our shared humanity has also become a key focal point within higher education, as faculty and leaders continue to wrestle with understanding and meeting the diverse needs of students and to find ways of cultivating institutional communities that support student well-being and belonging.

https://bit.ly/3panaJd

"Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond"


This paper presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream natural language processing (NLP) tasks. . . . Firstly, we offer an introduction and brief summary of current GPT- and BERT-style LLMs. Then, we discuss the influence of pre-training data, training data, and test data. Most importantly, we provide a detailed discussion about the use and non-use cases of large language models for various natural language processing tasks, such as knowledge-intensive tasks, traditional natural language understanding tasks, natural language generation tasks, emergent abilities, and considerations for specific tasks. We present various use cases and non-use cases to illustrate the practical applications and limitations of LLMs in real-world scenarios. . . . Furthermore, we explore the impact of spurious biases on LLMs and delve into other essential considerations, such as efficiency, cost, and latency, to ensure a comprehensive understanding of deploying LLMs in practice.

https://arxiv.org/abs/2304.13712

"AI Is Tearing Wikipedia Apart"


The current draft policy notes that anyone unfamiliar with the risks of large language models should avoid using them to create Wikipedia content. . . . The community is also divided on whether large language models should be allowed to train on Wikipedia content. While open access is a cornerstone of Wikipedia’s design principles, some worry the unrestricted scraping of internet data allows AI companies like OpenAI to exploit the open web to create closed commercial datasets for their models. This is especially a problem if the Wikipedia content itself is AI-generated, creating a feedback loop of potentially biased information, if left unchecked.

https://bit.ly/3NLrc50

Paywall: "’The Godfather of A.I.’ Leaves Google and Warns of Danger Ahead"


Down the road, he is worried that future versions of the technology pose a threat to humanity because they often learn unexpected behavior from the vast amounts of data they analyze. This becomes an issue, he said, as individuals and companies allow A.I. systems not only to generate their own computer code but actually run that code on their own. And he fears a day when truly autonomous weapons — those killer robots — become reality.

"The idea that this stuff could actually get smarter than people — a few people believed that," he said. "But most people thought it was way off. And I thought it was way off. I thought it was 30 to 50 years or even longer away. Obviously, I no longer think that."

https://bit.ly/3VoA9Dh
