"STM Statement Regarding Unlicensed Use of STM’s Members’ Content in the Training, Development, and Operation of AI Models"


The unlicensed use of STM’s members’ content in the training, development, and operation of AI models is of great concern to STM and to our members. Because STM’s members do not share a single jurisdiction, the particular actions and practices of a given AI developer with respect to a given domestic copyright law are too varied to enumerate here. However, regardless of legal nuances among jurisdictions, STM considers the conclusion to be the same — the collection of our members’ content and its use in AI training without authorization, compensation or attribution, amounts to infringement. We support the statements about third parties’ use of content in generative AI training and development that have been made by our sister organizations the International Publishers Association and the UK Publishers Association.

https://tinyurl.com/5n6zh9sy

| Artificial Intelligence |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Google’s Wrong Answer to the Threat of AI — Stop Indexing Content"


"Google is no longer trying to index the entire web," writes Schmalbach [Vincent Schmalbach, SEO expert]. "In fact, it’s become extremely selective, refusing to index most content. This isn’t about content creators failing to meet some arbitrary standard of quality. Rather, it’s a fundamental change in how Google approaches its role as a search engine." The default setting from now on will be not to index content unless it is genuinely unique, authoritative and has ‘brand recognition’.

https://tinyurl.com/32t98fhu

"ARL & CNI Release Deluxe Edition of AI-Influenced Future Scenarios for Research Environment"


This Deluxe Edition of the ARL/CNI AI Scenarios includes:

  • The Final Scenario Set: This final scenario set explores potential futures where AI plays a pivotal role, providing critical insights into the evolving challenges and opportunities for the research environment.
  • The Strategic Context Report: This report summarizes community feedback gathered through focus groups and interviews about an AI-influenced future for the research environment that were held in winter 2023–24 and spring 2024.
  • The Provocateur Interview Report: Featuring forward-thinking dialogues with industry leaders, these interviews challenge conventional wisdom and stimulate stretch thinking with regards to an AI-influenced future.

https://tinyurl.com/5n7xwc8c

"A Real-World Test of Artificial Intelligence Infiltration of a University Examinations System: A ‘Turing Test’ Case Study"


The recent rise in artificial intelligence systems, such as ChatGPT, poses a fundamental problem for the educational sector. In universities and schools, many forms of assessment, such as coursework, are completed without invigilation. Therefore, students could hand in work as their own which is in fact completed by AI. Since the COVID pandemic, the sector has additionally accelerated its reliance on unsupervised ‘take home exams’. If students cheat using AI and this is undetected, the integrity of the way in which students are assessed is threatened. We report a rigorous, blind study in which we injected 100% AI written submissions into the examinations system in five undergraduate modules, across all years of study, for a BSc degree in Psychology at a reputable UK university. We found that 94% of our AI submissions were undetected. The grades awarded to our AI submissions were on average half a grade boundary higher than that achieved by real students. Across modules there was an 83.4% chance that the AI submissions on a module would outperform a random selection of the same number of real student submissions.

https://doi.org/10.1371/journal.pone.0305354
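
As a rough illustration of the kind of comparison behind the 83.4% figure in the abstract above, the following sketch estimates by resampling how often a set of AI submissions would beat an equally sized random draw of real submissions. The function name, the grades, and the choice to compare by mean are all assumptions made for illustration; the abstract does not specify the study's exact procedure.

```python
import random
from statistics import mean


def prob_ai_outperforms(ai_grades, student_grades, n_draws=10_000, seed=0):
    """Estimate how often the AI grades beat a random same-size student sample.

    Hypothetical helper: compares the mean AI grade with the mean of a random
    draw (without replacement) of len(ai_grades) real student grades.
    """
    k = len(ai_grades)
    if k == 0 or k > len(student_grades):
        raise ValueError("need between 1 and len(student_grades) AI grades")
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    ai_mean = mean(ai_grades)
    wins = 0
    for _ in range(n_draws):
        sample = rng.sample(student_grades, k)
        if ai_mean > mean(sample):
            wins += 1
    return wins / n_draws


# Made-up grades for one module (not data from the study).
ai = [66, 68, 71]
students = [52, 55, 58, 60, 62, 64, 65, 67, 70, 74]
print(prob_ai_outperforms(ai, students))
```

The estimate depends entirely on the grade lists supplied; with real data one would run it once per module and then summarize across modules.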

"RIAA Sues Suno & Udio AI Music Generators For ‘Trampling’ on Copyright"


Major recording labels of the RIAA have filed a pair of broadly similar copyright lawsuits against two key generative AI music services. The owners of Udio and Suno stand accused of copying the labels’ music on a massive scale and the labels suggest that they’re already on the back foot. In pre-litigation correspondence, both were ‘evasive’ on content sources before citing fair use, which the RIAA notes only arises as a defense in cases of unauthorized use of copyright works.

https://tinyurl.com/p9tnycte

See also: "World’s Biggest Music Labels Sue Over AI Copyright."

Paywall: "AI Is Exhausting the Power Grid. Tech Firms Are Seeking a Miracle Solution."


In addition to fusion, tech giants are hoping to generate power through such futuristic schemes as small nuclear reactors hooked to individual computing centers and machinery that taps geothermal energy by boring 10,000 feet into the Earth’s crust. . . .

A recent Goldman Sachs analysis of energy that will power the AI boom into 2030. . . found data centers will account for 8 percent of total electricity use in the United States by 2030, a near tripling of their share today. New solar and wind energy will meet about 40 percent of that new power demand from data centers, the forecast said, while the rest will come from a vast expansion in the burning of natural gas. The new emissions created would be comparable to that of putting 15.7 million additional gas-powered cars on the road.

https://tinyurl.com/5fhwpc36

"Empowering Knowledge through AI: Open Scholarship Proactively Supporting Well Trained Generative AI"


Generative AI has taken the world by storm over the last few years, and the world of scholarly communications has not been immune to this. Most discussions in this area address how we can integrate these tools into our workflows, concerns about how researchers and students might misuse the technology or the unauthorised use of copyrighted work. This article argues for a novel viewpoint that librarians and publishers should be encouraging the use of their scholarly content in the training of AI algorithms. Inclusion of scholarly works would advance the reliability and accuracy of the information in training datasets and ensure that this content is included in new knowledge discovery platforms. The article also argues that inclusion can be achieved by improving linkage to content, and, by making sure that licences explicitly allow inclusion in AI training datasets, it advocates for a more collaborative approach to shaping the future of the information landscape in academia.

https://doi.org/10.1629/uksg.649

"Why Does AI Hallucinate?"


But none of these techniques will stop hallucinations fully. As long as large language models are probabilistic, there is an element of chance in what they produce. Roll 100 dice and you’ll get a pattern. Roll them again and you’ll get another. Even if the dice are, like large language models, weighted to produce some patterns far more often than others, the results still won’t be identical every time. Even one error in 1,000—or 100,000—adds up to a lot of errors when you consider how many times a day this technology gets used.

https://tinyurl.com/2w2y3d94
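
The dice analogy and the closing arithmetic in the excerpt above can be sketched in a few lines. The weights and the daily usage figure are illustrative assumptions, not numbers from the article.

```python
import random

FACES = [1, 2, 3, 4, 5, 6]


def roll_weighted_dice(n_dice, weights, seed):
    """Roll n_dice six-sided dice, drawing each face with the given weights."""
    rng = random.Random(seed)
    return rng.choices(FACES, weights=weights, k=n_dice)


def errors_per_day(one_in, uses_per_day):
    """Expected errors per day if one use in `one_in` goes wrong."""
    return uses_per_day / one_in


# Hypothetical weighting: a six is far more likely than any other face.
weights = [1, 1, 1, 1, 1, 20]
first = roll_weighted_dice(100, weights, seed=1)
second = roll_weighted_dice(100, weights, seed=2)
# Both rolls are dominated by sixes, but the exact patterns need not match.
print(first.count(6), second.count(6))

# Hypothetical volume of 10 million uses per day.
HYPOTHETICAL_USES_PER_DAY = 10_000_000
print(errors_per_day(1_000, HYPOTHETICAL_USES_PER_DAY))    # 10000.0
print(errors_per_day(100_000, HYPOTHETICAL_USES_PER_DAY))  # 100.0
```

Even at the lower error rate, the assumed volume still yields a steady stream of errors every day, which is the excerpt's point.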

"Springer Nature Unveils Two New AI Tools to Protect Research Integrity"


Geppetto works by dividing the paper up into sections and uses its own algorithms to check the consistency of the text in each section. The sections are then given a score based on the probability that the text in them has been AI generated. The higher the score, the greater the probability of there being problems, initiating a human check by Springer Nature staff. Geppetto is already responsible for identifying hundreds of fake papers soon after submission, preventing them from being published — and from taking up editors’ and peer reviewers’ valuable time. . . .

SnappShot, also developed in-house, is an AI-assisted image integrity analysis tool. Currently used to analyse PDF files containing gel and blot images and look for duplications in those image types — another known integrity problem within the industry — this will be expanded to cover additional image types and integrity problems and speed up checks on papers.

https://tinyurl.com/3uxbvans

"Taylor & Francis Issues Expanded Guidance on AI Application for Authors, Editors and Reviewers"


Taylor & Francis has issued the latest iteration of its policy on the application of AI tools. The policy aims to promote ethical and transparent use of AI, while addressing the risks and challenges it can pose for research publishing.

From the policy:

Authors must clearly acknowledge within the article or book any use of Generative AI tools through a statement which includes: the full name of the tool used (with version number), how it was used, and the reason for use. For article submissions, this statement must be included in the Methods or Acknowledgments section. Book authors must disclose their intent to employ Generative AI tools at the earliest possible stage to their editorial contacts for approval — either at the proposal phase if known, or if necessary, during the manuscript writing phase. If approved, the book author must then include the statement in the preface or introduction of the book.

https://tinyurl.com/h3rfkynm

"Apple Is Putting ChatGPT in Siri for Free Later This Year"


Apple is partnering with OpenAI to put ChatGPT into Siri, the company announced at its WWDC 2024 keynote on Monday.

ChatGPT will be available for free in iOS 18 and macOS Sequoia later this year without an account, and Apple says that user queries won’t be logged. The popular chatbot will also be integrated into Apple’s systemwide writing tools.

https://tinyurl.com/29bs35b3

Author: Charles W. Bailey, Jr. | Categories: Artificial Intelligence/Robots

Generative AI: "The Impossibility of Fair LLMs"


The need for fair AI is increasingly clear in the era of general-purpose systems such as ChatGPT, Gemini, and other large language models (LLMs). However, the increasing complexity of human-AI interaction and its social impacts have raised questions of how fairness standards could be applied. Here, we review the technical frameworks that machine learning researchers have used to evaluate fairness, such as group fairness and fair representations, and find that their application to LLMs faces inherent limitations. We show that each framework either does not logically extend to LLMs or presents a notion of fairness that is intractable for LLMs, primarily due to the multitudes of populations affected, sensitive attributes, and use cases. To address these challenges, we develop guidelines for the more realistic goal of achieving fairness in particular use cases: the criticality of context, the responsibility of LLM developers, and the need for stakeholder participation in an iterative process of design and evaluation. Moreover, it may eventually be possible and even necessary to use the general-purpose capabilities of AI systems to address fairness challenges as a form of scalable AI-assisted alignment.

https://arxiv.org/abs/2406.03198

"Towards Conversational Discovery: New Discovery Applications for Scholarly Information in the Era of Generative Artificial Intelligence"


Here, we. . . discuss how GenAI is moving us towards conversational discovery and what this might mean for publishing, as well as potential future trends in information discovery.

AI-powered features include natural language search, concise summaries, and synthesis of research. . . .

It [Scopus AI] has the ability to use keywords from research abstracts to generate concept maps for each query. Dimensions Assistant offers well-structured explanations. . . researchers can receive notifications each time content is generated . . . .

There are two types of AI/GenAI powered discovery systems: AI+ refers to native applications which can only be built based on GenAI (such as ChatGPT and Perplexity.ai), while +AI means AI/GenAI can be integrated to improve existing discovery tools and search engines such as Google and Bing.

https://tinyurl.com/53chtzu7

"4 Types of Gen AI Risk and How to Mitigate Them"


Risk around using gen AI can be classified based on two factors: intent and usage. Accidental misapplication of gen AI is different from deliberate malpractices (intent). Similarly, using gen AI tools to create content is differentiated from consuming content that other parties may have created with gen AI (usage). To mitigate the risk of gen AI content misuse and misapplication, organizations need to develop the capabilities to detect, identify, and prevent the spread of such potentially misleading content.

https://tinyurl.com/3shctfct

Generative AI Issues in Scholarly Publishing: "Guest Post: Jagged Edges of Conversational Interfaces Over Scholarly and Professional Content"


The fundamental tension is that unlike web distribution of static content, which has enormous scale advantages due to very low marginal costs, the RAG [Retrieval-Augmented Generation] pattern has high marginal costs (10-1000X) that scale linearly. While token costs remain high, for general scholarly applications outside of specialty practitioners, the central business or product challenge will be how to generate sufficient incremental revenue to offset the vastly higher compute costs to use GenAI technology to generate responses to queries.

https://tinyurl.com/38x432h7
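
The cost tension described in the excerpt above can be made concrete with a toy model. Everything here is an assumption for illustration: costs are in arbitrary units, the function names are hypothetical, and the only figures taken from the excerpt are the 10-1000X multipliers and the linear scaling.

```python
def total_cost(fixed_cost, marginal_cost, n_requests):
    """Linear cost model: a fixed cost plus a per-request marginal cost."""
    return fixed_cost + marginal_cost * n_requests


def required_revenue_per_query(static_marginal_cost, cost_multiplier):
    """Extra revenue per query needed to cover the added compute cost.

    cost_multiplier is how many times more a generated response costs than
    serving the same request from static content.
    """
    return static_marginal_cost * (cost_multiplier - 1)


# Hypothetical baseline: serving one static page costs 1 arbitrary unit.
STATIC_MARGINAL_COST = 1
for multiplier in (10, 100, 1000):
    extra = required_revenue_per_query(STATIC_MARGINAL_COST, multiplier)
    print(multiplier, extra)
```

Because the marginal term grows with every query, added usage does not dilute the cost the way it does for static content; each response has to pay for itself.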

ARL Poll: "AI and Libraries: Strengths in a Digital Tomorrow"


The poll results from the ARL/CNI 2035 Scenarios exploration reveal diverse strengths that research libraries can harness as they navigate AI-influenced futures. These strengths underscore libraries’ vital role in maintaining information integrity and ensuring equitable access amidst the challenges posed by AI advancements. For libraries, these insights emphasize the importance of continuing to build on these core competencies while staying adaptive and responsive to emerging technological trends. Leveraging the ARL/CNI 2035 Scenarios and continued attention to the broader strategic landscape will enable libraries to be proactive and remain relevant and effective as custodians of knowledge in an increasingly digital and AI-driven world.

https://tinyurl.com/38mmuxnb

"Generative AI in Higher Education: A Global Perspective of Institutional Adoption Policies and Guidelines"


Integrating generative AI (GAI) into higher education is crucial for preparing a future generation of GAI-literate students. Yet a thorough understanding of the global institutional adoption policy remains absent, with most of the prior studies focused on the Global North and the promises and challenges of GAI, lacking a theoretical lens. This study utilizes the Diffusion of Innovations Theory to examine GAI adoption strategies in higher education across 40 universities from six global regions. It explores the characteristics of GAI innovation, including compatibility, trialability, and observability, and analyses the communication channels and roles and responsibilities outlined in university policies and guidelines. The findings reveal a proactive approach by universities towards GAI integration, emphasizing academic integrity, teaching and learning enhancement, and equity. Despite a cautious yet optimistic stance, a comprehensive policy framework is needed to evaluate the impacts of GAI integration and establish effective communication strategies that foster broader stakeholder engagement. The study highlights the importance of clear roles and responsibilities among faculty, students, and administrators for successful GAI integration, supporting a collaborative model for navigating the complexities of GAI in education. This study contributes insights for policymakers in crafting detailed strategies for its integration.

https://arxiv.org/abs/2405.11800

International Scientific Report on the Safety of Advanced AI


The interim report highlights several key takeaways, including:

  • General-purpose AI can be used to advance the public interest, leading to enhanced wellbeing, prosperity, and scientific discoveries.
  • According to many metrics, the capabilities of general-purpose AI are advancing rapidly. Whether there has been significant progress on fundamental challenges such as causal reasoning is debated among researchers.
  • Experts disagree on the expected pace of future progress of general-purpose AI capabilities, variously supporting the possibility of slow, rapid, or extremely rapid progress.
  • There is limited understanding of the capabilities and inner workings of general-purpose AI systems. Improving our understanding should be a priority.
  • Like all powerful technologies, current and future general-purpose AI can be used to cause harm. For example, malicious actors can use AI for large-scale disinformation and influence operations, fraud, and scams.
  • Malfunctioning general-purpose AI can also cause harm, for instance through biassed decisions with respect to protected characteristics like race, gender, culture, age, and disability.
  • Future advances in general-purpose AI could pose systemic risks, including labour market disruption, and economic power inequalities. Experts have different views on the risk of humanity losing control over AI in a way that could result in catastrophic outcomes.
  • Several technical methods (including benchmarking, red-teaming and auditing training data) can help to mitigate risks, though all current methods have limitations, and improvements are required.
  • The future of AI is uncertain, with a wide range of scenarios appearing possible. The decisions of societies and governments will significantly impact its future.

Report

https://tinyurl.com/3h7bdvzr

"Project Astra is the Future of AI at Google"


Google calls it Project Astra, and it’s a real-time, multimodal AI assistant that can see the world, knows what things are and where you left them, and can answer questions or help you do almost anything. In an incredibly impressive demo video that Hassabis swears is not faked or doctored in any way, an Astra user in Google’s London office asks the system to identify a part of a speaker, find their missing glasses, review code, and more. It all works practically in real time and in a very conversational way.

https://tinyurl.com/bkkfaxrd

"OpenAI Announces New Multimodal Desktop GPT with New Voice and Vision Capabilities"


GPT-4o can recognize and respond to screenshots, photos, documents, or charts uploaded to it. The new GPT-4o model can also recognize facial expressions and information written by hand on paper. OpenAI said the improved model and accompanying chatbot can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, "which is similar to human response time in a conversation". . . .

It displayed a better conversational capability, where users can interrupt it and begin new or modified queries, and it is also versed in 50 languages. In one onstage live demonstration, the Voice Mode was able to translate back and forth between Murati speaking Italian and Barret Zoph, OpenAI’s head of post-training, speaking English.

https://tinyurl.com/576busr4

"Guidance Needed for Using Artificial Intelligence to Screen Journal Submissions for Misconduct"


Journals and publishers are increasingly using artificial intelligence (AI) to screen submissions for potential misconduct, including plagiarism and data or image manipulation. While using AI can enhance the integrity of published manuscripts, it can also increase the risk of false/unsubstantiated allegations. Ambiguities related to journals’ and publishers’ responsibilities concerning fairness and transparency also raise ethical concerns. In this Topic Piece, we offer the following guidance: (1) All cases of suspected misconduct identified by AI tools should be carefully reviewed by humans to verify accuracy and ensure accountability; (2) Journals/publishers that use AI tools to detect misconduct should use only well-tested and reliable tools, remain vigilant concerning forms of misconduct that cannot be detected by these tools, and stay abreast of advancements in technology; (3) Journals/publishers should inform authors about irregularities identified by AI tools and give them a chance to respond before forwarding allegations to their institutions in accordance with Committee on Publication Ethics guidelines; (4) Journals/publishers that use AI tools to detect misconduct should screen all relevant submissions and not just random/purposefully selected submissions; and (5) Journals should inform authors about their definition of misconduct, their use of AI tools to detect misconduct, and their policies and procedures for responding to suspected cases of misconduct.

https://doi.org/10.1177/17470161241254052

"AI Deception: A Survey of Examples, Risks, and Potential Solutions"


This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta’s CICERO) and general-purpose AI systems (including large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI. Finally, we outline several potential solutions: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.

https://doi.org/10.1016/j.patter.2024.100988

AI-native Platform: "Reimagining Research Impact: Introducing Web of Science Research Intelligence"


Currently being developed in partnership with leading academic institutions, Web of Science Research Intelligence is an AI-native platform that embodies a vision centered on three pillars: unification, innovation and impact. It seamlessly integrates funding data with research outputs that include publications, patents, conference proceedings, books, policy documents and more. Based on these data, the platform identifies relevant funding opportunities within emerging research areas, equipping institutions and researchers to innovate.

  • A conversational assistant powered by generative AI enables all users to gain insights and create qualitative narratives for more balanced impact assessment, from data scientists to those with limited analysis experience.
  • Tailored recommendations for collaboration and funding help early career researchers build their networks and all researchers position themselves to win.
  • A new framework for measuring societal impact beyond traditional citation metrics will empower researchers and institutions to showcase the broader impacts of their work.

https://tinyurl.com/2zdshm6b

"A Literature Review of User Privacy Concerns in Conversational Chatbots: A Social Informatics Approach: An Annual Review of Information Science and Technology (ARIST) Paper"


Since the introduction of OpenAI’s ChatGPT-3 in late 2022, conversational chatbots have gained significant popularity. These chatbots are designed to offer a user-friendly interface for individuals to engage with technology using natural language in their daily interactions. However, these interactions raise user privacy concerns due to the data shared and the potential for misuse in these conversational information exchanges. Furthermore, there are no overarching laws and regulations governing such conversational interfaces in the United States. Thus, there is a need to investigate the user privacy concerns. To understand these concerns in the existing literature, this paper presents a literature review and analysis of 38 papers out of 894 retrieved papers that focus on user privacy concerns arising from interactions with text-based conversational chatbots through the lens of social informatics. The review indicates that the primary user privacy concern that has consistently been addressed is self-disclosure. This review contributes to the broader understanding of privacy concerns regarding chatbots and the need for further exploration in this domain. As these chatbots continue to evolve, this paper acts as a foundation for future research endeavors and informs potential regulatory frameworks to safeguard user privacy in an increasingly digitized world.

https://doi.org/10.1002/asi.24898
