"An OpenAI Spinoff Has Built an AI Model That Helps Robots Learn Tasks Like Humans"


Now three of OpenAI’s early research scientists say the startup they spun off in 2017, called Covariant, has solved that problem and unveiled a system that combines the reasoning skills of large language models with the physical dexterity of an advanced robot. . . .

This represents a leap forward, Chen told me, in robots that can adapt to their environment using training data rather than the complex, task-specific code that powered the previous generation of industrial robots. It’s also a step toward worksites where managers can issue instructions in human language without concern for the limitations of human labor. ("Pack 600 meal-prep kits for red pepper pasta using the following recipe. Take no breaks!")

https://tinyurl.com/3nek7xx2

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

Paywall: "The Obscene Energy Demands of A.I."


It’s been estimated that ChatGPT is responding to something like two hundred million requests per day, and, in so doing, is consuming more than half a million kilowatt-hours of electricity. (For comparison’s sake, the average U.S. household consumes twenty-nine kilowatt-hours a day.)

https://tinyurl.com/ynrd4k4p
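
To put the quoted figures in proportion, here is a rough back-of-the-envelope sketch. It uses only the three numbers in the excerpt and treats "more than half a million" as a lower bound of 500,000 kWh. The function names are illustrative, and the results are estimates derived from estimates.

```python
# Figures taken from the excerpt above; the derived values are rough estimates.
REQUESTS_PER_DAY = 200_000_000   # "something like two hundred million requests per day"
CHATGPT_KWH_PER_DAY = 500_000    # "more than half a million kilowatt-hours" (lower bound)
HOUSEHOLD_KWH_PER_DAY = 29       # average U.S. household, per the excerpt


def household_equivalents(total_kwh: float, household_kwh: float) -> float:
    """How many average households use the same energy per day."""
    return total_kwh / household_kwh


def watt_hours_per_request(total_kwh: float, requests: int) -> float:
    """Average energy per request, converted from kWh to Wh."""
    return total_kwh * 1000 / requests


households = household_equivalents(CHATGPT_KWH_PER_DAY, HOUSEHOLD_KWH_PER_DAY)
wh_per_request = watt_hours_per_request(CHATGPT_KWH_PER_DAY, REQUESTS_PER_DAY)

print(f"Household equivalents: {households:,.0f}")       # 17,241
print(f"Watt-hours per request: {wh_per_request:.1f}")   # 2.5
```

On these numbers, ChatGPT's estimated daily consumption is at least that of roughly 17,000 average U.S. households, or about 2.5 watt-hours per request.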

Generative AI in Higher Education: The Product Landscape


Since last fall, Ithaka S+R has been partnering with 19 colleges and universities from the US and Canada to assess GAI’s impact on higher education and make evidence-based, proactive decisions about how to manage the far-ranging effects of GAI. As part of this project, Ithaka S+R has been cataloging GAI applications geared towards teaching, learning, and research in the higher education context. Today, we are excited to make our Product Tracking tool (https://sr.ithaka.org/our-work/generative-ai-product-tracker/) publicly available. . . .

This issue brief is designed to enrich the descriptive data captured in the Product Tracker. In the brief’s first section, we provide a typology of existing products and value propositions. In the second, we offer observations about what the product landscape suggests about the future of teaching, learning, and research practices, and speculations on the near-term future of the academic GAI market.

https://doi.org/10.18665/sr.320394

Paywall: "I Used Generative AI to Turn My Story into a Comic—and You Can Too"


After more than a year in development, Lore Machine is now available to the public for the first time. For $10 a month, you can upload 100,000 words of text (up to 30,000 words at a time) and generate 80 images for short stories, scripts, podcast transcripts, and more. There are price points for power users too, including an enterprise plan costing $160 a month that covers 2.24 million words and 1,792 images. The illustrations come in a range of preset styles, from manga to watercolor to pulp ’80s TV show.

https://tinyurl.com/54mj6t77

OCUL [Ontario Council of University Libraries] Machine Learning/Artificial Intelligence Report and Strategy: Interim Report


This report describes use cases for machine learning relevant to the OCUL consortium and recommends projects utilizing machine learning technologies. It also considers key contextual issues such as ethical concerns, technical capacity, available expertise, and infrastructure needs. All sections are drafts with some sections more fully developed than others.

https://tinyurl.com/38cjdn9p

"Responsible AI at the Vanderbilt Television News Archive: A Case Study"


We provide an overview of the use of machine-learning and artificial intelligence at the Vanderbilt Television News Archive (VTNA). After surveying our major initiatives to date, which include the full transcription of the collection using a custom language model deployed on Amazon Web Services (AWS), we address some ethical considerations we encountered, including the possibility of staff downsizing and misidentification of individuals in news recordings.

https://doi.org/10.7191/jeslib.805

"Using AI/Machine Learning to Extract Data from Japanese American Confinement Records"


Purpose: This paper examines the use of Artificial Intelligence/Machine Learning to extract a more comprehensive data set from a structured “standardized” form used to document Japanese American incarcerees during World War II.

Setting/Participants/Resources: The Bancroft Library partnered with Densho, a community memory organization, and Doxie.AI to complete this work.

Brief Description: The project digitized the complete set of Form WRA-26 "individual record" for more than 110,000 Japanese Americans incarcerated in War Relocation Authority camps during WWII. The library utilized AI/machine learning to automate text extraction from over 220,000 images of a structured "standardized" form; our goal was to improve upon and collect information not previously recorded in the Japanese American Internee Data file held by the National Archives and Records Administration. The project team worked with technical, academic, legal, and community partners to address ethical and logistical issues raised by the data extraction process, and to assess appropriate access options for the dataset(s) and digitized records.

https://doi.org/10.7191/jeslib.850

"The Implementation of Keenious at Carnegie Mellon University"


In the fall of 2022, the Carnegie Mellon University (CMU) Libraries began investigating Keenious, an artificial intelligence (AI)-based article recommender tool, for a possible trial implementation to improve pathways to resource discovery and assist researchers in more effectively searching for relevant research. This process led to numerous discussions within the library regarding the unique nature of AI-based tools when compared with traditional library resources, including ethical questions surrounding data privacy, algorithmic transparency, and the impact on the research process. This case study explores these topics and how they were negotiated up to and immediately following CMU’s implementation of Keenious in January, 2023, and highlights the need for more frameworks for evaluating AI-based tools in academic settings.

https://doi.org/10.7191/jeslib.800

Paywall: "Evaluating the Performance of ChatGPT and Perplexity AI in Business Reference"


The Thomas Mahaffey Jr. Business Library conducted a study to assess the performance of two competing generative AI products, ChatGPT and Perplexity AI, in answering business reference questions. The study used a data set consisting of a sample of anonymized reference questions submitted through the library’s ServiceNow ticketing system between January 2018 and May 2022. The questions were input as prompts to each competing AI. . . . Results showed similar and underwhelming performance between each AI at the composite level. Analysis of scores in each individual scoring dimension showed greater variance in the score distributions between the competing AI. Through the evaluation process, key strengths, weaknesses, and trends emerged between each AI.

https://doi.org/10.1080/08963568.2024.2317534

Paywall: "Leveraging ChatGPT and Bard for Academic Librarians and Information Professionals: A Case Study of Developing Pedagogical Strategies Using Generative AI Models"


This study focuses on improving pedagogical strategies by integrating artificial intelligence (AI) chatbots and library databases. Examples from ChatGPT and Bard were used to demonstrate the quality of information. A cross-examination using a research validation template was conducted; it revealed that no artificial hallucinations were produced. However, the information provided by both AI chatbots was slightly outdated based on organizational changes and did not provide an in-depth analysis of the company.

https://doi.org/10.1080/08963568.2024.2321729

"Anthropic Says Its Latest AI Bot Can Beat Gemini and ChatGPT"


Anthropic, the AI company started by several former OpenAI employees, says the new Claude 3 family of AI models performs as well as or better than leading models from Google and OpenAI. Unlike earlier versions, Claude 3 is also multimodal, able to understand text and photo inputs.

Anthropic says Claude 3 will answer more questions, understand longer instructions, and be more accurate. Claude 3 can understand more context, meaning it can process more information.

https://tinyurl.com/yb3dw8u7

Artificial Intelligence News 03/04/24

"Adobe Is Testing a New AI Tool That Can Create Music From Text Prompts"

"The Best Generative AI Courses Money Can Buy"

"Generative AI Is Challenging a 234-Year-Old Law"

"How to Picture A.I."

"Is ChatGPT Making Scientists Hyper-Productive? The Highs and Lows of Using AI"

"StarCoder Is a Code-Generating AI That Runs on Most GPUs"

"‘Up to 1,000X Faster’: AI Startup Wants to Make GPU Training Obsolete with an Extraordinary Piece of Tech: Meet the Tsetlin Machine Which May Come To a Device near You Sooner than You Think"

"Digital Science Announces Innovative New Research Solutions from Dimensions, Delivering Insights from ~350 Million Publications, Grants, Patents and Clinical Trials through ChatGPT’s Conversational Interface"


With Dimensions Research GPT, any Plus or Enterprise ChatGPT Store subscriber can gain insight from the tens of millions of Open Access publications available through Dimensions, providing a quick and effective search of open research on the user-friendly ChatGPT platform. Search results include clickable links that take users to Dimensions, where they can learn more about source materials.

Dimensions Research GPT Enterprise is available to mutual ChatGPT Enterprise and organization-wide Dimensions customers, and incorporates insights from all publications, grants, clinical trials and patents within the Dimensions database. Queries such as "tell me about clinical trials for malaria vaccines", followed by "tell me about the key research articles that underpin these trials" provide answers that come fully linked to the research literature behind the responses, ensuring that researchers can access and check the source of the information.

https://www.digital-science.com/news/dimensions-research-gpt/

Here is an inquiry I made to Dimensions Research GPT:

Describe the history of the open access movement

The history of the open access movement, which aims to make scholarly research freely available to all, is rich and diverse, involving various stakeholders including researchers, libraries, publishers, and educational institutions. The top results from Dimensions include a range of perspectives and approaches, with a total of 946,585 related documents found.

Perspectives on Open Access Publishing:

Particle Physics and Open Access: Travis C. Brooks (2007) discusses the adoption of open access in particle physics, highlighting the field’s longstanding tradition of preprints and the European efforts towards fully open access publishing models (Brooks, 2007).

Scholarly Publishing System Critique: Molly Kleinman (2017) offers a critical view of the traditional scholarly publishing system and presents the open access movement as a remedy to its drawbacks, emphasizing its benefits for faculty, universities, and the public (Kleinman, 2017).

Academic Publishing Evolution: Bernard Forgues and Sébastien Liarte (2013) reflect on the evolution of academic publishing, discussing the challenges of subscription prices and time embargoes, and how the open access movement aims to reshape scholarly communication (Forgues & Liarte, 2013).

Innovative Open Access Models and Reflections:

Open Access Archives for Education: Bonnie Gordon et al. (2016) describe how Interference Archive uses an open stacks policy for education, allowing for a participatory approach in deciding how histories are told and accessed (Gordon et al., 2016).

Scholarly Knowledge Freedom: Peter Baldwin (2023) examines the past, present, and future of the open access movement, arguing for the free dissemination of scholarly knowledge (Baldwin, 2023).

University Presses and Open Access: Paul Ayris and Lara Speicher (2015) discuss the launch of UCL Press, the UK’s first fully open access university press, highlighting the global movement towards open access and open science (Ayris & Speicher, 2015).

Other Relevant Insights:

MOOCs and Licensing Challenges: David Kernohan (2013) examines the challenges of licensing and intellectual property in the massive open online course (MOOC) movement, advocating for open licenses to maximize content management benefits (Kernohan, 2013).

National Parks and Open Access: Kristian Martinus Mennen (2023) explores the history of the national parks movement in England and Wales, emphasizing its role in providing public access to natural spaces (Mennen, 2023).

In conclusion, the history of the open access movement is characterized by efforts across various fields to make scholarly knowledge freely available, addressing the challenges posed by traditional publishing models and exploring innovative solutions. With 946,585 documents identified, the movement’s impact and evolution continue to be a rich area for exploration and discussion.

This is what Microsoft’s Copilot said:

The Open Access Movement has a rich history spanning several decades. Let’s delve into some key milestones:

Project Gutenberg (1971): Michael Hart launched Project Gutenberg, an initiative to digitize and archive cultural works, making them freely accessible.

Psycoloquy (1989): Stevan Harnad launched the free online journal Psycoloquy, which later became a peer-reviewed journal in January 1990.

The Public-Access Computer Systems Review (1989): Charles W. Bailey Jr. initiated this free online journal, which transitioned to peer-reviewed status in April 1992.

Tim Berners-Lee and the World Wide Web (1990-1991): Tim Berners-Lee proposed the concept of the World Wide Web, leading to its launch in March 1991. This pivotal development facilitated global information sharing.

ArXiv (1991): Physicist Paul Ginsparg established ArXiv, an open repository for preprints in physics and related fields.

Emergence of E-Journals (Early 1990s): Several e-journals, such as Ejournal, Electronic Journal of Communication, and Postmodern Culture, emerged during this period.

Standards for the World Wide Web (1991): Tim Berners-Lee released the standards for the World Wide Web, shaping its growth and accessibility.

Project MUSE (1993): The Milton S. Eisenhower Library, in partnership with Johns Hopkins University Press, launched Project MUSE, a pioneer in online distribution that allowed full-text searches.

Self-Archiving Proposal (1994): Stevan Harnad proposed self-archiving, advocating for authors to freely share their research.

Social Science Research Network (SSRN) (1994): Wayne Marr and Michael Jensen founded SSRN, a platform for sharing working papers and preprints.

Open Access Conversion (1994): The print journal Florida Entomologist transitioned to an open-access format, making all previous issues available for free.

Networked Digital Library of Theses and Dissertations (NDLTD) (1996): Virginia Polytechnic Institute and State University launched NDLTD, promoting open access to theses and dissertations.

Internet Archive (1996): Brewster Kahle established the Internet Archive, preserving digital content for public access.

Medline and PubMed (1997): Medline, initially launched in 1966, became open access and was incorporated into PubMed.

SciELO (1997): SciELO (Scientific Electronic Library Online) began development as a project supported by FAPESP and BIREME.

The Open Access Movement continues to evolve, driven by the vision of making scholarly knowledge freely available to all.

"OpenAI: ‘The New York Times Paid Someone to Hack Us’"


OpenAI accuses The New York Times of paying someone to hack OpenAI’s products. This was allegedly done to gather evidence for the copyright infringement complaint the newspaper filed late last year. This lawsuit fails to meet The Times’ "famously rigorous journalistic standards," the defense argues, asking the New York federal court to dismiss it in part.

http://tinyurl.com/44j8wchb

"Your Organization Isn’t Designed to Work with GenAI"


A good example is the customer service model employed by Jerry, a company valued at $450 million with over five million customers that serves as a one-stop shop for car owners to get insurance and financing. Jerry receives over 200,000 messages a month from customers. With such high volume, the company struggled to respond to customer queries within 24 hours, let alone minutes or seconds. By installing their GenAI solution in May 2023, they moved from having humans in the lead in the entirety of the customer service process and answering only 54% of customer inquiries within 24 hours or less to having AI in the lead 100% of the time and answering over 96% of inquiries within 30 seconds by June 2023. They project $4 million in annual savings from this transformation.

U.S. Copyright Office Update on Its Artificial Intelligence Initiatives


In March 2023, the Office announced a broad initiative to examine the copyright implications of the current forms of generative AI. Although we had previously examined the scope of copyright in works created using AI, the increasing sophistication and public adoption of generative AI tools raised new questions about the process of training and the legal status of the outputs. Our goal was to gather information from a full range of knowledgeable and interested parties in order to produce a report to assist Congress, the courts, and others in formulating policy in this area. In taking this initiative forward, we are monitoring related work being done in other agencies, including the U.S. Patent and Trademark Office (USPTO) and the Federal Trade Commission, and communicating with them on an ongoing basis.

This letter summarizes the Office’s work so far and describes our agenda for the rest of 2024, including the release of the report, updates to the Compendium of U.S. Copyright Office Practices, and the publication of a proposed economic research agenda.

http://tinyurl.com/4tpeyw3t

Paywall: "GPT-4 Developer Tool Can Hack Websites without Human Help"


OpenAI’s artificial intelligence model GPT-4 has the capability to hack websites and steal information from online databases without human help, researchers have found. That suggests individuals or organisations without hacking expertise could unleash AI agents to carry out cyber attacks.

http://tinyurl.com/24u7bnud

"A Descriptive Analysis of AI Guides in Academic Libraries"


The purpose of this work is to do an exploratory descriptive analysis of topics included in Artificial Intelligence (AI) electronic library guides of academic libraries. AI library guides from members of the Association of Research Libraries (ARL) are chosen for selection. Of the 48 libraries randomly selected, 25 have AI guides. Data collected was subject to a content analysis process which included the use of terms co-occurrence using the VOSviewer software. The results show three clusters of terms, which are complemented with the discussion of materials found in the guides.

https://digitalcommons.unl.edu/libphilprac/8033/

"Using ChatGPT-Generated Essays in Library Instruction"


This case study details a library instruction activity developed by a team of academic librarians, which intended to leverage experiential learning to make students and faculty aware of the function, capabilities, and limitations of text-generating artificial intelligence (AI) tools like OpenAI’s ChatGPT. The activity is described, with its development connected to key instructional theories and frameworks.

https://doi.org/10.1016/j.acalib.2024.102863

Generative AI News

Adobe Acrobat Adds Generative AI to "Easily Chat With Documents"

ChatGPT Spat Out Gibberish for Many Users Overnight before OpenAI Fixed It

Exploring ChatGPT for Next-Generation Information Retrieval: Opportunities and Challenges

Generative AI’s Environmental Costs Are Soaring — And Mostly Secret

Google Releases "Open" AI Models after Meta

Generative AI Market Share Reach USD 126.5 Billion by 2031, Impacting Factors That Could Escalate Market’s Rapid Growth

Grammarly and Harris Poll Find Using Generative AI for Communication Could Save Up to $1.6 Trillion Annually in U.S. Productivity

One Month with Microsoft’s AI Vision of the Future: Copilot Pro

OpenAI Can’t Register "GPT" as a Trademark — Yet

Top 20 Generative AI Tools & Applications in 2024

What the EU’s Tough AI Law Means for Research and ChatGPT

Why the New York Times Might Win Its Copyright Lawsuit against OpenAI

"Guest Post — There is More to Reliable Chatbots than Providing Scientific References: The Case of ScopusAI"


In October, my institution was granted access to the Beta version of ScopusAI. I have tested it using a concept connected to my PhD dissertation in physics, an "electromagnon". In this post, I want to share my experience and use it to illustrate the many dimensions the design and assessment of such tools need to consider. . . .

[The author provides an extensive description and analysis of the performed tests as well as their broader implications.]

And if AI is only as good as its underlying data, let’s not forget who owns the scholarly data and regulates access to it. Big scholarly publishers have long been using content as a resource to capitalize on. AI tools amplify existing imbalances in access to scholarly text: if a publisher owns the exclusive right to a text, they can train their own AI on it and make this content unavailable to competing AI projects, profiting from the copyright yet again. Currently, most AI research assistants are grounded with abstracts, but the real value is contained in the full text of articles, and accessing them remains very difficult.

"Will Traffic from Search Engines Fall 25% by 2026?"


Will traffic from search engines to your website drop 25% by 2026 as a result of the rise of generative AI answer engines? That’s what Gartner, a technological research and consulting firm, is predicting. . . . Gartner previously predicted that "organic search traffic will decrease by 50% or more [by 2028] as consumers embrace generative AI-powered search."

http://tinyurl.com/mtxvt8kj

"OpenAI Introduces Sora, Its Text-to-Video AI Model"


The AI company says Sora "can create realistic and imaginative scenes from text instructions." The text-to-video model allows users to create photorealistic videos up to a minute long — all based on prompts they’ve written. . . . The model can also generate a video based on a still image, as well as fill in missing frames on an existing video or extend it.

http://tinyurl.com/y6jfbyd6

"OpenAI Wants to Eat Google Search’s Lunch"


OpenAI is reportedly developing a search app that would directly compete with Google Search . . . Microsoft Bing would allegedly power the service from Sam Altman, which could be the most serious threat Google Search has ever faced. Current AI-enabled search engines from Google and Perplexity answer your questions with a clear AI-generated answer, usually in one to two sentences. Then, the engine provides links to its sources below, like a hybrid between an AI chatbot and a search engine. The report says this new search product could be faster than ChatGPT, without sacrificing its powerful summarizing abilities.

http://tinyurl.com/yc65hb5p

"The Text File That Runs the Internet"


But robots.txt is not a legal document — and 30 years after its creation, it still relies on the good will of all parties involved. Disallowing a bot on your robots.txt page. . . sends a message, but it’s not going to stand up in court. Any crawler that wants to ignore robots.txt can simply do so, with little fear of repercussions. . . . As the AI companies continue to multiply, and their crawlers grow more unscrupulous, anyone wanting to sit out or wait out the AI takeover has to take on an endless game of whac-a-mole. . . . If AI is in fact the future of search, as Google and others have predicted, blocking AI crawlers could be a short-term win but a long-term disaster.

http://tinyurl.com/5n8s72bz
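
The point that robots.txt "relies on the good will of all parties" is easier to see in code. The minimal sketch below uses Python's standard `urllib.robotparser` module to show how a well-behaved crawler might consult the file before fetching a page. The bot name `ExampleAIBot`, the `example.org` URL, and the helper `is_allowed` are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that blocks one AI crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.org/article.html"  # illustrative URL
blocked = is_allowed(ROBOTS_TXT, "ExampleAIBot", page)
allowed = is_allowed(ROBOTS_TXT, "OtherBot", page)

print(blocked, allowed)  # False True
```

The check runs entirely on the crawler's side. A crawler that never calls anything like `is_allowed` can still fetch the page, which is why the file "sends a message" but does not enforce anything.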
