"OpenAI Introduces Sora, Its Text-to-Video AI Model"


The AI company says Sora "can create realistic and imaginative scenes from text instructions." The text-to-video model allows users to create photorealistic videos up to a minute long — all based on prompts they’ve written. . . . The model can also generate a video based on a still image, as well as fill in missing frames on an existing video or extend it.

http://tinyurl.com/y6jfbyd6

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"OpenAI Wants to Eat Google Search’s Lunch"


OpenAI is reportedly developing a search app that would directly compete with Google Search . . . Microsoft Bing would allegedly power the service from Sam Altman, which could be the most serious threat Google Search has ever faced. Current AI-enabled search engines from Google and Perplexity answer your questions with a clear AI-generated answer, usually in one to two sentences. Then, the engine provides links to its sources below, like a hybrid between an AI chatbot and a search engine. The report says this new search product could be faster than ChatGPT, without sacrificing its powerful summarizing abilities.

http://tinyurl.com/yc65hb5p

"The Text File That Runs the Internet"


But robots.txt is not a legal document — and 30 years after its creation, it still relies on the good will of all parties involved. Disallowing a bot on your robots.txt page. . . sends a message, but it’s not going to stand up in court. Any crawler that wants to ignore robots.txt can simply do so, with little fear of repercussions. . . . As the AI companies continue to multiply, and their crawlers grow more unscrupulous, anyone wanting to sit out or wait out the AI takeover has to take on an endless game of whac-a-mole. . . . If AI is in fact the future of search, as Google and others have predicted, blocking AI crawlers could be a short-term win but a long-term disaster.

http://tinyurl.com/5n8s72bz
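
The point that robots.txt "sends a message" but binds no one can be made concrete with a small sketch. It uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would check the file before fetching. The bot names, the rules, and the example.org URL are hypothetical, chosen only for illustration. Nothing here stops a crawler that skips the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the bot names are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/articles/1"  # hypothetical URL
    for agent in ("ExampleAIBot", "ExampleSearchBot"):
        # A compliant crawler runs this check voluntarily before each fetch.
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

The check runs entirely on the crawler's side, which is why compliance depends on good will: the site publishes a preference, and each bot decides whether to honor it.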

"Court Dismisses Authors’ Copyright Infringement Claims Against OpenAI"


Several authors, including comedian Sarah Silverman, have suffered an early loss in their copyright battle against OpenAI. The authors accused OpenAI of using pirated copies of their books to train its models. A California federal court dismissed the vicarious copyright infringement and DMCA violation claims. However, the lawsuit isn’t over yet.

http://tinyurl.com/478vm6kw

2024 EDUCAUSE AI Landscape Study


Moving from reaction to action, higher education stakeholders are currently exploring the opportunities afforded by AI for teaching, learning, and work. . . To aid in these efforts, we present this inaugural EDUCAUSE AI Landscape Study, in which we summarize the higher education community’s current sentiments and experiences related to strategic planning and readiness, policies and procedures, workforce, and the future of AI in higher education.

http://tinyurl.com/4fhprhs6

"WARC-GPT: An Open-Source Tool for Exploring Web Archives Using AI"


Using WARC-GPT, you can ask specific questions in natural language against a collection of WARC files. Rather than relying on keyword searches and metadata filters to sort through search results, WARC-GPT provides a new starting point for search using multi-document full-text search with summarization to explore the contents of web archives. WARC-GPT lists the sources used to generate the response and relevant text excerpts, which you can use to verify the information provided and identify points of interest within a collection of web archives.

http://tinyurl.com/3vvpsyj9

"Generative Artificial Intelligence in Higher Education: Evidence from an Analysis of Institutional Policies and Guidelines"


In this paper we examined documents produced by 116 US universities categorized as high research activity or R1 institutions to comprehensively understand GenAI related advice and guidance given to institutional stakeholders. Through an extensive analysis, we found the majority of universities (N=73, 63%) encourage the use of GenAI and many provide detailed guidance for its use in the classroom (N=48, 41%). More than half of all institutions provided sample syllabi (N=65, 56%) and half (N=58, 50%) provided sample GenAI curriculum and activities that would help instructors integrate and leverage GenAI in their classroom. Notably, most guidance for activities focused on writing, whereas code and STEM-related activities were mentioned half the time and vaguely even when they were (N=58, 50%).

https://arxiv.org/abs/2402.01659
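
As a quick arithmetic check, the percentages quoted in the abstract can be recomputed from the counts and the stated total of 116 institutions. This is a minimal sketch: the category labels are shortened paraphrases, and rounding to the nearest whole percent is an assumption about how the authors reported their figures.

```python
# Counts and percentages as quoted in the abstract; labels are shortened paraphrases.
TOTAL_UNIVERSITIES = 116
REPORTED = {
    "encourage GenAI use": (73, 63),
    "detailed classroom guidance": (48, 41),
    "sample syllabi": (65, 56),
    "sample curriculum and activities": (58, 50),
}


def share(count: int, total: int = TOTAL_UNIVERSITIES) -> int:
    """Percentage of institutions, rounded to the nearest whole percent (assumed)."""
    return round(100 * count / total)


def mismatches(reported=REPORTED) -> list:
    """Labels whose recomputed percentage differs from the reported one."""
    return [label for label, (count, pct) in reported.items() if share(count) != pct]


if __name__ == "__main__":
    for label, (count, pct) in REPORTED.items():
        print(f"{label}: {count}/{TOTAL_UNIVERSITIES} = {share(count)}% (reported {pct}%)")
    print("Mismatches:", mismatches())
```

Under that rounding assumption, all four reported percentages agree with their counts.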

"How (and Why) the University of Michigan Built Its Own Closed Generative AI Tools"


On August 21, U-M did have three unique generative AI tools ready for returning students and employees.

  1. U-M GPT is the tool that most resembles ChatGPT. It is able to answer questions, produce written content, and make recommendations. Additionally, U-M GPT supports multiple commercial and open-source language models and AI art generators, broadening its utility and applications. . . .
  2. U-M Maizey is a no-code platform that allows users to build unique and customized chat programs by using their own datasets in combination with U-M’s AI language models. . . .
  3. U-M GPT Toolkit is designed for AI developers who require full control over the AI model and environment that they are building, training, and hosting. Researchers and developers who want to use the U-M GPT Toolkit must contact the ITS AI team for access.

http://tinyurl.com/yu8ym4j8

White Paper: AI Perceptions at the University of Baltimore


This white paper, produced by the UBalt AI team, explores the perceptions of Artificial Intelligence (AI) and generative AI within the UBalt community. It aims to uncover how students, faculty, and staff view AI’s role and implications in the educational landscape. The university collaborated with Ithaka S+R to acquire established, reliable and valid surveys from the AI literature, which was then adapted by the UBalt AI team to meet the needs of our academic community. This survey included a blend of both quantitative and qualitative questions, ensuring a deep understanding of the respondents’ views. . . . The responses obtained were then analyzed using descriptive and inferential statistics, as well as an exploratory qualitative analysis to extract meaningful insights, setting the stage for informed discussions and decision-making around AI in education.

http://tinyurl.com/mr47zx3j

"Deepfake Scammer Walks off with $25 Million in First-of-Its-Kind AI Heist"


The scam featured a digitally recreated version of the company’s chief financial officer, along with other employees, who appeared in a video conference call instructing an employee to transfer funds.

http://tinyurl.com/9aspy8u7

"Could AI Change the Scientific Publishing Market Once and for All?"


Artificial-intelligence tools in research like ChatGPT are playing an increasingly transformative role in revolutionizing scientific publishing and re-shaping its economic background. They can help academics to tackle such issues as limited space in academic journals, accessibility of knowledge, delayed dissemination, or the exponential growth of academic output. Moreover, AI tools could potentially change scientific communication and academic publishing market as we know them. They can help to promote Open Access (OA) in the form of preprints, dethrone the entrenched journals and publishers, as well as introduce novel approaches to the assessment of research output. It is also imperative that they should do just that, once and for all.

https://arxiv.org/abs/2401.14952

"On the Readiness of Scientific Data for a Fair and Transparent Use in Machine Learning"


To ensure the fairness and trustworthiness of machine learning (ML) systems, recent legislative initiatives and relevant research in the ML community have pointed out the need to document the data used to train ML models. Besides, data-sharing practices in many scientific domains have evolved in recent years for reproducibility purposes. In this sense, the adoption of these practices by academic institutions has encouraged researchers to publish their data and technical documentation in peer-reviewed publications such as data papers. In this study, we analyze how this scientific data documentation meets the needs of the ML community and regulatory bodies for its use in ML technologies. We examine a sample of 4041 data papers of different domains, assessing their completeness and coverage of the requested dimensions, and trends in recent years, putting special emphasis on the most and least documented dimensions. As a result, we propose a set of recommendation guidelines for data creators and scientific data publishers to increase their data’s preparedness for its transparent and fairer use in ML technologies.

https://arxiv.org/abs/2401.10304

"ChatGPT in Medical Libraries, Possibilities and Future Directions: An Integrative Review"


Positioned as a review, our study elucidates the applications of ChatGPT in medical libraries and discusses relevant considerations. The integration of ChatGPT into medical library services holds promise for enhancing information retrieval and user experience, benefiting library users and the broader medical community.

https://doi.org/10.1111/hir.12518

"Launch of Scopus AI to Help Researchers Navigate the World of Research"


Scopus AI is based on Scopus’ trusted content from over 27,000 academic journals, from more than 7,000 publishers worldwide, with over 1.8 billion citations, and includes over 17 million author profiles. Scopus content is vetted by an independent board of world-renowned scientists and librarians who represent the major scientific disciplines.

Since the alpha launch in August 2023, thousands of researchers across the world have tested Scopus AI. Their feedback has reinforced that, as generative AI evolves, researchers want trustworthy, cited research that is relevant and highly personalized to their needs.

Feedback from the research community has led to Scopus AI offering the following powerful features:

  • Expanded and Enhanced Summaries that provide researchers with fast overviews of key topics that they can dig deeper into, sometimes even highlighting gaps in literature. . . .
  • Foundational and Influential Papers that enable researchers to rapidly pinpoint seminal works, navigating academic progress and impact with precision and ease.
  • Academic Expert Search identifies leading experts in their fields and provides explanations of their expertise relevant to the user’s query, helping save time.
  • Enhanced breadth of research, covering ten years of Scopus content to support well-rounded perspective on topics of interest, and improved design to enhance the user experience.

http://tinyurl.com/22f78hv6

Paywall: "Can ChatGPT Identify Predatory Biomedical and Dental Journals? A Cross-Sectional Content Analysis"


ChatGPT may effectively distinguish between predatory and legitimate journals, with accuracy rates of 92.5% and 71%, respectively. The potential utility of large-scale language models in exposing predatory publications is worthy of further consideration.

https://doi.org/10.1016/j.jdent.2024.104840

"‘Very Scary’: Mark Zuckerberg’s Pledge to Build Advanced AI Alarms Experts"


The Meta chief executive has said the company will attempt to build an artificial general intelligence (AGI) system and make it open source, meaning it will be accessible to developers outside the company. The system should be made "as widely available as we responsibly can," he added.

AGI? "An artificial general intelligence (AGI) is a hypothetical type of intelligent agent. If realized, an AGI could learn to accomplish any intellectual task that human beings or animals can perform. Alternatively, AGI has been defined as an autonomous system that surpasses human capabilities in the majority of economically valuable tasks."

http://tinyurl.com/2apt6kh6

"How Can Universities Create AI Tools for their Communities? An Interview with the Creators of UC San Diego’s TritonGPT"


Since then, the University of California San Diego has launched TritonGPT, currently available in beta by invitation only. TritonGPT, a language model with a ChatGPT-like interface, was trained to answer detailed questions about UC San Diego’s policies, procedures, and campus life. Though TritonGPT is designed to serve a purpose very similar to the platforms at Harvard, Michigan, and Knoxville, UC San Diego’s approach stands out because instead of relying on Microsoft’s Azure OpenAI service, TritonGPT is hosted on local infrastructure and optimized for use in administrative operations.

https://tinyurl.com/5t3bwedm

"Google Launches Gemini, the AI Model It Hopes Will Take Down GPT-4"


Gemini is more than a single AI model. There’s a lighter version called Gemini Nano that is meant to be run natively and offline on Android devices. There’s a beefier version called Gemini Pro that will soon power lots of Google AI services and is the backbone of Bard starting today. And there’s an even more capable model called Gemini Ultra that is the most powerful LLM Google has yet created and seems to be mostly designed for data centers and enterprise applications.

https://tinyurl.com/5n8jp9n2

"If Creators Suing AI Companies Over Copyright Win, It Will Further Entrench Big Tech"


The almost certain outcome [of copyright suits against AI companies] (because it’s what happens every other time a similar situation arises) is that there will be one (possibly two) giant entities who will be designated as the "collection society" with whom AI companies will have to negotiate or to just purchase a "training license" and that entity will then collect a ton of money, much of which will go towards "administration," and actual artists will… get a tiny bit. . . .

But, given the enormity of the amount of content, and the structure of this kind of thing, the cost will be extremely high for the AI companies (a few pennies for every creator online can add up in aggregate), meaning that only the biggest of big tech will be able to afford it.

https://tinyurl.com/y4azzsdt

STM: "New White Paper Launch: Generative AI in Scholarly Communications"


The paper looks at the ethical, legal, and practical aspects of GenAI, highlighting its potential to transform scholarly communications, and covers a range of topics from intellectual property rights to the challenges of maintaining integrity in the digital age. The paper provides best-practice principles and recommendations for authors, editorial teams, reviewers, and vendors, ensuring a responsible and ethical approach to the use of GenAI tools.

https://tinyurl.com/4m6m8n9j

"Beyond the Hype Cycle: Experiments with ChatGPT’s Advanced Data Analysis at the Palo Alto City Library"


In June and July of 2023 the Palo Alto City Library’s Digital Services team embarked on an exploratory journey applying Large Language Models (LLMs) to library projects. This article, complete with chat transcripts and code samples, highlights the challenges, successes, and unexpected outcomes encountered while integrating ChatGPT Pro into our day-to-day work.

Our experiments utilized ChatGPT’s Advanced Data Analysis feature (formerly Code Interpreter). The first goal tested the Search Engine Optimization (SEO) potential of ChatGPT plugins. The second goal of this experiment aimed to enhance our web user experience by revising our BiblioCommons taxonomy to better match customer interests and make the upcoming Personalized Promotions feature more relevant. ChatGPT helped us perform what would otherwise be a time-consuming analysis of customer catalog usage to determine a list of taxonomy terms better aligned with that usage.

In the end, both experiments proved the utility of LLMs in the workplace and the potential for enhancing our librarians’ skills and efficiency. The thrill of this experiment was in ChatGPT’s unprecedented efficiency, adaptability, and capacity. We found it can solve a wide range of library problems and speed up project deliverables. The shortcomings of LLMs, however, were equally palpable. Each day of the experiment we grappled with the nuances of prompt engineering, contextual understanding, and occasional miscommunications with our new AI assistant. In short, a new class of skills for information professionals came into focus.

https://journal.code4lib.org/articles/17867

Paywall — NYT: "Ego, Fear and Money: How the A.I. Fuse Was Lit"


The people who were most afraid of the risks of artificial intelligence decided they should be the ones to build it. Then distrust fueled a spiraling competition. . . .

Over dinner, Mr. Gates told them he doubted that large language models could work. He would stay skeptical, he said, until the technology performed a task that required critical thinking: passing an A.P. biology test, for instance. . . .

[Five months later] Mr. Brockman gave the system [GPT-4] a multiple-choice advanced biology test, and Ms. Voss graded the answers. . . .

There were 60 questions. GPT-4 got only one answer wrong.

Mr. Gates sat up in his chair, his eyes opened wide. In 1980, he had a similar reaction when researchers showed him the graphical user interface that became the basis for the modern personal computer. He thought GPT was that revolutionary.

https://tinyurl.com/mvjs3z3k

IFLA AI SIG: Developing a Library Strategic Response to Artificial Intelligence


The strategy most aligned to existing library practices and librarian identities, particularly in university, school and public libraries, is to take a lead role in promoting AI literacy. There is a widespread understanding that the public, as citizens and workers, need to understand the new technologies. Students, whatever discipline they are studying, need such knowledge for employability. . . .

AI literacy is likely to include the ability to identify when AI is being used; to appreciate the differences between narrow and general AI; to understand what types of problem AI is good at solving; to understand how machine learning models are trained. It would also include awareness of ethical issues such as bias, privacy, explainability and social impact.

https://tinyurl.com/s6r6czrh

"AI for Academia: Digital Science Acquires Writefull to Empower Researchers and Publishers"


Writefull’s AI language models are trained on billions of sentences taken from millions of journal articles. Matched with a firm commitment to data privacy, this means its models offer unparalleled assistance to users in academic writing, paraphrasing, copy editing and revisions. . . .

Writefull’s language services are now used by students and researchers at more than 1,500 institutions, and are integrated into the workflows of top publishers and copy editors, such as at the American Chemical Society (ACS), Hindawi, the British Ecological Society, Sage, and the Royal Society of Chemistry (RSC). Writefull’s APIs are also integrated with Digital Science’s collaborative LaTeX editor Overleaf.

https://tinyurl.com/ywyap23p
