"AI Learns to Write Computer Code in ‘Stunning’ Advance"


After training, AlphaCode solved about 34% of assigned problems, DeepMind reports this week in Science. . . . To further test its prowess, DeepMind entered AlphaCode into online coding competitions. In contests with at least 5000 participants, the system outperformed 45.7% of programmers. The researchers also compared its programs with those in its training database and found it did not duplicate large sections of code or logic. It generated something new—a creativity that surprised Ellis.

bit.ly/3UPpRdr

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

"Are We Undervaluing Open Access by Not Correctly Factoring in the Potentially Huge Impacts of Machine Learning? — An Academic Librarian’s View (I)"


Synopsis: I have recently adjusted my view to the position that the benefits of machine learning techniques are more likely to be real and large. This is based on the recent incredible results of LLMs (large language models) and about a year’s experimenting with some of the newly emerging tools based on such technologies.

If I am right about this, are we academic librarians systematically undervaluing Open Access by not taking this into account sufficiently when negotiating with publishers? Given that we control the purse strings, we are one of the most impactful parties (next to publishers and researchers) that will help decide how fast, if at all, the transition to an Open Access world occurs.

https://cutt.ly/U19MZzK

And It Ran: "I Used ChatGPT to Create an Entire AI Application on AWS"


So in this blog post I describe how I used ChatGPT to create a simple sentiment analysis application from scratch. The app should run on an EC2 instance and utilize a state-of-the-art NLP model from the Hugging Face Model Hub. The results were astonishing.

https://cutt.ly/a1Z6c8i

ChatGPT: "Finally, an A.I. Chatbot That Reliably Passes ‘the Nazi Test’"


A chatbot that meets the hype is finally here. On Thursday, OpenAI released ChatGPT, a bot that converses with humans via cutting-edge artificial intelligence. The bot can help you write code, compose essays, dream up stories, and decorate your living room. And that’s just what people discovered on day one.

https://cutt.ly/d1XqQKN

"The Scary Truth about AI Copyright Is Nobody Knows What Will Happen Next"


First, can you copyright the output of a generative AI model, and if so, who owns it? Second, if you own the copyright to the input used to train an AI, does that give you any legal claim over the model or the content it creates? Once these questions are answered, an even larger one emerges: how do you deal with the fallout of this technology? What kind of legal restraints could—or should—be put in place on data collection? And can there be peace between the people building these systems and those whose data is needed to create them?

https://cutt.ly/UM9vOJK

"We Could Run Out Of Data to Train AI Language Programs"


The trouble is, the types of data typically used for training language models may be used up in the near future—as early as 2026, according to a paper by researchers from Epoch, an AI research and forecasting organization, that is yet to be peer reviewed. The issue stems from the fact that, as researchers build more powerful models with greater capabilities, they have to find ever more texts to train them on. Large language model researchers are increasingly concerned that they are going to run out of this sort of data, says Teven Le Scao, a researcher at AI company Hugging Face, who was not involved in Epoch’s work.

https://cutt.ly/L1Wj6of

"Applying AI to Digital Archives: Trust, Collaboration and Shared Professional Ethics"


Policy makers produce digital records on a daily basis. A selection of records is then preserved in archival repositories. However, getting access to these archival materials is extremely complicated for many reasons—including data protection, sensitivity, national security, and copyright. Artificial Intelligence (AI) can be applied to archives to make them more accessible, but it is still at an experimental stage. While skills gaps contribute to keeping archives ‘dark’, it is also essential to examine issues of mistrust and miscommunication. This article argues that although civil servants, archivists, and academics have similar professional principles articulated through professional codes of ethics, these are not often communicated to each other. This lack of communication leads to feelings of mistrust between stakeholders. Mistrust of technology also contributes to the barriers to effective implementation of AI tools. Therefore, we propose that surfacing the shared professional ethics between stakeholders can contribute to deeper collaborations between humans. In turn, these collaborations can lead to the building of trust in AI systems and tools. The research is informed by semi-structured interviews with thirty government professionals, archivists, historians, digital humanists, and computer scientists. Previous research has largely focused on preservation of digital records, rather than access to these records, and on archivists rather than records creators such as government professionals. This article is the first to examine the application of AI to digital archives as an issue that requires trust and collaboration across the entire archival circle (from record creators to archivists, and from archivists to users).

https://doi.org/10.1093/llc/fqac073

"Meta’s Game-Playing AI Can Make and Break Alliances Like a Human"


Learning to play Diplomacy is a big deal for several reasons. Not only does it involve multiple players, who make moves at the same time, but each turn is preceded by a brief negotiation in which players chat in pairs in an attempt to form alliances or gang up on rivals. After this round of negotiation, players then decide what pieces to move—and whether to honor or renege on a deal.

https://cutt.ly/c1bEU9c

Ethics of Artificial Intelligence: Case Studies and Options for Addressing Ethical Challenges


This open access collection of AI ethics case studies is the first book to present real-life case studies combined with commentaries and strategies for overcoming ethical challenges. Case studies are one of the best ways to learn about ethical dilemmas and to achieve insights into various complexities and stakeholder perspectives.

https://cutt.ly/z1zj5Oy

"When AI Can Make Art—What Does It Mean for Creativity?"


While internet users have embraced this supercharged creative potential—armed with the correctly refined prompt, even novices can now create arresting digital canvases—some artists have balked at the new technology’s capacity for mimicry. Among the prompts entered into image generators Stable Diffusion and Midjourney, many tag an artist’s name in order to ensure a more aesthetically pleasing style for the resulting image. Something as mundane as a bowl of oranges can become eye-catching if rendered in the style of, say, Picasso. Because the AI has been trained on billions of images, some of which are copyrighted works by living artists, it can generally create a pretty faithful approximation.

https://cutt.ly/iMv27Pn

Microsoft, GitHub, and OpenAI Sued: "The Lawsuit That Could Rewrite the Rules of AI Copyright"


Microsoft, its subsidiary GitHub, and its business partner OpenAI have been targeted in a proposed class action lawsuit alleging that the companies’ creation of AI-powered coding assistant GitHub Copilot relies on "software piracy on an unprecedented scale." . . . Copilot, which was unveiled by Microsoft-owned GitHub in June 2021, is trained on public repositories of code scraped from the web, many of which are published with licenses that require anyone reusing the code to credit its creators. Copilot has been found to regurgitate long sections of licensed code without providing credit—prompting this lawsuit that accuses the companies of violating copyright law on a massive scale.

https://cutt.ly/FMwC4mR

"An AI Toolkit for Libraries"


Now that artificial intelligence (AI) tools are being widely used across academic publishing, how can we make informed assessments of these utilities? There is a need for a set of skills for evaluating new tools and measuring existing ones, which should enable anyone commissioning or managing AI utilities to understand what questions to ask, what parameters to measure and possible pitfalls to avoid when introducing a new utility. The skills required are not technical. Potential problems include bias in the corpus, a poor training set or poor use of metrics for evaluation. This article gives a quick overview of some of the areas where AI tools are being used and how they work. It then provides a checklist for assessment. The goal is not to discredit AI, but to make effective use of it.

http://doi.org/10.1629/uksg.592

Paywall: "‘So How Do We Balance All of These Needs?’: How the Concept of AI Technology Impacts Digital Archival Expertise"


Four main themes were identified: fitting AI into day to day practice; the responsible use of (AI) technology; managing expectations (about AI adoption) and bias associated with the use of AI. The analysis suggests that AI adoption combined with hindsight about digitisation as a disruptive technology might provide archival practitioners with a framework for re-defining, advocating and outlining digital archival expertise.

https://doi.org/10.1108/JD-08-2022-0170

Google—Text Prompts Create Videos (with Live Examples): "Imagen Video: High Definition Video Generation with Diffusion Models"


We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. . . . We find Imagen Video not only capable of generating videos of high fidelity, but also having a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with 3D object understanding.

https://cutt.ly/aBzo4R2

Openly Licensed Photos and AI Facial Recognition White Paper: AI_Commons

"This white paper presents the case of using openly licensed photographs for AI facial recognition training datasets. . . . The case creates an opportunity to ask fundamental questions about the challenges that open licensing faces today, related to privacy, exploitation of the commons at massive scales of use, or dealing with unexpected and unintended uses of works that are openly licensed."

https://cutt.ly/pBuHEmH

"NIH Launches Bridge2AI Program to Expand the Use of Artificial Intelligence in Biomedical and Behavioral Research"

https://cutt.ly/gVPCJVw
