"[AAP] Publishers File Brief Opposing Internet Archive Appeal of Loss"


Controlled digital lending is a frontal assault on the foundational copyright principle that rightsholders exclusively control the terms of sale for every different format of their work — a principle that has spawned the broad diversity in formats of books, movies, television and music that consumers enjoy today.

"[T]here is no resemblance between IA’s conversion of millions of print books into ebooks and the historical practice of lending print books. Nor does IA’s distribution of ebooks without paying authors and their publishers a dime conform with the modern practices of libraries, which acquire licenses to lend ebooks to their local communities and enjoy the benefits of digital distribution lawfully."

The Internet Archive ("IA") operates a mass-digitization enterprise in which it copies millions of complete, in-copyright print books and distributes the resulting bootleg ebooks from its website to anyone in the world for free. Granting summary judgment, the District Court properly held that IA’s infringement is not saved by fair use as each of the four factors weighs against IA under longstanding case law.

https://tinyurl.com/5ah5vx3x

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"Fair Use Rights to Conduct Text and Data Mining and Use Artificial Intelligence Tools Are Essential for UC Research and Teaching"


The UC Libraries invest more than $60 million each year licensing systemwide electronic content needed by scholars for these and other studies. (Indeed, the $60 million figure represents license agreements made at the UC systemwide and multi-campus levels. But each individual campus also licenses electronic resources, adding millions more in total expenditures.) Our libraries secure campus access to a broad range of digital resources including books, scientific journals, databases, multimedia resources, and other materials. In doing so, the UC Libraries must negotiate licensing terms that ensure scholars can make both lawful and comprehensive use of the materials the libraries have procured. Increasingly, however, publishers and vendors are presenting libraries with content license agreements that attempt to preclude, or charge additional and unsupportable fees for, fair uses like training AI tools in the course of conducting TDM. . . .

If the UC Libraries are unable to protect these fair uses, UC scholars will be at the mercy of publishers aggregating and controlling what may be done with the scholarly record. Further, UC scholars’ pursuit of knowledge will be disproportionately stymied relative to academic colleagues in other global regions, given that a large proportion of other countries preclude contractual override of research exceptions.

Indeed, in more than forty countries—including all those within the European Union (EU)—publishers are prohibited from using contracts to abrogate exceptions to copyright in non-profit scholarly and educational contexts. Article 3 of the EU’s Directive on Copyright in the Digital Single Market preserves the right for scholars within research organizations and cultural heritage institutions (like those researchers at UC) to conduct TDM for scientific research, and further proscribes publishers from invalidating this exception by license agreements (see Article 7). Moreover, under AI regulations recently adopted by the European Parliament, copyright owners may not opt out of having their works used in conjunction with artificial intelligence tools in TDM research—meaning copyrighted works must remain available for scientific research that is reliant on AI training, and publishers cannot override these AI training rights through contract. Publishers are thus obligated to—and do—preserve fair use-equivalent research exceptions for TDM and AI within the EU, and can do so in the United States, too. . . .

In all events, adaptable licensing language can address publishers’ concerns by reiterating that the licensed products may be used with AI tools only to the extent that doing so would not: i. create a competing or commercial product or service for use by third parties; ii. unreasonably disrupt the functionality of the subscribed products; or iii. reproduce or redistribute the subscribed products for third parties. In addition, license agreements can require commercially reasonable security measures (as also required in the EU) to extinguish the risk of content dissemination beyond permitted uses. In sum, these licensing terms can replicate the research rights that are unequivocally reserved for scholars elsewhere.

https://tinyurl.com/4fvpdz35

U.S. Copyright Office Update on Its Artificial Intelligence Initiatives


In March 2023, the Office announced a broad initiative to examine the copyright implications of the current forms of generative AI. Although we had previously examined the scope of copyright in works created using AI, the increasing sophistication and public adoption of generative AI tools raised new questions about the process of training and the legal status of the outputs. Our goal was to gather information from a full range of knowledgeable and interested parties in order to produce a report to assist Congress, the courts, and others in formulating policy in this area. In taking this initiative forward, we are monitoring related work being done in other agencies, including the U.S. Patent and Trademark Office (USPTO) and the Federal Trade Commission, and communicating with them on an ongoing basis.

This letter summarizes the Office’s work so far and describes our agenda for the rest of 2024, including the release of the report, updates to the Compendium of U.S. Copyright Office Practices, and the publication of a proposed economic research agenda.

http://tinyurl.com/4tpeyw3t

"The Text File That Runs the Internet"


But robots.txt is not a legal document — and 30 years after its creation, it still relies on the good will of all parties involved. Disallowing a bot on your robots.txt page. . . sends a message, but it’s not going to stand up in court. Any crawler that wants to ignore robots.txt can simply do so, with little fear of repercussions. . . . As the AI companies continue to multiply, and their crawlers grow more unscrupulous, anyone wanting to sit out or wait out the AI takeover has to take on an endless game of whac-a-mole. . . . If AI is in fact the future of search, as Google and others have predicted, blocking AI crawlers could be a short-term win but a long-term disaster.

http://tinyurl.com/5n8s72bz
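
As a rough illustration of the point above, the sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would read a robots.txt rule. The bot name `ExampleAIBot`, the `example.org` URL, and the helper `is_allowed` are hypothetical, chosen only for this example. Nothing here enforces anything: the check runs inside the crawler, and a crawler that skips it is not stopped.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is only what a compliant crawler would conclude; the file
    itself cannot prevent a request from being made.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named bot is asked to stay out; any other agent falls under "*".
blocked = is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.org/articles/1")
allowed = is_allowed(ROBOTS_TXT, "OtherBot", "https://example.org/articles/1")
print(blocked, allowed)
```

Because each new crawler announces itself under a new user-agent string, a site owner following this approach has to keep adding entries by name, which is the "whac-a-mole" problem the article describes.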

"Court Dismisses Authors’ Copyright Infringement Claims Against OpenAI"


Several authors, including comedian Sarah Silverman, have suffered an early loss in their copyright battle against OpenAI. The authors accused OpenAI of using pirated copies of their books to train its models. A California federal court dismissed the vicarious copyright infringement and DMCA violation claims. However, the lawsuit isn’t over yet.

http://tinyurl.com/478vm6kw

"Even If You Hate Both AI and Section 230, You Should Be Concerned about the Hawley/Blumenthal Bill to Remove 230 Protections from AI"


Considering that AI is currently being built into basically everything, this "exemption" [from Section 230] will basically eat the entire law, because increasingly all content produced online will involve "the use or provision" of generative AI, even if the content itself has nothing to do with the service provider.

In short, this bill doesn’t just strip 230 protections from AI output, in effect it strips 230 from any company that offers AI in its products. Which is basically a set of internet companies rapidly approaching "all of them."

https://tinyurl.com/ykjx8v4t

"Judge Will Toss Part of Authors’ AI Copyright Lawsuit"


According to Reuters, judge Vince Chhabria said the authors’ allegations that text generated by Llama infringes their copyrights simply doesn’t stand up to scrutiny. "When I make a query of Llama, I’m not asking for a copy of Sarah Silverman’s book—I’m not even asking for an excerpt," Chhabria observed, noting that, under the authors’ theory, a side-by-side comparison of text generated by the AI application and Silverman’s book would have to show they are similar.

However, the judge said he will not dismiss the case with prejudice, meaning the authors will be allowed to amend and refile their claims.

https://tinyurl.com/sd4wbba4

| Research Data Publication and Citation Bibliography | Research Data Sharing and Reuse Bibliography | Research Data Curation and Management Bibliography | Digital Scholarship |

"UC Berkeley Library to Copyright Office: Protect Fair Uses in AI Training for Research and Education"


If the Copyright Office were to enable rightsholders to opt-out of training AI for research and teaching fair uses, then academic institutions and scholars would face even greater hurdles in licensing content for research purposes. It would be operationally difficult for academic publishers and content aggregators to amass and license the "leftover" body of copyrighted works that remain eligible for AI training. Costs associated with publishers’ efforts in compiling "AI-training-eligible" content would be passed along as additional fees charged to academic libraries, who are already financially constrained to preserve TDM and other fair uses for scholars. In addition, rightsholders might opt out of allowing their work to be used for AI training fair uses, and then turn around and charge AI usage fees to scholars (or libraries)—essentially licensing back fair uses for research. These scenarios would impede scholarship by or for research teams who lack grant or institutional funds to cover these additional expenses; penalize research in or about underfunded disciplines or geographical regions; and result in bias as to the topics and regions studied.

https://tinyurl.com/5cd2vc85

| Artificial Intelligence and Libraries Bibliography |
| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

$2,500 Fee: "COAR’s response to the American Chemical Society’s New Fee for Repository Deposit"


COAR strongly objects to this charge for the following reasons:

  • Authors own their manuscripts and should retain their rights. Authors typically hold the copyright to their research, but too often transfer those rights to publishers when publishing their manuscript. When authors retain the copyright to their manuscript, they have the right to disseminate and use their own manuscript as they choose. If authors’ rights are retained, publishers do not own an article accepted manuscript (AAM) and researchers should not be duped into paying a fee to exercise a right they already have.
  • This fee is in direct contravention with the ethos of open science and scholarship and equity. . .
  • ACS is charging $2,500 while providing no added value. There is not a fee for an extra service offered. It requires no extra work on the side of the publisher, but rather is an attempt to develop a new revenue stream, while at the same time they will be receiving funds from subscriptions and pay-to-access for this same article.
  • ACS is creating a false impression about compliance with funder policies. . . . A fee is only required if you want to publish in an ACS journal and sign over your rights.

See ACS’ "Open Access Pricing for Authors: The Power of Choice" for more fee details.

https://tinyurl.com/4u4dfxsk

"On the Culture of Open Access: The Sci-Hub Paradox"


Based on a large randomized sample, this study first shows that OA publications, including those in fully OA journals, receive more citations than their subscription-based counterparts. However, the OACA has slightly decreased over the seven last years. The introduction of a distinction between those accessible or not via the Sci-hub platform among subscription-based suggest that the generalization of its use cancels the positive effect of OA publishing. The results show that publications in fully OA journals are victims of the success of Sci-hub. Thus, paradoxically, although Sci-hub may seem to facilitate access to scientific knowledge, it negatively affects the OA movement as a whole, by reducing the comparative advantage of OA publications in terms of visibility for researchers.

https://doi.org/10.1007/s11192-023-04792-5

Paywall: "Digital Ownership: The Case of E-books"


This paper presents the results of an empirical research study that used an online survey to examine e-book consumers’ perspectives on digital ownership and digital rights. The study revealed that while most participants value and desire ownership rights, certain conventional ownership rights, such as reselling, gifting, and lending, are deemed less significant and can be relinquished by consumers due to cost-related factors. Furthermore, contrary to prevailing assumptions, the study found no discernible generational gap concerning people’s perceptions of digital ownership rights.

https://doi.org/10.1002/pra2.807

"Supporting Open Access for 20 Years: Five Issues That Have Slowed the Transition to Full and Immediate OA"


Current estimates suggest that more than 50% of the world’s research articles are published open access and that there are around 20,000 fully OA journals. Data also indicates that publishing OA is, on average, cheaper than publishing in subscription journals. For example, an analysis by Delta Think shows that around 45% of all scholarly articles were published as paid-for open access in 2021, but this accounted for just under 15% of the total journal publishing revenue.

However, after two decades of discussions, advocacy, policy development and strategy, can this level of OA be considered a success, particularly when half of all research articles published today is hidden behind a paywall? I think not.

https://tinyurl.com/2s396wh7

"ACS, Elsevier, and ResearchGate Resolve Litigation, with Solution to Support Researchers"


ACS and Elsevier, members of the Coalition for Responsible Sharing, have agreed to a legal settlement with ResearchGate that ensures copyright-compliant sharing of research articles published with ACS or Elsevier on the ResearchGate site. The lawsuits pending against ResearchGate in Germany and the United States are now resolved. The specific terms of the parties’ settlement are confidential.

Background: "Munich Court Ruling Sides with Elsevier, ACS over ResearchGate."

https://tinyurl.com/mrr9xywj

"US Rejects AI Copyright for Famous State Fair-Winning Midjourney Art"


"The Board finds that the Work contains more than a de minimis amount of content generated by artificial intelligence ("AI"), and this content must therefore be disclaimed in an application for registration. Because Mr. Allen is unwilling to disclaim the AI-generated material, the Work cannot be registered as submitted," the office wrote in its decision.

https://tinyurl.com/3exv5ecw

"Internet Archive Appeals Loss in Library Ebook Lawsuit"


The Internet Archive announced today that it has appealed its loss in a major ebook copyright case. A notice indicates that it’s filed with the Second Circuit Court of Appeals in Hachette v. Internet Archive, a publishing industry lawsuit over the nonprofit group’s Open Library program. . . .

Court documents indicate the Internet Archive is still preparing its response to the lawsuit by UMG and other record labels; a pretrial conference in that case is currently scheduled for October.

https://tinyurl.com/ysp8m558

"Microsoft Offers Legal Protection for AI Copyright Infringement Challenges"


"Specifically, if a third party sues a commercial customer for copyright infringement for using Microsoft’s Copilots or the output they generate, we will defend the customer and pay the amount of any adverse judgments or settlements that result from the lawsuit, as long as the customer used the guardrails and content filters we have built into our products," writes Microsoft.

Further information: "Microsoft Announces New Copilot Copyright Commitment for Customers."

https://tinyurl.com/53x9yh6m

"Appeals Court Rules That Library of Congress Can No Longer Require Deposit of Published Works"


The bottom line now seems to be that CO [Copyright Office] can no longer require the deposit of two copies of all published works. Deposit can, it appears, continue to be a condition of copyright registration, but in light of this ruling it seems only a matter of time before that requirement is challenged as well. . . .

The implications of this ruling for the Library of Congress are potentially significant — if for no other reason than it will now have to purchase many of the books it once could rely on publishers and authors providing gratis.

https://tinyurl.com/zt23ksh8

Paywall: "A Study on Copyright Issues of Different Controlled Digital Lending (CDL) Modes"


The paper will explore CDL modes by combing CDL practices and programs from research papers and official website documents of different library organizations. Then, based on legal frameworks of CDL in the US, Canada and the UK which are summarized, copyright issues of CDL modes are analyzed from perspectives of implementing institution, service resources, and usage mode. Finally, some copyright recommendations for sustainable development of CDL are proposed.

https://doi.org/10.1177/09610006231190654

"Open Access and University IP Policies in the United States"


Although the existing, somewhat messy, maze of institutional IP policies, publishing agreements, and OA policies can seem daunting, understanding their terms is important for authors who want to see their works made openly available. I’ll leave for another day to explore whether it’s a good thing that the rights situation is so complex. In many situations, rights thickets like these can be a real detriment to authors and access to their works. In this case the situation is at least nuanced such that authors are able to leverage pre-existing licenses to avoid negotiating away the bundle of rights they need to see their works made available openly.

https://tinyurl.com/3ttpd7my

"The Rights of UC Authors Are at Stake. Here’s What We Are Doing about It."


"We have learned that many publishers are requiring UC authors to sign misleading License to Publish agreements, which undermine the spirit and intent of [UC’s open access policies]," wrote Susan Cochran, Chair of the faculty Academic Senate.

By purporting to restrict an author’s abilities to reuse their own work, "these agreements essentially turn faculty authors into readers, as opposed to creators and owners of their own work," the Academic Senate chair concludes.

The team that leads negotiations with scholarly publishers on behalf of the university, including representatives from UC’s California Digital Library, the 10 campus libraries, and the Academic Senate, is now taking up the charge, making author rights the next frontier in advocating for the UC research community.

https://tinyurl.com/mry3hczw

"Internet Archive Responds to Recording Industry Lawsuit Targeting Obsolete Media"


Late Friday, some of the world’s largest record labels, including Sony and Universal Music Group, filed a lawsuit against the Internet Archive and others for the Great 78 Project, a community effort for the preservation, research and discovery of 78 rpm records that are 70 to 120 years old. . . .

Of note, the Great 78 Project has been in operation since 2006 to bring free public access to a largely forgotten but culturally important medium. Through the efforts of dedicated librarians, archivists and sound engineers, we have preserved hundreds of thousands of recordings that are stored on shellac resin, an obsolete and brittle medium. The resulting preserved recordings retain the scratch and pop sounds that are present in the analog artifacts; noise that modern remastering techniques remove.

These preservation recordings are used in teaching and research, including by university professors like Jason Luther of Rowan University, whose students use the Great 78 collection as the basis for researching and writing podcasts for use in class assignments . . . While this mode of access is important, usage is tiny—on average, each recording in the collection is only accessed by one researcher per month.

https://tinyurl.com/bdevycm5

"Judgment Entered in Publishers, Internet Archive Copyright Case"


Most importantly, the proposed agreement includes a permanent injunction that would, among its provisions, bar the IA’s lending of unauthorized scans of in-copyright, commercially available books, as well as bar the IA from "profiting from" or "inducing" any other party’s "infringing reproduction, public distribution, public display and/or public performance" of books "in any digital or electronic form" once notified by the copyright holder. . . .

The negotiated payment is all inclusive—it covers costs, fees, damages, and other claims, including the IA’s claim that damages should be remitted—something that should assuage initial concerns expressed by some who feared a massive damage award might force the nonprofit IA to cease operations. The negotiated judgment does seek destruction of the IA’s scans as the publishers’ initial complaint had suggested.

https://tinyurl.com/p3yaszd9

"The New York Times Prohibits AI Vendors from Devouring Its Content"


The new terms prohibit the use of Times content—which includes articles, videos, images, and metadata—for training any AI model without express written permission. In Section 2.1 of the TOS, the NYT says that its content is for the reader’s “personal, non-commercial use” and that non-commercial use does not include “the development of any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system.”

https://tinyurl.com/2cc4uhuc

"Sites Scramble to Block ChatGPT Web Crawler after Instructions Emerge"


But for large website operators, the choice to block large language model (LLM) crawlers isn’t as easy as it may seem. Making some LLMs blind to certain website data will leave gaps of knowledge that could serve some sites very well (such as sites that don’t want to lose visitors if ChatGPT supplies their information for them), but it may also hurt others. For example, blocking content from future AI models could decrease a site’s or a brand’s cultural footprint if AI chatbots become a primary user interface in the future. As a thought experiment, imagine an online business declaring that it didn’t want its website indexed by Google in the year 2002—a self-defeating move when that was the most popular on-ramp for finding information online.

https://tinyurl.com/yc4mcejn

"Publishers, Internet Archive Agree to Streamline Digital Book-Lending Case"


The proposed order would require the Archive to pay Lagardere SCA’s (LAGA.PA) Hachette Book Group, News Corp’s (NWSA.O) HarperCollins Publishers, John Wiley & Sons (WLY.N) and Bertelsmann SE & Co’s (BTGGg.F) Penguin Random House an undisclosed amount of money if it loses its appeal.

The order would also permanently block the Archive from lending out copies of the publishers’ books without permission, pending the result of the appeal.

https://tinyurl.com/yc5j2vb8
