"Harvard Law School Digitization Project Publishes Nearly 7 Million Court Cases Online"


The Caselaw Access Project, also known as CAP, aimed "to make all published U.S. court decisions freely available to the public online in a consistent format, digitized from the collection of the Harvard Law School Library," according to the project’s website. . . .

CAP launched in 2015 through a partnership with Ravel Law, a legal research and analytics startup company. Per the terms of the partnership, CAP received financial support in exchange for Ravel obtaining eight years of exclusivity with the caselaw documents, according to Harvard Law Today, a school-run publication.

https://tinyurl.com/mrxpy558

| Research Data Curation and Management Works |
| Digital Curation and Digital Preservation Works |
| Open Access Works |
| Digital Scholarship |

"[AAP] Publishers File Brief Opposing Internet Archive Appeal of Loss"


Controlled digital lending is a frontal assault on the foundational copyright principle that rightsholders exclusively control the terms of sale for every different format of their work — a principle that has spawned the broad diversity in formats of books, movies, television and music that consumers enjoy today.

"[T]here is no resemblance between IA’s conversion of millions of print books into ebooks and the historical practice of lending print books. Nor does IA’s distribution of ebooks without paying authors and their publishers a dime conform with the modern practices of libraries, which acquire licenses to lend ebooks to their local communities and enjoy the benefits of digital distribution lawfully."

The Internet Archive ("IA") operates a mass-digitization enterprise in which it copies millions of complete, in-copyright print books and distributes the resulting bootleg ebooks from its website to anyone in the world for free. Granting summary judgment, the District Court properly held that IA’s infringement is not saved by fair use as each of the four factors weighs against IA under longstanding case law.

https://tinyurl.com/5ah5vx3x

"Digitization and the Market for Physical Works: Evidence from the Google Books Project"


We study the impact of the Google Books digitization project on the market for physical books. We find that digitization significantly boosts the demand for physical versions and provide evidence for the discovery channel. Moreover, digitization allows independent publishers to introduce new editions for existing books, further increasing sales.

https://tinyurl.com/2pbuzty2

"Are Searches in OCR-generated Archives Trustworthy?"


The accuracy of searches was tested by performing sample searches of leading newspaper databases. The test revealed several weaknesses in the search process, including an average 18 percent error rate for single words in body text, and a far higher error rate for advertisements. Such high error rates encourage a critical look at the 20-year-old sector.
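
To give a rough sense of what an 18 percent single-word error rate could mean for multi-word searches, here is a minimal sketch. It assumes, purely for illustration, that OCR errors on different words are independent; neither that assumption nor the function name `phrase_hit_probability` comes from the article.

```python
def phrase_hit_probability(word_error_rate: float, n_words: int) -> float:
    """Estimate the chance that every word of an n-word phrase was OCR'd correctly.

    Assumption (not from the article): errors on different words are independent,
    so the per-word success probabilities simply multiply.
    """
    if not 0.0 <= word_error_rate <= 1.0:
        raise ValueError("word_error_rate must be between 0 and 1")
    if n_words < 1:
        raise ValueError("n_words must be at least 1")
    return (1.0 - word_error_rate) ** n_words


if __name__ == "__main__":
    # 0.18 is the average single-word error rate quoted in the excerpt.
    for n in (1, 2, 3, 5):
        p = phrase_hit_probability(0.18, n)
        print(f"{n}-word phrase: about {p:.0%} chance of an exact match")
```

Under this simplified model the chance of an exact match falls quickly as phrases get longer, which is one way to read the excerpt's call for a critical look at search results.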

https://doi.org/10.1515/jbwg-2023-0003

Millions of Digitized Books May Be Destroyed: "Press Conference Statement: Brewster Kahle, Internet Archive"


Here’s what’s at stake in this case: hundreds of libraries contributed millions of books to the Internet Archive for preservation in addition to those books we have purchased. Thousands of donors provided the funds to digitize them.

The publishers are now demanding that those millions of digitized books, not only be made inaccessible, but be destroyed.

This is horrendous. Let me say it again—the publishers are demanding that millions of digitized books be destroyed.

And if they succeed in destroying our books or even making many of them inaccessible, there will be a chilling effect on the hundreds of other libraries that lend digitized books as we do.

This could be the burning of the Library of Alexandria moment—millions of books from our community’s libraries—gone.

http://bit.ly/3JHMjli

"The Oregon Digital Newspaper Program’s Commitment to Open Access"

Sarah Seymore has published "The Oregon Digital Newspaper Program's Commitment to Open Access" in the OLA Quarterly.

Here's an excerpt:

The Oregon Digital Newspaper Program (ODNP) at the University of Oregon Libraries is an initiative to digitize historic and current Oregon newspapers, making them freely available to the public through a keyword-searchable online database. The ODNP is committed to open access and has included collaboration and data sharing with larger programs like the Library of Congress' Chronicling America historic newspaper website. Since 2015, the ODNP has increased its open access mission by archiving and hosting born-digital newspaper content, as well as continuing digitization of historic newspapers from microfilm and print. This article outlines the ODNP's past and current open access efforts, inclusion of diverse content, and open source, sustainable applications, websites, and workflows.

Research Data Curation Bibliography, Version 10 | Digital Curation and Digital Preservation Works | Open Access Works | Digital Scholarship | Digital Scholarship Sitemap

"Remembering the CLASSICs: Impact of the CLASSICs Act on Memory Institutions, Orphan Works, and Mass Digitization"

Shannon Price has published "Remembering the CLASSICs: Impact of the CLASSICs Act on Memory Institutions, Orphan Works, and Mass Digitization" in the UCLA Entertainment Law Review.

Here's an excerpt:

This paper considers the impact of the CLASSICs Act on memory institutions' ability to combat two of the most significant legal challenges that they face: orphan works and mass digitization. Although the CLASSICs Act is at best a partial solution for orphan works and mass digitization, it has fundamentally changed the landscape for memory institution use of pre–72 sound recordings.

"Current Best Practices among Cultural Heritage Institutions when Dealing with Copyright Orphan Works and Analysis of Crowdsourcing Options"

Victoria Stobo et al. have self-archived "Current Best Practices among Cultural Heritage Institutions when Dealing with Copyright Orphan Works and Analysis of Crowdsourcing Options."

Here's an excerpt:

The purpose of this study is to establish the current state of best practices among Cultural Heritage Institutions (CHIs) when dealing with in-copyright orphan works in three countries: the United Kingdom, Netherlands and Italy. A baseline understanding of current practice provides a benchmark against which crowdsourcing (or any other proposal) to address the challenge posed by orphan works can be evaluated. The research team used a purposive sample to approach the 'Big 3' national libraries and film archives in each country, typically including the national library, the national archive and the national film archive. The researchers also aimed to include at least one institution from each jurisdiction that had used the EUIPO database, and one institution that digitized orphan works but opted not to use the database.

"Text Data Mining from the Author’s Perspective: Whose Text, Whose Mining, and to Whose Benefit?"

Christine L. Borgman has self-archived "Text Data Mining from the Author's Perspective: Whose Text, Whose Mining, and to Whose Benefit?"

Here's an excerpt:

Given the many technical, social, and policy shifts in access to scholarly content since the early days of text data mining, it is time to expand the conversation about text data mining from concerns of the researcher wishing to mine data to include concerns of researcher-authors about how their data are mined, by whom, for what purposes, and to whose benefits.

"14 Million Books & 6 Million Visitors: HathiTrust Growth and Usage in 2016"

HathiTrust has released 14 Million Books & 6 Million Visitors: HathiTrust Growth and Usage in 2016.

Here's an excerpt from the announcement:

The HathiTrust collection continues to grow steadily. As of January 1st, 2017, there are 14,816,187 volumes in the collection. Over one million volumes were added to the collection over the course of the preceding year, scanned from the library collections of 39 contributors. . . .

Within the HathiTrust certified trusted repository, 38% of the collection is available to users to access in full view, and the remaining 62% is made available in other ways: all users can search across and within those limited view books; researchers can now perform transformational, non-consumptive research within these books; and users with print disabilities can access the full text.
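
For a sense of scale, the percentages can be turned into approximate volume counts. This is a minimal sketch that assumes the 38% full-view share applies to the January 1st, 2017 total quoted above; the announcement excerpt does not state the resulting counts, and the function name `split_volumes` is hypothetical.

```python
def split_volumes(total_volumes: int, full_view_percent: int) -> tuple[int, int]:
    """Split a collection into (full-view, limited-view) volume counts.

    Uses integer arithmetic and rounds the full-view count down, so the
    two parts always add up to the total.
    """
    if not 0 <= full_view_percent <= 100:
        raise ValueError("full_view_percent must be between 0 and 100")
    full_view = total_volumes * full_view_percent // 100
    return full_view, total_volumes - full_view


if __name__ == "__main__":
    # Figures quoted in the announcement: 14,816,187 volumes, 38% in full view.
    full_view, limited_view = split_volumes(14_816_187, 38)
    print(f"Full view:    roughly {full_view:,} volumes")
    print(f"Limited view: roughly {limited_view:,} volumes")
```

Because the published percentages are rounded, these counts are estimates rather than official figures.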

"Out of Print: The Orphans of Mass Digitization"

Mary Murrell has published "Out of Print: The Orphans of Mass Digitization" in Current Anthropology.

Here's an excerpt:

In the 2000s an interconnected set of elite projects in the United States sought to digitize "all books in all languages" and make them available online. These mass digitization projects were efforts to absorb the print book infrastructure into a new one centered in computer networks. Mass book digitization has now faded from view, and here I trace its setbacks through a curious figure—the "orphan"—that emerged from within these projects and acted ultimately as an agent of impasse. In legal policy debates, an "orphan" refers to a copyrighted work whose owner cannot be found, but its history, range of meanings, and deployments reveal it to be considerably more complex. Based on fieldwork conducted at a digital library engaged in mass digitization, this paper analyzes the "orphan" as a personifying metaphor that digital library activists embraced in order to challenge and/or disrupt the social relations that adhere in and around books.

"Collaborative Academic Library Digital Collections Post-Cambridge University Press, HathiTrust and Google Decisions on Fair Use"

Michelle M. Wu has published "Collaborative Academic Library Digital Collections Post-Cambridge University Press, HathiTrust and Google Decisions on Fair Use" in the Journal of Copyright in Education and Librarianship.

Here's an excerpt:

Academic libraries face numerous stressors as they seek to meet the needs of their users through technological advances while adhering to copyright laws. This paper seeks to explore one specific proposal to balance these interests, the impact of recent decisions on its viability, and the copyright challenges that remain after these decisions.

"The Economics of Book Digitization and the Google Books Litigation"

Hannibal Travis has self-archived "The Economics of Book Digitization and the Google Books Litigation."

Here's an excerpt from the announcement:

This piece explores the digitization and uploading to the Internet of full-text books, book previews in the form of chapters or snippets, and databases that index the contents of book collections. Along the way, it will describe the economics of copyright, the "digital dilemma," and controversies surrounding fair use arguments in the digital environment. It illustrates the deadweight losses from restricting digital libraries, book previews, copyright litigation settlements, and dual-use technologies that enable infringement but also fair use.
