Mass Digitization – Page 9

“Editorial: Google Deal or Rip-Off?”

In "Editorial: Google Deal or Rip-Off?," Francine Fialkoff, Library Journal Editor-in-Chief, takes a hard look at the Google-Association of American Publishers/Authors Guild copyright settlement.

Here's an excerpt:

Clearly, the public had little standing in the negotiations that led to the recent agreement in the class-action lawsuit against Google for scanning books from library shelves. . . . Well, the suit was never about the public interest but about corporate interests, and librarians did not have much power at the bargaining table, no matter how hard those consulted pushed. While there are many provisions in the document that specify what libraries can and can't do and portend greater access, ultimately, it is the restrictions that scream out at us from the miasma of details.

Other perspectives can be found in my recently updated Google Book Search Bibliography, Version 3.

Google Book Search: Now with Magazines

Google has added magazines to Google Book Search.

Google Book Search Bibliography, Version 3

The Google Book Search Bibliography, Version 3 is now available.

This bibliography presents selected English-language articles and other works that are useful in understanding Google Book Search. It primarily focuses on the evolution of Google Book Search and the legal, library, and social issues associated with it. Where possible, links are provided to works that are freely available on the Internet, including e-prints in disciplinary archives and institutional repositories. Note that e-prints and published articles may not be identical.

Federal Judge John Sprizzo Tentatively Approves Google-AAP/AG Settlement

Federal Judge John Sprizzo has tentatively approved the Google-Association of American Publishers/Authors Guild copyright settlement.

Read more about it at "NY Judge Tentatively OKs Google Copyright Deal."

A Guide for the Perplexed: Libraries & the Google Library Project Settlement

ARL and ALA have released A Guide for the Perplexed: Libraries & the Google Library Project Settlement.

Here's an excerpt from the press release:

The guide is designed to help the library community better understand the terms and conditions of the recent settlement agreement between Google, the Authors Guild, and the Association of American Publishers concerning Google’s scanning of copyrighted works. Band notes that the settlement is extremely complex and presents significant challenges and opportunities to libraries. The guide outlines and simplifies the settlement’s provisions, with special emphasis on the provisions that apply directly to libraries.

Georgia Harper on the Google-AAP/AG Copyright Settlement

In "The LJ Academic Newswire Newsmaker Interview: Georgia Harper," Harper, Scholarly Communications Advisor at the University Libraries of the University of Texas at Austin, discusses the Google-AAP/AG copyright settlement and the part that research libraries played in it. Also see her blog posting ("Google Book Search—and Buy").

Here's an excerpt:

Brewster Kahle has chastised public libraries for working with Google under a cloak of secrecy. Can libraries realistically refuse NDAs?

I think Kahle’s point, and others raise this point too, is more about the deleterious effects of secrecy on the negotiation process itself. Secrecy tends to be isolating. If you don’t consult with your colleagues at other institutions, your leverage may be diminished. Of course, a library could also hire a business and/or legal consultant to help, and bind the consultant to the NDA. Yes, Kahle has identified a very thorny problem, but it’s one we can ameliorate. I don’t think it’s workable simply not to do business with companies whose assets are ideas and information just because they feel compelled to protect them through secrecy. Either way, consultation does increase information, and information is power—in fact, the power of information is also the source of the [NDA] problem in the first place.

Google-AAP/AG Copyright Settlement: Vaidhyanathan Questions, Google Answers

On October 28th, Siva Vaidhyanathan posed some questions to Google about its copyright settlement with the Association of American Publishers and the Authors Guild ("My Initial Take on the Google-Publishers Settlement"). Now, Google has replied ("Some Initial Answers to My Initial Questions about Google Book Search and the Settlement").

PALINET to Digitize 20 Million Textual Pages

With support from the Alfred P. Sloan Foundation, PALINET's Mass Digitization Collaborative plans to digitize 20 million textual pages of public domain material from participating member libraries. The scanned digital texts will be freely available from the Internet Archive.

Read more about it at "PALINET's Mass Digitization Collaborative Underway."

University of Michigan Library's MBooks Adds User-Created Public Collections

The University of Michigan Library's MBooks project now offers user-created public collections of e-books.

In the future, Michigan plans to add MTagger functionality to MBooks.

Coverage of the Demise of Microsoft's Mass Digitization Project

Microsoft's decision to end its Live Search Books program, which provided important funding for the Open Content Alliance, has been widely covered by newspapers, blogs, and other information sources.

Here's a selection of articles and posts: "Books Scanning to be Publicly Funded," "'It Ain’t Over Till It's Over': Impact of the Microsoft Shutdown," "Microsoft Abandons Live Search Books/Academic Scan Plan," "Microsoft Burns Book Search—Lacks 'High Consumer Intent,'" "Microsoft Shuts Down Two of Its Google 'Wannabe’s': Live Search Books and Live Search Academic," "Microsoft Will Shut Down Book Search Program," "Microsoft's Book-Search Project Has a Surprise Ending," "Post-Microsoft, Libraries Mull Digitization," "Publishers Surprised by Microsoft Move," "Why Killing Live Book Search Is Good for the Future of Books," and "Without Microsoft, British Library Keeps on Digitizing."

Google Book Search Bibliography, Version 2

The Google Book Search Bibliography, Version 2 is now available.

California Digital Library Puts Up Mass Digitization Projects Page

The California Digital Library has added a UC Libraries Mass Digitization Projects page to its Inside CDL Web site.

The Web site includes links to Frequently Asked Questions, contracts with digitization partners, and other information.

Of special interest in the FAQ are the questions "What rights to the digitized content does UC have in the projects; will access be limited in any way?" and "How will our patrons be able to access these texts, i.e. through MELVYL, or local catalogs, or a webpage, any search engine, or….?"

Isilon's IQ Clustered Storage System Chosen by Michigan and Rice for Digital Repository Storage

Isilon Systems has announced that its IQ Clustered Storage System will be used to support the Michigan Digitization Project and the Rice Digital Scholarship Archive.

Here's an excerpt from the press release about Michigan:

Isilon Systems . . . today announced that the University of Michigan (U-M) has selected Isilon's IQ clustered storage system as the primary repository for its Michigan Digitization Project. In partnership with Google, the University of Michigan and its Michigan Digitization Project are digitizing more than 7.5 million books, ensuring these valuable resources are available to the public into perpetuity. This enormous undertaking includes the storage of digital copies of all unique books within the libraries of the entire Big-Ten Conference and directly supports Google Book Search, which aims to create a single, comprehensive, searchable, virtual card catalog of all books in all languages. The University of Michigan, in partnership with Indiana University (IU), is leveraging Isilon's IQ clustered storage system to create a Shared Digital Repository (SDR) of the universities' published library materials. Using Isilon IQ, U-M and IU are able [to] consolidate digital copies of millions of books into one, single, shared pool of storage to meet the rapidly growing storage demand of its massive book digitization project. . . .

In conjunction with the Committee for Institutional Cooperation (CIC), an academic partnership formed by the universities of the Big-Ten Conference and the University of Chicago, the University of Michigan and Indiana University are working to create a Shared Digital Repository (SDR) which will mirror the content from U-M and the CIC libraries found in Google Book Search. Using Isilon IQ clustered storage, featuring its OneFS® operating system software, U-M has eliminated disparate data silos to create a shared pool of storage for the digitization efforts of these partner institutions. Each digitized book is approximately 55 MB in size, downloading at a rate of 3 MB/second, 24 hours a day, 7 days a week, for the entire six year duration of the project. Isilon IQ reduces storage management time, enabling U-M to accelerate the book scanning process, preserve valuable materials, and ultimately expand the research and learning capabilities for millions of users across the globe.
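The figures quoted in the Michigan release (55 MB per book, 3 MB/second, six years, 7.5 million books) can be cross-checked with some quick arithmetic. The sketch below is not from the press release. It assumes decimal units (1 TB = 1,000,000 MB), 365-day years, and a constant transfer rate, and all function and variable names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the Isilon press release.
# Assumptions (not from the release): 365-day years, a constant transfer rate,
# and decimal units (1 TB = 1,000,000 MB).

BOOK_SIZE_MB = 55            # "approximately 55 MB in size"
RATE_MB_PER_SECOND = 3       # "3 MB/second"
PROJECT_YEARS = 6            # "six year duration"
BOOKS_PLANNED = 7_500_000    # "more than 7.5 million books"

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def total_mb_transferred(rate_mb_s, years):
    """Total data moved at a constant rate over the given number of years."""
    return rate_mb_s * years * SECONDS_PER_YEAR


def books_transferable(rate_mb_s, years, book_mb):
    """Whole books that fit into the total data moved."""
    return total_mb_transferred(rate_mb_s, years) // book_mb


def years_needed(books, book_mb, rate_mb_s):
    """Years of continuous transfer needed to move the given number of books."""
    return books * book_mb / rate_mb_s / SECONDS_PER_YEAR


if __name__ == "__main__":
    total_tb = total_mb_transferred(RATE_MB_PER_SECOND, PROJECT_YEARS) / 1_000_000
    print(f"Data moved in {PROJECT_YEARS} years: {total_tb:.1f} TB")
    print("Books at that rate:",
          books_transferable(RATE_MB_PER_SECOND, PROJECT_YEARS, BOOK_SIZE_MB))
    print(f"Collection size: {BOOKS_PLANNED * BOOK_SIZE_MB / 1_000_000:.1f} TB")
    print(f"Years to move the collection: "
          f"{years_needed(BOOKS_PLANNED, BOOK_SIZE_MB, RATE_MB_PER_SECOND):.2f}")
```

Under these assumptions, six years at 3 MB/second moves roughly 568 TB, or about 10.3 million books at 55 MB each. The 7.5 million books themselves would come to about 412.5 TB and take a little over four years of continuous transfer, so the quoted rate is at least consistent with the stated project length.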

Here's an excerpt from the press release about Rice:

Isilon . . . today announced that Rice University has selected Isilon's IQ clustered storage system as its central repository for digital multimedia, including video of selected speeches by international dignitaries and musical performances from the Shepherd School of Music. In an effort to preserve the many historic events held at these prestigious venues and ensure the productions are available to the public into perpetuity, Rice has deployed Isilon clustered storage to consolidate hundreds of recorded musical performances and keynote speeches into a single, highly scalable and reliable shared pool of storage for the Rice Digital Scholarship Archive, an institutional repository based on the DSpace software platform. . . .

Through a cooperative effort between Rice University's Digital Library Initiative, Fondren Library and Central IT department, the university has created a central repository for all its critical multi-media content, enabling a variety of departments to execute on vital, content-driven projects simultaneously, activity that was impossible with traditional storage. Prior to using Isilon IQ, Rice's storage management for the Digital Scholarship archiving system was unable to effectively support management of large digital video and audio files that required streaming for delivery. These assets, therefore, were stored on a variety of streaming servers by various groups across campus, creating multiple access bottlenecks that led to inefficient storage management and undue IT cost and complexity. By unifying all of its digital content onto one, easy to use, "pay as you grow" clustered storage system, Rice University has removed costly data access and management barriers and dramatically simplified its storage architecture. Additionally, using Isilon's SmartQuotas provisioning and quota management software application, Rice is also storing its Language Center's multi-media course work and its Central IT department's webcasts on Isilon IQ, delivering immediate, concurrent data access to multiple users and user groups, further reducing storage management costs to maximize system efficiency.

Rice University will stream its collection of musical performances from the Shepherd School, as well as its video library of the many world leaders and dignitaries that have spoken at the Baker Institute, to thousands of users online. This operation necessitates the use of multiple media servers, using Windows, Quicktime and Real Player formats. Isilon clustered storage communicates natively over CIFS, NFS FTP, and HTTP, as well as interoperating with Windows, Mac and Linux environments, enabling seamless integration with Rice's variety of server formats and enabling all content to be streamed from one, central, easily and immediately accessible storage system. With Isilon IQ, Rice's entire collection of multi-media is accessible to all its servers 24x7x365, ensuring that the media streaming operations are not only efficient and cost-effective, but prepared to meet high user demand.

RLG Program Releases Copyright Investigation Summary Report

OCLC's RLG Program has released the Copyright Investigation Summary Report.

Here's an excerpt from the announcement:

This report summarizes interviews conducted between August and September 2007 with staff [at] RLG Partner institutions. Interviewees shared information about how and why institutions investigate and collect copyright evidence, both for mass digitization projects and for items in special collections.

Google Book Search Book Viewability API Released

Google has released the Google Book Search Book Viewability API.

Here's an excerpt from the API home page:

The Google Book Search Book Viewability API enables developers to:

Link to Books in Google Book Search using ISBNs, LCCNs, and OCLC numbers

Know whether Google Book Search has a specific title and what the viewability of that title is

Generate links to a thumbnail of the cover of a book

Generate links to an informational page about a book

Generate links to a preview of a book

Read more about it at "Book Info Where You Need It, When You Need It."
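Here's a minimal sketch of how a client might use the kind of lookup the API home page describes: building a request for one or more identifiers and reading the viewability of each title from the reply. The endpoint, the `jscmd=viewapi`, `bibkeys`, and `callback` parameters, and the response field names (`bib_key`, `preview`, `info_url`) are assumptions recalled from Google's documentation, not taken from the excerpt above, so check them against the API home page. The sample response is invented for illustration, and nothing here touches the network.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against Google's API documentation.
BASE_URL = "https://books.google.com/books"


def build_viewability_url(bibkeys, callback="handleBooks"):
    """Build a lookup URL for identifiers such as 'ISBN:0451526538'.

    The parameter names (jscmd, bibkeys, callback) are assumptions.
    """
    params = {
        "jscmd": "viewapi",
        "bibkeys": ",".join(bibkeys),
        "callback": callback,
    }
    return BASE_URL + "?" + urlencode(params)


def parse_jsonp(body, callback="handleBooks"):
    """Strip the JSONP wrapper 'callback(...);' and decode the JSON inside."""
    body = body.strip()
    prefix = callback + "("
    if not body.startswith(prefix):
        raise ValueError("response does not start with the expected callback")
    end = body.rfind(")")
    return json.loads(body[len(prefix):end])


if __name__ == "__main__":
    print(build_viewability_url(["ISBN:0451526538"]))

    # Invented sample response, shaped the way the assumed API replies.
    sample = (
        'handleBooks({"ISBN:0451526538": {"bib_key": "ISBN:0451526538", '
        '"info_url": "https://example.org/info", "preview": "partial"}});'
    )
    for key, info in parse_jsonp(sample).items():
        print(key, "viewability:", info.get("preview"))
```

The response is treated as JSONP because the API was designed to be called from a script tag in a Web page. A server-side client has to strip the callback wrapper before decoding the JSON.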

France's Answer to Mass Digitization Projects: Gallica 2 to Go Live after Paris Book Fair

France's Gallica 2 digital book project will go live after the Paris Book Fair, which ends on March 19th. Initially, it will contain 62,000 digital works, mostly from the Bibliothèque Nationale de France. Publishers will have the option to charge various kinds of access fees.

Read more about it at "France Launches Google Books Rival."

TRLN (Triangle Research Libraries Network) Members Join the Open Content Alliance

TRLN (Triangle Research Libraries Network) has announced that its member libraries (Duke University, North Carolina Central University, North Carolina State University, and The University of North Carolina at Chapel Hill) have joined the Open Content Alliance.

Here's an excerpt from "TRLN Member Libraries Join Open Content Alliance":

In the first year, UNC Chapel Hill and North Carolina State University will each convert 2,700 public domain books into high-resolution, downloadable, reusable digital files that can be indexed locally and by any web search engine. UNC Chapel Hill and NCSU will start by each hosting one state-of-the-art Scribe machine provided by the Internet Archive to scan the materials at a cost of just 10 cents per page. Each university library will focus on historic collection strengths, such as plant and animal sciences, engineering and physical science at NCSU and social sciences and humanities at UNC-Chapel Hill. Duke University will also contribute select content for digitization during the first year of the collaborative project.

Preservation in the Age of Large-Scale Digitization: A White Paper

The Council on Library and Information Resources has published Preservation in the Age of Large-Scale Digitization: A White Paper by Oya Rieger.

Here's an excerpt from the "Preface":

This paper examines large-scale initiatives to identify issues that will influence the availability and usability, over time, of the digital books that these projects create. As an introduction, the paper describes four key large-scale projects and their digitization strategies. Issues range from the quality of image capture to the commitment and viability of archiving institutions, as well as those institutions' willingness to collaborate. The paper also attempts to foresee the likely impacts of large-scale digitization on book collections. It offers a set of recommendations for rethinking a preservation strategy. It concludes with a plea for collaboration among cultural institutions. No single library can afford to undertake a project on the scale of Google Book Search; it can, however, collaborate with others to address the common challenges that such large projects pose.

Although this paper covers preservation administration, digital preservation, and digital imaging, it does not attempt to present a comprehensive discussion of any of these distinct specialty areas. Deliberately broad in scope, the paper is designed to be of interest to a wide range of stakeholders. These stakeholders include scholars; staff at institutions that are currently providing content for large-scale digital initiatives, are in a position to do so in the future, or are otherwise influenced by the outcomes of such projects; and leaders of foundations and government agencies that support, or have supported, large digitization projects. The paper recommends that Google and Microsoft, as well as other commercial leaders, also be brought into this conversation.

A Major Milestone for the University of Michigan Library: One Million Digitized Books

The University of Michigan Library has digitized and made available one million books from its collection.

Here's an excerpt from "One Million Digitized Books":

One million is a big number, but this is just the beginning. Michigan is on track to digitize its entire collection of over 7.5 million bound volumes by early in the next decade. So far we have only glimpsed the kinds of new and innovative uses that can be made of large bodies of digitized books, and it is thrilling to imagine what will be possible when nearly all the holdings of a leading research library are digitized and searchable from any computer in the world.

Columbia University and Microsoft Book Digitization Project

The Columbia University Libraries have announced that they will work with Microsoft to digitize a "large number of books" that are in the public domain.

Here's an excerpt from the press release:

Columbia University and Microsoft Corp. are collaborating on an initiative to digitize a large number of books from Columbia University Libraries and make them available to Internet users. With the support of the Open Content Alliance (OCA), publicly available print materials in Columbia Libraries will be scanned, digitized, and indexed to make them readily accessible through Live Search Books. . . .

Columbia University Libraries is playing a key role in book selection and in setting quality standards for the digitized materials. Microsoft will digitize selected portions of the Libraries’ great collections of American history, literature, and humanities works, with the specific areas to be decided mutually by Microsoft and Columbia during the early phase of the project.

Microsoft will give the Library high-quality digital images of all the materials, allowing the Library to provide worldwide access through its own digital library and to share the content with non-commercial academic initiatives and non-profit organizations.

Read more about it at "Columbia University Joins Microsoft Scan Plan."

Peter Brantley Critiques Google Book Search

In "Reading Bad News Between the Lines of Google Book Search" (Chronicle of Higher Education subscription required), Peter Brantley, Executive Director of the Digital Library Federation, discusses his concerns about Google Book Search.

Here's an excerpt:

Q. Why are you concerned about Google Book Search?

A. The quality of the book scans is not consistently high. The algorithm Google uses to return search results is opaque. Then there's the commercial aspect. Google will attempt to find ways to make money off the service.

Spiro Reviews Last Year's Key Digital Humanities Developments

In a series of three interesting posts, Lisa Spiro, Director of the Digital Media Center and the Educational Technology Research and Assessment Cooperative at Rice University's Fondren Library, has reviewed 2007's major digital humanities developments: "Digital Humanities in 2007 [Part 1 of 3]," "Digital Humanities in 2007 [Part 2 of 3]," and "Digital Humanities in 2007 [Part 3 of 3]."

PublicDomainReprints.org Turns Digital Public Domain Books into Printed Books

PublicDomainReprints.org is offering an experimental service that allows users to convert about 1.7 million digital public domain books in the Internet Archive, Google Book Search, or the Universal Digital Library into printed books using the Lulu print-on-demand service.

Source: "Converting Google Book PDFs to Actual Books."

Columbia University Libraries and Bavarian State Library Become Google Book Search Library Partners

Both the Columbia University Libraries and the Bavarian State Library have joined the Google Book Search Library Project.

Here are the announcements:

Bavarian State Library (in German)
Columbia University Libraries

University of Michigan Libraries Make over 100,000 Records for Digitized Books Available for Harvesting

The University of Michigan Libraries have made over 100,000 metadata records from their MBooks collection available for OAI-PMH harvesting. The records are for digitized books in the public domain.

Here's an excerpt from the announcement:

The University of Michigan Library is pleased to announce that records from our MBooks collection are available for OAI harvesting. The MBooks collection consists of materials digitized by Google in partnership with the University of Michigan.

http://quod.lib.umich.edu/cgi/o/oai/oai?verb=Identify

Only records for MBooks available in the public domain are exposed. We have split these into sets containing public domain items according to U.S. copyright law, and public domain items worldwide. There are currently over 100,000 records available for harvesting. We anticipate having 1 million records available when the entire U-M collection has been digitized by Google.
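Here's a minimal sketch of what harvesting these records might look like, using the base URL from the announcement and standard OAI-PMH verbs (`Identify`, `ListSets`, `ListRecords`). The announcement does not name the sets, so the `EXAMPLE_SET` value below is a placeholder; a `ListSets` request would show the real ones. The sample XML and its identifiers are invented, and the code only builds URLs and parses a response string rather than contacting the server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL taken from the announcement.
BASE_URL = "http://quod.lib.umich.edu/cgi/o/oai/oai"

# Standard OAI-PMH 2.0 XML namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def oai_url(verb, **args):
    """Build an OAI-PMH request URL, e.g. oai_url('Identify')."""
    return BASE_URL + "?" + urlencode({"verb": verb, **args})


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Invented sample response for illustration; the identifiers are not real.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:0001</identifier></header></record>
    <record><header><identifier>oai:example:0002</identifier></header></record>
    <resumptionToken>hypothetical-token-1</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(oai_url("Identify"))
    print(oai_url("ListSets"))
    # 'EXAMPLE_SET' is a placeholder; use a setSpec reported by ListSets.
    print(oai_url("ListRecords", metadataPrefix="oai_dc", set="EXAMPLE_SET"))

    ids, token = parse_list_records(SAMPLE_RESPONSE)
    print(ids, token)
    if token:
        # Follow-up requests carry only the verb and the resumption token.
        print(oai_url("ListRecords", resumptionToken=token))
```

A full harvester would fetch each URL, call `parse_list_records` on the result, and keep requesting with the returned resumption token until none comes back. That is how OAI-PMH pages through large result sets such as the 100,000 records mentioned here.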