"Empirical Evidences in Citation-Based Search Engines: Is Microsoft Academic Search Dead?"

Enrique Orduna-Malea et al. have self-archived "Empirical Evidences in Citation-Based Search Engines: Is Microsoft Academic Search Dead?"

Here's an excerpt:

The goal of this working paper is to summarize the main empirical evidences provided by the scientific community as regards the comparison between the two main citation based academic search engines: Google Scholar and Microsoft Academic Search, paying special attention to the following issues: coverage, correlations between journal rankings, and usage of these academic search engines. Additionally, selfelaborated data is offered, which are intended to provide current evidence about the popularity of these tools on the Web, by measuring the number of rich files PDF, PPT and DOC in which these tools are mentioned, the amount of external links that both products receive, and the search queries frequency from Google Trends. The poor results obtained by MAS led us to an unexpected and unnoticed discovery: Microsoft Academic Search is outdated since 2013.

Digital Scholarship | Digital Scholarship Publications Overview | Sitemap

"The Number of Scholarly Documents on the Public Web"

Madian Khabsa and C. Lee Giles have published "The Number of Scholarly Documents on the Public Web" in PLOS ONE.

Here's an excerpt:

The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of these, we estimate that at least 27 million (24%) are freely available since they do not require a subscription or payment of any kind. In addition, at a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary, as defined by Microsoft Academic Search. In addition, we show that among these fields the percentage of documents defined as freely available varies significantly, i.e., from 12 to 50%.
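For readers unfamiliar with capture/recapture estimation, here is a minimal sketch of the basic idea. It assumes the simple two-sample Lincoln-Petersen estimator; the excerpt does not say which estimator the authors used, and the function name and the sample counts below are hypothetical, not figures from the paper.

```python
def lincoln_petersen(n_a: int, n_b: int, overlap: int) -> float:
    """Estimate a total population size from two overlapping samples.

    n_a     -- documents found by the first search engine (assumed count)
    n_b     -- documents found by the second search engine (assumed count)
    overlap -- documents found by both engines

    Assumption: this is the simple two-sample Lincoln-Petersen estimator,
    N ~= n_a * n_b / overlap, used here only to illustrate the idea.
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive to form an estimate")
    return n_a * n_b / overlap


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
estimate = lincoln_petersen(n_a=1000, n_b=800, overlap=400)
print(estimate)  # 2000.0
```

The intuition: if 400 of the second engine's 800 documents also appear in the first engine, the first engine is assumed to cover about half of everything, so its 1,000 documents imply roughly 2,000 in total.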

"Checking In With Google Books, HathiTrust, and the DPLA"

Naomi Eichenlaub has published "Checking In With Google Books, HathiTrust, and the DPLA" in Computers in Libraries.

Here's an excerpt:

Google Books and HathiTrust have been making headlines in the library world and beyond for years now, while a new player, the Digital Public Library of America (DPLA), has only recently entered the scene. This article will provide a "state of the environment" update for these digital library projects including project history and background. It will also examine some challenges common to all three projects including copyright, orphan works, metadata, and quality issues.

"Google Scholar as Replacement for Systematic Literature Searches: Good Relative Recall and Precision Are Not Enough"

Martin Boeker, Werner Vach and Edith Motschall have published "Google Scholar as Replacement for Systematic Literature Searches: Good Relative Recall and Precision Are Not Enough" in BMC Medical Research Methodology.

Here's an excerpt:

The objectives of this work are to measure the relative recall and precision of searches with Google Scholar under conditions which are derived from structured search procedures conventional in scientific literature retrieval; and to provide an overview of current advantages and disadvantages of the Google Scholar search interface in scientific literature retrieval. . . .

The reported relative recall must be interpreted with care. It is a quality indicator of Google Scholar confined to an experimental setting which is unavailable in systematic retrieval due to the severe limitations of the Google Scholar search interface. Currently, Google Scholar does not provide necessary elements for systematic scientific literature retrieval such as tools for incremental query optimization, export of a large number of references, a visual search builder or a history function. Google Scholar is not ready as a professional searching tool for tasks where structured retrieval methodology is necessary.
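As a rough illustration of the two measures named in the excerpt, the sketch below computes relative recall and precision from sets of record identifiers. The definitions used here are the common ones (share of a reference set that the search retrieved, and share of retrieved records that are relevant); the function names and the sample identifiers are hypothetical and are not taken from the article.

```python
def relative_recall(retrieved: set[str], reference: set[str]) -> float:
    """Share of the reference set (e.g., studies already known to be
    relevant) that the search retrieved."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(retrieved & reference) / len(reference)


def precision(retrieved: set[str], relevant: set[str]) -> float:
    """Share of the retrieved records that are relevant."""
    if not retrieved:
        raise ValueError("retrieved set must not be empty")
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical record identifiers, for illustration only.
reference = {"a", "b", "c", "d"}
retrieved = {"a", "b", "c", "v", "w", "x", "y", "z"}

print(relative_recall(retrieved, reference))  # 0.75
print(precision(retrieved, reference))        # 0.375
```

In this made-up case the search finds three of the four known records (high relative recall) but returns five irrelevant ones alongside them (low precision).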

"Just Google It—Digital Research Practices of Humanities Scholars"

Max Kemman, Martijn Kleppe, and Stef Scagliola have self-archived "Just Google It—Digital Research Practices of Humanities Scholars" in arXiv.org.

Here's an excerpt:

The transition from analogue to digital archives and the recent explosion of online content offers researchers novel ways of engaging with data. The crucial question for ensuring a balance between the supply and demand-side of data, is whether this trend connects to existing scholarly practices and to the average search skills of researchers. To gain insight into this process we conducted a survey among nearly three hundred (N= 288) humanities scholars in the Netherlands and Belgium with the aim of finding answers to the following questions: 1) To what extent are digital databases and archives used? 2) What are the preferences in search functionalities 3) Are there differences in search strategies between novices and experts of information retrieval? Our results show that while scholars actively engage in research online they mainly search for text and images. General search systems such as Google and JSTOR are predominant, while large-scale collections such as Europeana are rarely consulted.

"A Perspective on the Merits of the Antitrust Objections to the Failed Google Books Settlement"

Pamela Samuelson has published "A Perspective on the Merits of the Antitrust Objections to the Failed Google Books Settlement" in the Harvard Journal of Law & Technology Occasional Paper Series.

Here's an excerpt:

This Article responds to critics of the antitrust objections to the ASA [Amended Settlement Agreement] by making three main points. Part II explains that Judge Chin's incomplete and unpersuasive analysis of the antitrust objections to the proposed settlement agreement is best understood as an effort to encourage the settling parties to adopt more competitive terms in any revised settlement agreement. Part III points out that the DOJ did not reach definitive conclusions on antitrust issues posed by the ASA. The DOJ was, however, obliged to submit an interim analysis because Judge Chin wanted the government's input before he ruled on whether the settlement should be approved and the DOJ did a creditable job under the circumstances. Part IV contends that there was more merit to the DOJ's antitrust concerns about the proposed settlement than some commentators have recognized.

"Manipulating Google Scholar Citations and Google Scholar Metrics: Simple, Easy and Tempting"

Emilio Delgado López-Cózar, Nicolás Robinson-García, and Daniel Torres-Salinas have self-archived "Manipulating Google Scholar Citations and Google Scholar Metrics: Simple, Easy and Tempting" in arXiv.org.

Here's an excerpt:

The launch of Google Scholar Citations and Google Scholar Metrics may provoke a revolution in the research evaluation field as it places within every researchers reach tools that allow bibliometric measuring. In order to alert the research community over how easily one can manipulate the data and bibliometric indicators offered by Google's products we present an experiment in which we manipulate the Google Citations profiles of a research group through the creation of false documents that cite their documents, and consequently, the journals in which they have published modifying their H index. . . . We analyse the malicious effect this type of practices can cause to Google Scholar Citations and Google Scholar Metrics. Finally, we conclude with several deliberations over the effects these malpractices may have and the lack of control tools these tools offer.

| Digital Scholarship's Digital/Print Books | Digital Scholarship |

Authors Guild et al. v. Google: "Brief of Amici Curiae Academic Authors in Support of Defendant-Appellant and Reversal"

Pamela Samuelson and David R. Hansen have self-archived "Brief of Amici Curiae Academic Authors in Support of Defendant-Appellant and Reversal" in SSRN.

Here's an excerpt:

Summary of argument: Class certification was improperly granted below because the District Court failed to conduct a rigorous analysis of the adequacy of representation factor, as Rule 23(a)(4) requires. The three individual plaintiffs who claim to be class representatives are not academics and do not share the commitment to broad access to knowledge that predominates among academics. . . .

Academic authors desire broad public access to their works such as that which the Google Books project provides. Although the District Court held that the plaintiffs had inadequately represented the interests of academic authors in relation to the proposed settlement, it failed to recognize that pursuit of this litigation would be even more adverse to the interests of academic authors than the proposed settlement was. . . .

In short, a "win" in this case for the class representatives would be a "loss" for academic authors. It is precisely this kind of conflict that courts have long recognized should prevent class certification due to inadequate representation. The District Court failed to adequately address this fundamental conflict in its certification order, though it was well aware of the conflict through submissions and objections received from the settlement fairness hearing through to the hearings on the most recent class certification motions. Because of that failure, the order certifying the class should be reversed.

| Google Books Bibliography | Digital Scholarship |

Digital Copyright: Google Asks Court to Reverse Class Certification Decision in The Authors Guild et al. v. Google Inc.

In a brief, Google has asked the U.S. Second Circuit Court of Appeals to reverse the class certification decision by the United States District Court for the Southern District of New York in The Authors Guild et al. v. Google Inc. case.

Here's the brief.

Read more about it at "Google Asks Court to Ax Book-Scanning Suit from Authors Guild."

| Scholarly Electronic Publishing Bibliography 2010 | Digital Scholarship |

"Brief of Digital Humanities and Law Scholars as Amici Curiae in Authors Guild v. Google"

Matthew L. Jockers, Matthew Sag, and Jason Schultz have self-archived "Brief of Digital Humanities and Law Scholars as Amici Curiae in Authors Guild v. Google" in SSRN.

Here's an excerpt:

The brief argues that, just as copyright law has long recognized the distinction between protection for an author's original expression (e.g., the narrative prose describing the plot) and the public's right to access the facts and ideas contained within that expression (e.g., a list of characters or the places they visit), the law must also recognize the distinction between copying books for expressive purposes (e.g., reading) and nonexpressive purposes, such as extracting metadata and conducting macroanalyses. We amici urge the court to follow established precedent with respect to Internet search engines, software reverse engineering, and plagiarism detection software and to hold that the digitization of books for text-mining purposes is a form of incidental or intermediate copying to be regarded as fair use as long as the end product is also nonexpressive or otherwise non-infringing.

Google and Publishers Settle Seven-Year-Old Copyright Lawsuit over Google Library Project

Google and the Association of American Publishers have settled the copyright lawsuit over Google Library Project. The related Authors Guild lawsuit has not been settled.

Here's an excerpt from the Google press release:

The agreement settles a copyright infringement lawsuit filed against Google on October 19, 2005 by five AAP member publishers. As the settlement is between the parties to the litigation, the court is not required to approve its terms.

The settlement acknowledges the rights and interests of copyright-holders. US publishers can choose to make available or choose to remove their books and journals digitized by Google for its Library Project. Those deciding not to remove their works will have the option to receive a digital copy for their use.

Apart from the settlement, US publishers can continue to make individual agreements with Google for use of their other digitally-scanned works. . . .

Google Books allows users to browse up to 20% of books and then purchase digital versions through Google Play. Under the agreement, books scanned by Google in the Library Project can now be included by publishers.

See also the AAP press release.

"It Was Never a Universal Library: Three Years of the Google Book Settlement"

Walt Crawford has published "It Was Never a Universal Library: Three Years of the Google Book Settlement" in Cites & Insights: Crawford at Large.

Here's an excerpt:

Remember the Google Books settlement? It was going to settle a four-year-old pair of lawsuits (four years old then, eight years old now) against Google (by the Association of American Publishers, AAP, and the Authors Guild, AG) asserting that Google was infringing on copyright through its two-line snippets from in-copyright books scanned in the Google Library Project—and by the scanning itself. Later, a third group representing media photographers also sued Google for the same actions. . . .

This is a long set of notes and comments (cites & insights). It strikes me that the topic and complexity deserve that length—but note that I'm offering much briefer excerpts and comments on most items than I normally would in this sort of roundup.

After two sets of general notes and overviews (one before the settlement was rejected, one after) I'm breaking the discussion down by topics rather than chronologically.

"Teaching with Google Books: Research, Copyright, and Data Mining"

Nathan Rinne has self-archived "Teaching with Google Books: Research, Copyright, and Data Mining" in E-LIS.

Here's an excerpt:

Google's Google Books site is a rich resource that is probably underutilized by most educators. It has all kinds of potential for a) getting students into the research process in a way that they will enjoy (for example, they can see how a famous quote has been used/quoted, find out which books cite the journal article they are interested in, or check to see if a specific book covers a topic that they want to explore, etc.); b) teaching them about the deeper civic purpose and the evolving state of copyright law; and, c) exploring, with the help of Google Book's Ngram viewer, the promise and ethics surrounding the issue of data-mining and "non-consumptive" research, or research that is accomplished by "mining" books for data, as opposed to reading them.

Search Engine Use 2012

The Pew Research Center's Internet & American Life Project has released Search Engine Use 2012.

Here's an excerpt:

For more than a decade, Pew Internet data has consistently shown that search engine use is one of the most popular online activities, rivaled only by email as an internet pursuit. In January 2002, 52% of all Americans used search engines. In February 2012 that figure grew to 73% of all Americans. On any given day in early 2012, more than half of adults using the internet use a search engine (59%). That is double the 30% of internet users who were using search engines on a typical day in 2004. And people's frequency of using search engines has jumped dramatically.

Pamela Samuelson et al. Send Letter to US District Court Judge Denny Chin about Authors Guild v. Google Case

Pamela Samuelson, Richard M. Sherman Distinguished Professor of Law and Information at the UC Berkeley School of Law, and other scholars have sent a letter ("Academic Author Objections to Plaintiff's Motion for Class Certification") to US District Court Judge Denny Chin about class certification issues in the Authors Guild v. Google case.

Here's an excerpt:

We believe that our works of scholarship are more typical of the contents of research library collections than works of the three named plaintiffs in this case. Betty Miles is the author of numerous children's books. Jim Bouton is a former baseball pitcher who has written both fiction and nonfiction books based on his experiences as a baseball player. Joseph Goulden is a professional writer who has written a number of nonfiction books on a variety of subjects, including a book about "superlawyers." None of these three are academic authors. Their books are aimed at a popular, rather than an academic, audience. As professional writers, their motivations and interests in having their books published would understandably be different, and likely more commercial, than those of academic scholars. Hence, our concern is that these three do not share the academic interests that are typical of authors of books in research library collections. As we explain further below, the clearest indication that the named plaintiffs do not share the same priorities typical of academic authors is their insistence on pursuing this litigation.

"Putting 600,000 Books Online: the Large-Scale Digitisation Partnership between the Austrian National Library and Google"

Max Kaiser has published "Putting 600,000 Books Online: the Large-Scale Digitisation Partnership between the Austrian National Library and Google" in the latest issue of LIBER Quarterly.

Here's an excerpt:

In a public-private partnership with Google, the Austrian National Library is digitising its historical book holdings. Some 600,000 volumes from the sixteenth to the nineteenth centuries will be digitised and made available free of charge. The project demonstrates that public-private partnerships can be successful in enabling our heritage institutions to provide large-scale access to their holdings, provided that such partnerships are not exclusive and free access is ensured. The article outlines the preparatory phase and work flows established in the project.

Legal Issues in Mass Digitization: A Preliminary Analysis and Discussion Document

The U.S. Office of the Register of Copyrights has released Legal Issues in Mass Digitization: A Preliminary Analysis and Discussion Document.

Here's the announcement:

The Copyright Office has published a Preliminary Analysis and Discussion Document that addresses the issues raised by the intersection between copyright law and the mass digitization of books. The purpose of the Analysis is to facilitate further discussions among the affected parties and the public discussions that may encompass a number of possible approaches, including voluntary initiatives, legislative options, or both. The Analysis also identifies questions to consider in determining an appropriate policy for the mass digitization of books.

Public discourse on mass digitization is particularly timely. On March 22, 2011, the U.S. District Court for the Southern District of New York rejected a proposed settlement in the copyright infringement litigation regarding Google's mass book digitization project. The court found that the settlement would have redefined the relationship between copyright law and new technology, and it would have encroached upon Congress's ability to set copyright policy with respect to orphan works. Since then, a group of authors has filed a lawsuit against five university libraries that participated in Google's mass digitization project. These developments have sparked a public debate on the risks and opportunities that mass book digitization may create for authors, publishers, libraries, technology companies, and the general public. The Office's Analysis will serve as a basis for further policy discussions on this issue.

"Access to the Agreement between Google Books and the British Library"

In "Access to the Agreement between Google Books and the British Library," Javier Ruiz of the Open Rights Group analyzes the Google Books contract between Google and the British Library (includes a link to the contract).

Here's an excerpt:

The British Library recently announced to much fanfare a deal with Google to make available online a quarter of a million books no longer restricted by copyright, thus in the public domain.

The deal is presented as a win-win situation, where Google pays for the costs of scanning the books, which will be available on both Google and BL's websites. This sounds very philanthropic from Google, however the catch is in the detail:

"Once digitised, these unique items will be available for full text search, download and reading through Google Books, as well as being searchable through the Library's website and stored in perpetuity within the Library's digital archive."

In order to find out what this really means we asked the British Library for a copy of the agreement with Google, which was not uploaded to their transparency website with other similar contracts, as it didn't involve monetary exchange.

| Digital Scholarship |

Is the Google Book Settlement Still Possible?

In "Google Books Settlement, 2008-2011," James Grimmelmann analyzes the impact of recent rulings and case resolutions on the Google Book Settlement. The rulings and resolutions are the In re: Literary Works in Electronic Databases Copyright Litigation ruling, the National Music Publishers' Association's resolution of The Football Association Premier League Limited, et al. v. YouTube, Inc. lawsuit (consolidated into Viacom v. YouTube), and the Wal-Mart Stores, Inc. v. Dukes et al. ruling.

Here's an excerpt:

The road to class-wide settlement—even to a much more modest settlement that covers only scanning and searching—now appears to be barred. What is more, in light of the freelancers' case and the Supreme Court's recent Wal-Mart case, the road to class-wide litigation also looks to be extraordinarily difficult. Google will raise many of the same adequacy of representation arguments in its opposition to class certification. It might still be more feasible for a few copyright owners holding large number of copyrights to litigate on an individual basis—but the major publishers, who best fit that bill, have all more or less made their peace with Google through its Partner Program. The odds of the authors being able to see this one through to the end have just dropped precipitously. Google is holding all the cards now, and they're all full houses.

| New: Google Books Bibliography, Version 7 | Digital Scholarship |

Google Books Bibliography, Version 7

Digital Scholarship has released version 7 of the Google Books Bibliography, which presents over 325 selected English-language articles and other works that are useful in understanding Google Books. It primarily focuses on the evolution of Google Books and the legal, library, and social issues associated with it, especially the Google Book Settlement. To better show the development of Google Books, it is now organized by year of publication. It primarily includes journal articles, e-prints, magazine articles, and newspaper articles. This version expands coverage of law review articles and legal e-prints. Where possible, links are provided to works that are freely available on the Internet.

| Digital Scholarship | Digital Scholarship Publications Overview |

"Just Google It!—The Google Book Search Settlement: A Law and Economics Analysis"

Frank Müller-Langer and Marc Scheufen have self-archived "Just Google It!—The Google Book Search Settlement: A Law and Economics Analysis" in SSRN.

Here's an excerpt:

Our law and economics analysis of the Book Search Project suggests that—from a copyright perspective—the proposed settlement may be beneficial to right holders, consumers, and Google. For instance, it may provide a solution to the still unsolved dilemma of orphan works. From a competition policy perspective, we stress the important aspect that Google’s pricing algorithm for orphan and unclaimed works effectively replicates a competitive Nash-Bertrand market outcome under post-settlement, third-party oversight.

| Digital Scholarship | Digital Scholarship Publications Overview | Reviews of Digital Scholarship Publications |Google Books Bibliography |

Association of Research Libraries Sends Letter to FTC about Google Books Privacy Issues

The Association of Research Libraries has sent a letter to the Federal Trade Commission regarding Google Books privacy issues.

Here's an excerpt:

This consent order presents a unique opportunity to shape best practices in reader privacy for a major online service provider. The marketplaces for e-books and for book search are both in formative stages, and the standards adopted by Google can be highly influential for other market participants. We urge the Commission to confirm that reader privacy deserves the same respect in the online world that it has long demanded in the physical world by insisting on strong protections for reader privacy in the comprehensive privacy program.

Read more about it at "In Comments to FTC, ARL Suggests Privacy Oversight for Google Books."

| Digital Scholarship | Digital Scholarship Publications Overview | Google Books Bibliography |

Pamela Samuelson: "Legislative Alternatives to the Google Book Settlement"

Pamela Samuelson has self-archived "Legislative Alternatives to the Google Book Settlement" in SSRN.

Here's an excerpt:

In the aftermath of Judge Chin's rejection of the proposed Google Book settlement, it is time to consider legislative alternatives. This article explores a number of component parts of a legislative package that might accomplish many of the good things that the proposed settlement promised without the downsides that would have attended judicial approval of it. It gives particular attention to the idea of an extended collective licensing regime as a way to make out-of-print but in-copyright books more widely available to the public. But it also considers several other measures, such as one aimed at allowing orphan works to be made available and some new privileges that would allow digitization for preservation purposes and nonconsumptive research uses of a digital library of books from the collections of major research libraries.

A Guide For the Perplexed Part IV: The Rejection of the Google Books Settlement

The Library Copyright Alliance has released A Guide For the Perplexed Part IV: The Rejection of the Google Books Settlement.

Here's an excerpt from the press release:

This guide is the latest in a series prepared by LCA legal counsel Jonathan Band to help inform the library community about this landmark legal dispute.

In the Guide Part IV, Band explains why the Court rejected the proposed class action settlement, which would have allowed Google to engage in a wide variety of activities using scanned books.

As stated in the Guide, "The court concluded that the settlement was unfair because a substantial number of class members [i.e., authors and publishers] voiced significant concerns with the settlement.… However, the validity of the objections seemed less important to the court than the fact that many class members raised them."

As for the impact of the decision on libraries, Band writes that while it is too early to say what the parties will do next, "it appears that both the challenges and the opportunities presented to libraries by the settlement when it was announced in the fall of 2008 are growing narrower and more distant."

| Digital Scholarship | Digital Scholarship Publications Overview | Transforming Scholarly Publishing through Open Access: A Bibliography |

Authors Guild et al. v. Google Inc. Ruling: Amended Settlement Agreement Denied

Judge Denny Chin of the U.S. District Court for the Southern District of New York has denied the Amended Settlement Agreement for the Authors Guild et al. v. Google Inc. case.

Here's an excerpt from the ruling:

Before the Court is plaintiffs' motion pursuant to Rule 23 of the Federal Rules of Civil Procedure for final approval of the proposed settlement of this class action on the terms set forth in the Amended Settlement Agreement (the "ASA"). The question presented is whether the ASA is fair, adequate, and reasonable. I conclude that it is not.

While the digitization of books and the creation of a universal digital library would benefit many, the ASA would simply go too far. It would permit this class action—which was brought against defendant Google Inc. ("Google") to challenge its scanning of books and display of "snippets" for on-line searching—to implement a forward-looking business arrangement that would grant Google significant rights to exploit entire books, without permission of the copyright owners. Indeed, the ASA would give Google a significant advantage over competitors, rewarding it for engaging in wholesale copying of copyrighted works without permission, while releasing claims well beyond those presented in the case.

Accordingly, and for the reasons more fully discussed below, the motion for final approval of the ASA is denied. The accompanying motion for attorneys' fees and costs is denied, without prejudice.

Read more about it at "After Rejection, a Rocky Road for Google Settlement"; "GBS March Madness: Paths Forward for the Google Books Settlement"; "Google Books Settlement: Copyright, Congress, and Information Monopolies"; "Google Settlement Is Rejected"; "Inside Judge Chin's Opinion"; "Please Refine Your Search Terms"; and "Publishers Remain Committed to Expanding Online Access to Books and Upholding Copyright Despite Court Decision."

| Digital Scholarship | Digital Scholarship Publications Overview | Scholarly Electronic Publishing Bibliography 2010 |