Archive for the 'Google and Other Search Engines' Category

Massive Yahoo News Feed Dataset Released

Posted in Data Curation, Open Data, and Research Data Management, Google and Other Search Engines on January 19th, 2016

Yahoo has released a massive News Feed dataset.

Here's an excerpt from the announcement:

The Yahoo News Feed dataset is a collection based on a sample of anonymized user interactions on the news feeds of several Yahoo properties, including the Yahoo homepage, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Movies, and Yahoo Real Estate. The dataset stands at a massive ~110B lines (1.5TB bzipped) of user-news item interaction data, collected by recording the user-news item interaction of about 20M users from February 2015 to May 2015.

Digital Scholarship | Digital Scholarship Sitemap

    "Google Scholar as a Tool for Discovering Journal Articles in Library and Information Science"

    Posted in Google and Other Search Engines on November 20th, 2015

    Dirk Lewandowski has self-archived "Google Scholar as a Tool for Discovering Journal Articles in Library and Information Science."

    Here's an excerpt:

    We found that only some journals are completely indexed by Google Scholar, that the ratio of versions available depends on the type of publisher, and that availability varies a lot from journal to journal. Google Scholar cannot substitute for abstracting and indexing services in that it does not cover the complete literature of the field. However, it can be used in many cases to easily find available full texts of articles already found using another tool.

      "Fair Use in the Digital Age: Reflections on the Fair Use Doctrine in Copyright Law"

      Posted in Copyright, Digital Copyright Wars, Google and Other Search Engines on November 16th, 2015

      The Program on Information Justice and Intellectual Property at the American University Washington College of Law has released a digital video of Judge Pierre N. Leval's "Fair Use in the Digital Age: Reflections on the Fair Use Doctrine in Copyright Law" lecture.

      Here's an excerpt from the announcement:

      At the Fourth Annual Peter A. Jaszi Distinguished Lecture in Intellectual Property, Judge Pierre N. Leval of the United States Court of Appeals for the Second Circuit will present a lecture on the role of the fair use doctrine within the structure of copyright law. Judge Leval is responsible for introducing the concept of transformative use to United States fair use jurisprudence and will discuss the development of the doctrine to date. He is the author of the court's opinion in Authors Guild Inc., et al. v. Google, Inc. (October 16, 2015) in which the court held that Google's digitization of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses. Judge Leval also authored Toward a Fair Use Standard, 103 HARV. L. REV. 1105 (1990).

        "Does Google Scholar Contain All Highly Cited Documents (1950-2013)?"

        Posted in Google and Other Search Engines, Scholarly Metrics on November 4th, 2014

        Alberto Martín-Martín et al. have self-archived "Does Google Scholar Contain All Highly Cited Documents (1950-2013)?"

        Here's an excerpt:

        The study of highly cited documents on Google Scholar (GS) has never been addressed to date in a comprehensive manner. The objective of this work is to identify the set of highly cited documents in Google Scholar and define their core characteristics: their languages, their file format, or how many of them can be accessed free of charge. We will also try to answer some additional questions that hopefully shed some light about the use of GS as a tool for assessing scientific impact through citations. The decalogue of research questions is shown below:

        1. Which are the most cited documents in GS?
        2. Which are the most cited document types in GS?
        3. What languages are the most cited documents written in GS?
        4. How many highly cited documents are freely accessible?
        4.1 What file types are the most commonly used to store these highly cited documents?
        4.2 Which are the main providers of these documents?
        5. How many of the highly cited documents indexed by GS are also indexed by WoS?
        6. Is there a correlation between the number of citations that these highly cited documents have received in GS and the number of citations they have received in WoS?
        7. How many versions of these highly cited documents has GS detected?
        8. Is there a correlation between the number of versions GS has detected for these documents, and the number citations they have received?
        9. Is there a correlation between the number of versions GS has detected for these documents, and their position in the search engine result pages?
        10. Is there some relation between the positions these documents occupy in the search engine result pages, and the number of citations they have received?

        Digital Scholarship | "A Quarter-Century as an Open Access Publisher"

          Google Settles American Society of Media Photographers, Inc. et al. v. Google Inc.

          Posted in Copyright, Digital Copyright Wars, Google and Other Search Engines, Publishing on September 8th, 2014

          Google has settled the American Society of Media Photographers, Inc. et al. v. Google Inc. lawsuit. The agreement is confidential.

          Here's an excerpt from the press release:

          The agreement resolves a copyright infringement lawsuit filed against Google in April, 2010, bringing to an end more than four years of litigation. It does not involve any admission of liability by Google. As the settlement is between the parties to the litigation, the court is not required to approve its terms.

          This settlement does not affect Google's current litigation with the Authors Guild or otherwise address the underlying questions in that suit.

            "EFF Urges Appeals Court to Keep Protecting Fair Use"

            Posted in Copyright, Digital Copyright Wars, Google and Other Search Engines, Mass Digitization on July 15th, 2014

            EFF has released "EFF Urges Appeals Court to Keep Protecting Fair Use."

            Here's an excerpt:

            In this latest appeal, the Authors Guild (and its supporters) claim that fair use is being unjustly expanded, portraying Judge Chin's ruling and other recent court opinions as some kind of fair-use creep, stretching beyond the original intent of the doctrine. Specifically, the Guild argues that the first of the four statutory fair use factors (the purpose of the use, which asks whether the use of the copyrighted material is transformative and/or non-commercial) weighs against Google. The Authors Guild and its amici insist that a use cannot be transformative if it doesn't add new creative expression to the pre-existing work. But as Judge Chin so rightly recognized, a use can be transformative if it serves a new and distinct purpose.

              "The Dark Side of Open Access in Google and Google Scholar: The Case of Latin-American Repositories"

              Posted in Google and Other Search Engines, Institutional Repositories, Open Access on June 19th, 2014

              Enrique Orduña-Malea et al. have self-archived "The Dark Side of Open Access in Google and Google Scholar: The Case of Latin-American Repositories."

              Here's an excerpt:

              The main objective of this study is to ascertain the presence and visibility of Latin American repositories in Google and Google Scholar through the application of page count and visibility indicators. For a sample of 137 repositories, the results indicate that the indexing ratio is low in Google, and virtually nonexistent in Google Scholar; they also indicate a complete lack of correspondence between the repository records and the data produced by these two search tools. These results are mainly attributable to limitations arising from the use of description schemas that are incompatible with Google Scholar (repository design) and the reliability of web indicators (search engines). We conclude that neither Google nor Google Scholar accurately represent the actual size of open access content published by Latin American repositories; this may indicate a non-indexed, hidden side to open access, which could be limiting the dissemination and consumption of open access scholarly literature.

              Digital Scholarship | Digital Scholarship Publications Overview | Sitemap

                "Empirical Evidences in Citation-Based Search Engines: Is Microsoft Academic Search Dead?"

                Posted in Google and Other Search Engines, Scholarly Metrics on May 27th, 2014

                Enrique Orduña-Malea et al. have self-archived "Empirical Evidences in Citation-Based Search Engines: Is Microsoft Academic Search Dead?"

                Here's an excerpt:

                The goal of this working paper is to summarize the main empirical evidences provided by the scientific community as regards the comparison between the two main citation based academic search engines: Google Scholar and Microsoft Academic Search, paying special attention to the following issues: coverage, correlations between journal rankings, and usage of these academic search engines. Additionally, self-elaborated data is offered, which are intended to provide current evidence about the popularity of these tools on the Web, by measuring the number of rich files PDF, PPT and DOC in which these tools are mentioned, the amount of external links that both products receive, and the search queries frequency from Google Trends. The poor results obtained by MAS led us to an unexpected and unnoticed discovery: Microsoft Academic Search is outdated since 2013.

                  "The Number of Scholarly Documents on the Public Web"

                  Posted in Google and Other Search Engines, Publishing, Scholarly Communication on May 12th, 2014

                  Madian Khabsa and C. Lee Giles have published "The Number of Scholarly Documents on the Public Web" in PLOS ONE.

                  Here's an excerpt:

                  The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of these, we estimate that at least 27 million (24%) are freely available since they do not require a subscription or payment of any kind. In addition, at a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary, as defined by Microsoft Academic Search. In addition, we show that among these fields the percentage of documents defined as freely available varies significantly, i.e., from 12 to 50%.
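
The excerpt mentions capture/recapture estimation. As a rough illustration only, the simplest two-sample form (the Lincoln-Petersen estimator) can be sketched as below. This is an assumption about the general technique, not the authors' exact procedure, and the function name and counts are hypothetical toy values, not figures from the paper.

```python
def lincoln_petersen(n_first: int, n_second: int, n_overlap: int) -> float:
    """Estimate a total population size from two overlapping samples.

    n_first:   items found by the first source (e.g. one search engine)
    n_second:  items found by the second source
    n_overlap: items found by both sources
    """
    if n_overlap <= 0:
        raise ValueError("overlap must be positive to form an estimate")
    # If the second sample recaptures a fraction n_overlap / n_second of its
    # items from the first sample, the first sample is assumed to cover that
    # same fraction of the whole population.
    return n_first * n_second / n_overlap


# Toy, made-up counts for illustration only.
estimate = lincoln_petersen(n_first=800, n_second=500, n_overlap=400)
print(estimate)
```

The estimate grows as the overlap shrinks: two sources that rarely find the same documents imply a large unseen population.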

                    "Checking In With Google Books, HathiTrust, and the DPLA"

                    Posted in Digital Curation & Digital Preservation, Digital Libraries, Google and Other Search Engines on November 14th, 2013

                    Naomi Eichenlaub has published "Checking In With Google Books, HathiTrust, and the DPLA" in Computers in Libraries.

                    Here's an excerpt:

                    Google Books and HathiTrust have been making headlines in the library world and beyond for years now, while a new player, the Digital Public Library of America (DPLA), has only recently entered the scene. This article will provide a "state of the environment" update for these digital library projects including project history and background. It will also examine some challenges common to all three projects including copyright, orphan works, metadata, and quality issues.

                      "Google Scholar as Replacement for Systematic Literature Searches: Good Relative Recall and Precision Are Not Enough"

                      Posted in Google and Other Search Engines on October 29th, 2013

                      Martin Boeker, Werner Vach and Edith Motschall have published "Google Scholar as Replacement for Systematic Literature Searches: Good Relative Recall and Precision Are Not Enough" in BMC Medical Research Methodology.

                      Here's an excerpt:

                      The objectives of this work are to measure the relative recall and precision of searches with Google Scholar under conditions which are derived from structured search procedures conventional in scientific literature retrieval; and to provide an overview of current advantages and disadvantages of the Google Scholar search interface in scientific literature retrieval. . . .

                      The reported relative recall must be interpreted with care. It is a quality indicator of Google Scholar confined to an experimental setting which is unavailable in systematic retrieval due to the severe limitations of the Google Scholar search interface. Currently, Google Scholar does not provide necessary elements for systematic scientific literature retrieval such as tools for incremental query optimization, export of a large number of references, a visual search builder or a history function. Google Scholar is not ready as a professional searching tool for tasks where structured retrieval methodology is necessary.
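
For readers unfamiliar with the two measures named in the excerpt, here is a minimal sketch of how relative recall and precision are commonly computed from sets of document identifiers. It assumes the reference set is a pooled list of known relevant items; the function names and identifiers are hypothetical and are not taken from the article.

```python
def relative_recall(retrieved: set, reference_relevant: set) -> float:
    """Share of the known relevant items that the search retrieved."""
    if not reference_relevant:
        raise ValueError("reference set must not be empty")
    return len(retrieved & reference_relevant) / len(reference_relevant)


def precision(retrieved: set, reference_relevant: set) -> float:
    """Share of the retrieved items that are known to be relevant."""
    if not retrieved:
        raise ValueError("retrieved set must not be empty")
    return len(retrieved & reference_relevant) / len(retrieved)


# Toy identifiers for illustration only.
retrieved = {"a", "b", "c", "d"}
reference = {"a", "b", "e"}
print(relative_recall(retrieved, reference))
print(precision(retrieved, reference))
```

Recall is "relative" here because it is measured against a fixed reference list rather than against every relevant document that exists.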

                        "Just Google It—Digital Research Practices of Humanities Scholars"

                        Posted in Google and Other Search Engines, Scholarly Communication on September 12th, 2013

                        Max Kemman, Martijn Kleppe, and Stef Scagliola have self-archived "Just Google It—Digital Research Practices of Humanities Scholars" in arXiv.org.

                        Here's an excerpt:

                        The transition from analogue to digital archives and the recent explosion of online content offers researchers novel ways of engaging with data. The crucial question for ensuring a balance between the supply and demand-side of data, is whether this trend connects to existing scholarly practices and to the average search skills of researchers. To gain insight into this process we conducted a survey among nearly three hundred (N = 288) humanities scholars in the Netherlands and Belgium with the aim of finding answers to the following questions: 1) To what extent are digital databases and archives used? 2) What are the preferences in search functionalities? 3) Are there differences in search strategies between novices and experts of information retrieval? Our results show that while scholars actively engage in research online they mainly search for text and images. General search systems such as Google and JSTOR are predominant, while large-scale collections such as Europeana are rarely consulted.

                          DigitalKoans

                          Digital Scholarship

                          Copyright © 2005-2015 by Charles W. Bailey, Jr.

                          This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license.