Alberto Martín-Martín et al. have self-archived "Does Google Scholar Contain All Highly Cited Documents (1950-2013)."
Here's an excerpt:
The study of highly cited documents on Google Scholar (GS) has never been addressed to date in a comprehensive manner. The objective of this work is to identify the set of highly cited documents in Google Scholar and define their core characteristics: their languages, their file format, or how many of them can be accessed free of charge. We will also try to answer some additional questions that hopefully shed some light about the use of GS as a tool for assessing scientific impact through citations. The decalogue of research questions is shown below:
1. Which are the most cited documents in GS?
2. Which are the most cited document types in GS?
3. What languages are the most cited documents written in GS?
4. How many highly cited documents are freely accessible?
4.1 What file types are the most commonly used to store these highly cited documents?
4.2 Which are the main providers of these documents?
5. How many of the highly cited documents indexed by GS are also indexed by WoS?
6. Is there a correlation between the number of citations that these highly cited documents have received in GS and the number of citations they have received in WoS?
7. How many versions of these highly cited documents has GS detected?
8. Is there a correlation between the number of versions GS has detected for these documents, and the number citations they have received?
9. Is there a correlation between the number of versions GS has detected for these documents, and their position in the search engine result pages?
10. Is there some relation between the positions these documents occupy in the search engine result pages, and the number of citations they have received?