Given that Windows Live Academic Search’s content is limited to computer science, electrical engineering, and physics journals and conferences, a direct comparison of it with other search engines is somewhat difficult.
Although its limitations should be clearly recognized, the following simple experiment in comparing the number of hits for Google Scholar, OAIster (a search engine that indexes open access literature, such as e-prints), and Windows Live Academic Search may help to shed some light on their differences. (Note that OAIster does not typically include content directly provided by commercial publishers, although it does include e-prints for a large number of articles published in academic journals.)
The search is for: "OAI-PMH" (entered without quotes).
OAI-PMH is, of course, the Open Archives Initiative Protocol for Metadata Harvesting. This is a highly specific search, where many, but not all, hits should fall within the subjects covered by Windows Live Academic Search. A major area that might not be covered is library and information science literature.
To get a better feel for the baseline published literature about OAI-PMH, let’s first do some searching for that term in specialized commercial databases.
- ACM Digital Library (description): 51 hits.
- Engineering Village 2 (description): 66 hits.
- Information Science & Technology Abstracts (description): 36 hits.
- Library Literature & Information Science Index/Full Text (description): 13 hits.
Now, the search engines in question (the links for the below search engine names are for the search, not the search engine):
- Google Scholar (search limited to engineering, computer science, and mathematics): 542 hits. (1,500 hits if the search is not limited.)
- OAIster: 180 hits.
- Windows Live Academic Search: 74 hits.
So, what have we learned? Windows Live Academic Search has a somewhat higher number of hits than the selected commercial databases and, if adjusted downward to count publisher versions only (see below), is still on the high end. This suggests that it covers the toll-based published literature very well. However, it has a significantly lower number of hits than OAIster and Google Scholar, suggesting that its coverage of open access literature may be weaker than Google Scholar’s and is quite likely weaker than OAIster’s.
Of the 74 hits for the "OAI-PMH" search in Windows Live Academic Search, 54 (73%) were "published versions" (i.e., publisher-supplied works); 20 (27%) were not (i.e., e-prints). Scanning the "Results by Institution" sidebar, it appears that 100% of OAIster’s 180 hits were from open access sources; I didn’t check them all. I didn’t try to break down the 542-hit Google Scholar search result, which has a mix of toll-based and open access materials, although it would be quite interesting to do so. It should be clear that a sample of one search term is a very crude measure (and that this posting won’t grace the pages of JASIST anytime soon).
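For anyone who wants to check the arithmetic, here is a small sketch that recomputes the percentages above and compares the publisher-only figure against the commercial databases. The counts are the ones reported in this post; the variable and function names are my own and purely illustrative.

```python
# Hit counts as reported in this post for the "OAI-PMH" search.
commercial_hits = {
    "ACM Digital Library": 51,
    "Engineering Village 2": 66,
    "Information Science & Technology Abstracts": 36,
    "Library Literature & Information Science Index/Full Text": 13,
}
wlas_total = 74      # all Windows Live Academic Search hits
wlas_published = 54  # publisher-supplied versions
wlas_eprints = 20    # e-prints


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


published_pct = percent(wlas_published, wlas_total)
eprint_pct = percent(wlas_eprints, wlas_total)

# Commercial databases with fewer hits than the publisher-only figure.
outranked = sorted(
    name for name, hits in commercial_hits.items() if hits < wlas_published
)

print(f"Published versions: {published_pct}%, e-prints: {eprint_pct}%")
print(f"Publisher-only count exceeds {len(outranked)} of {len(commercial_hits)} databases")
```

With these numbers, the publisher-only count of 54 is above three of the four commercial databases and below only Engineering Village 2, which is what "on the high end" refers to.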
Of course, this simple experiment tells us nothing about the presence of duplicate entries for the same work in search result sets, which could be important for a meaningful open access comparison. Consider, for example, this group of 11 hits for "A Scalable Architecture for Harvest-Based Digital Libraries—The ODU/Southampton Experiments" from the Google "OAI-PMH" search.
Nor does it tell us the number of items that are not journal articles (or e-prints for them) or conference papers.
An apples-to-apples comparison would adjust for useless duplicates and non-journal/conference literature. (But, of course, it would be quite useful if Windows Live Academic Search had non-journal/conference literature such as technical reports in it.)
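To make the idea of an adjusted count concrete, here is a minimal sketch of one way it could be done, assuming each hit has been hand-labeled with a document type and that duplicates can be detected by a normalized title. The sample records, the type labels, and the function names are hypothetical; none of the search engines discussed here necessarily exposes results in this form.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def adjusted_count(hits, allowed_types=("journal", "conference", "eprint")):
    """Count distinct works, ignoring duplicates and disallowed document types."""
    seen = set()
    for title, kind in hits:
        if kind not in allowed_types:
            continue  # e.g. technical reports are excluded from the comparison
        seen.add(normalize_title(title))
    return len(seen)


# Hypothetical, hand-labeled sample of (title, type) pairs.
sample_hits = [
    ("A Scalable Architecture for Harvest-Based Digital Libraries: "
     "The ODU/Southampton Experiments", "journal"),
    ("A scalable architecture for harvest-based digital libraries - "
     "the ODU/Southampton experiments", "eprint"),
    ("Hypothetical Technical Report on Metadata Harvesting", "report"),
    ("Another Hypothetical Conference Paper", "conference"),
]

print(adjusted_count(sample_hits))
```

Title matching is a crude proxy for "same work": it would merge distinct papers that share a title and miss duplicates whose titles differ, so a real analysis would want to check authors and dates too.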
However, given the small hit sets, it would be quite feasible for someone else to do a deeper analysis of the duplicate entry question and some other tractable questions.
Such a comparison also does not address the relevance of the search results, which is the most important thing.
Google does find more, but in my experience, Windows Live’s hits are more relevant to what I searched for.
My experience is the other way, with Google being more relevant and MSN full of spam.