Search the scholarly web

Google Scholar is not Google Web search

Most people who search Google Scholar assume that it is a cut-down version of Google Web search that shows only academic content. In fact, Google Scholar is an entirely separate database with a more limited set of search commands. It indexes the content of articles within subscription databases and so includes a level of indexing not available in Google Web search. Google Scholar has also been found to offer wider coverage of scholarly articles in foreign languages than either Scopus or Web of Science.

Google Scholar is, however, less accurate than subscription library databases because it ignores publishers’ metadata. When it was first set up, Google Scholar was expected to be a short-lived project, which Google wanted to use primarily to teach its nascent machine learning algorithms to recognise elements such as creator names and publication dates from their expected position in predictably structured documents. So when publishers offered Google access to their metadata, Google declined. Although Google expressed regret once Google Scholar became something of a permanent feature, Google Scholar continues to this day to try to recognise titles, authors and other document metadata from their position in a document rather than using publishers’ markup.

Limitations of Google Scholar’s search

Unlike Google Web search, Google Scholar almost never includes alternative terms in its search results, nor does it drop search terms.  Most searches are carried out similarly to ‘verbatim’ web searches. 

While Google Web search is limited to 32 search terms, Google Scholar imposes a 256-character limit on search strings, including operators (such as OR) and spaces. It also uses a different, more limited search syntax from Google Web search.
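
Because the 256-character limit counts operators and spaces as well as search terms, it can help to measure a search string before pasting it into Google Scholar. The following is a minimal Python sketch; the function name fits_scholar_limit and the constant are illustrative only, and the limit is simply the figure quoted above.

```python
# Limit quoted in the text above; every character counts,
# including operators such as OR and the spaces between terms.
SCHOLAR_MAX_CHARS = 256


def fits_scholar_limit(query: str, limit: int = SCHOLAR_MAX_CHARS) -> bool:
    """Return True if the whole search string fits within the character limit."""
    return len(query) <= limit


query = 'intitle:"systematic review" author:smith OR author:jones'
print(len(query), fits_scholar_limit(query))
```

If the check fails, the simplest fix is usually to split the search into two or more shorter searches.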

Google Scholar search syntax

Google Scholar does not search for alternative terms, and while use of the Boolean OR operator seems to work in some cases, cursory testing suggests that running separate searches for alternative terms returns a larger number of more reliable results.
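
One way to run separate searches for alternative terms is to write out every combination in advance and paste each one into Google Scholar in turn. The sketch below only builds the search strings and does not contact Google Scholar; the function name separate_queries and the example terms are hypothetical.

```python
from itertools import product


def separate_queries(*term_groups: list[str]) -> list[str]:
    """Expand groups of alternative terms into one plain query per combination."""
    return [" ".join(combo) for combo in product(*term_groups)]


# One fixed phrase combined with three alternative spellings.
queries = separate_queries(
    ['"open access"'],
    ["preprint", "pre-print", "postprint"],
)
for q in queries:
    print(q)
```

Each printed line is a complete search with no OR operator, so every search is matched as typed.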

Google Scholar’s Advanced Search commands are limited to:

  • intitle:
  • author:
  • site:  (note that this finds the hosting site, not the author’s affiliation!)
  • source:
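
The four commands above can be combined with ordinary search terms in a single search string. The sketch below assembles such a string in Python; the helper name scholar_query is hypothetical, and wrapping multi-word values in quotation marks is an assumption about how phrases are best entered rather than something stated above.

```python
def scholar_query(terms="", intitle=None, author=None, site=None, source=None):
    """Join free-text terms with whichever of the four listed commands are supplied."""

    def quoted(value):
        # Assumption: quote multi-word values so the command covers the whole phrase.
        return f'"{value}"' if " " in value else value

    parts = [terms] if terms else []
    for command, value in (
        ("intitle", intitle),
        ("author", author),
        ("site", site),
        ("source", source),
    ):
        if value:
            parts.append(f"{command}:{quoted(value)}")
    return " ".join(parts)


print(scholar_query("bibliometrics", author="van raan", site="arxiv.org"))
```

Remember that site: matches the site hosting the document, so site:arxiv.org finds items hosted on arXiv regardless of where the authors work.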

To limit a search by date, use the date boxes in the Advanced Search screen or the date slider to the left of the search results. The before: and after: commands do not work in Google Scholar.

Pre-prints in Google Scholar

Pre-prints are a treacherous minefield in academic research. Because they are frequently updated and eventually published, it is not always easy to be sure you are using the most recent version of a manuscript. The first search result should be the most recent version, but any variation in the title, the author names listed, the order of authors, etc. will cause Google Scholar to create a new, separate search result for the new version. This often happens when a pre-print is finally published in a journal. It is therefore important to check alternative search results that might be for the same manuscript or article, to click on the “x published versions” link, and to follow the link to the hosting site to check whether a new version has been published but not yet fully indexed by Google Scholar.
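
If you have copied the titles of several search results into a list, a rough similarity check can flag results that may be versions of the same manuscript. The sketch below uses Python’s standard difflib module; the function names and the 0.85 threshold are illustrative assumptions, and any match it flags still needs checking by hand.

```python
import re
from difflib import SequenceMatcher


def normalise(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def probably_same_work(title_a: str, title_b: str, threshold: float = 0.85) -> bool:
    """Flag two titles as likely versions of one manuscript (threshold is a guess)."""
    ratio = SequenceMatcher(None, normalise(title_a), normalise(title_b)).ratio()
    return ratio >= threshold


print(probably_same_work("Open Access: A Survey", "open access - a survey"))
```

This only compares titles; differences in author names or author order, which also cause split results, would need a similar check of their own.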

Alternatives to Google Scholar

The open access directory CORE and the academic search engine BASE, together with the CORE and Unpaywall browser plugins, are free alternatives to Google Scholar and are particularly useful for finding pre- and post-prints.

Assistant Librarian (Promotions) at the University Library. An enthusiastic advocate of libraries, diversity, inclusion, equity, and social justice for all, inside and outside the workplace.
