How much we can learn from Google search data


I just finished Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz, a highly rated book. The author devotes a large share of the text to Google Trends data. The fun part of reading this book is that I could dig up the results on the Google Trends website myself.

Here is one example: the author argues that Google searches reveal that contemporary American parents are far more focused on their sons’ intelligence than on their daughters’. The author also wrote a New York Times article on this.

I first reproduce the graph here. The upper panel shows the search volume for “Is my daughter a genius?” and “Is my son a genius?” from 2004 to 2017. In most years, there are more searches for “Is my son a genius?” than for “Is my daughter a genius?”. The lower panel shows the average search volume. The volume for son (1.38) is about three times that for daughter (0.44).
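As a quick check on that comparison, here is a minimal sketch that computes the ratio of the two averages from the lower panel. The variable names are mine, and the only inputs are the two numbers quoted above.

```python
# Average search volumes read off the lower panel of the figure.
son_avg = 1.38
daughter_avg = 0.44

# How many times larger the "son" volume is than the "daughter" volume.
ratio = son_avg / daughter_avg
print(f"son / daughter = {ratio:.2f}")
```

The ratio comes out slightly above three.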

Okay, it seems that American parents care more about whether their sons are geniuses than whether their daughters are. What about “myself”? Do people care whether they themselves are geniuses? I compare the search volume of “Am I a genius” with the searches for son and daughter. After all, we are narcissists, kind of. The figure is shown below.

The search volume for “Am I a genius” is far higher than the search volume for son or daughter. See, we are narcissistic. Not surprising at all.

Let’s try something crazy: do people care whether their dogs are geniuses?

There are more searches about dogs than about daughters! Is a dog’s intelligence more important than a daughter’s? If this is true, it is jarring whether or not you are a feminist. But wait a minute, really? Drawing conclusions from the data is very tricky. To conclude that parents are biased, we have to assume that the distributions of intelligence among boys and girls are identical, which may not be the case. Girls actually outperform boys academically. Don’t get me wrong. I am not saying there is no gender bias these days. I am just saying the evidence from Google searches may not precisely reflect this issue.

Why are there more searches about a dog’s intelligence than a daughter’s? One explanation is that people do not apply the word “genius” equally to different subjects. That is, the criteria for identifying a “genius” are different. A dog owner may start searching whether her dog is a genius when she finds that the dog can understand simple sign language. The threshold for kids is much higher. Another reason is much simpler: there are more domestic dogs in the US than girls. According to this source, in 2017 about 89.7 million dogs lived in US households as pets, more than twice the 40.2 million girls aged 0 to 19 (see here). But when comparing “genius cat” vs. “genius dog”, there are more searches for “genius dog” than for “genius cat”, even though there are more domestic cats (95.6 million) than dogs (well, dogs have been shown to have more neurons in the brain).
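To make the population argument concrete, here is a rough sketch of a per-capita comparison. It uses only the figures quoted above (89.7 million dogs, 40.2 million girls, and the 0.44 average for the daughter query). The helper `per_capita` and the idea of a "break-even" volume are my own illustration, and the sketch assumes that search volume scales with the size of the population being searched about and that the two volumes sit on the same scale, neither of which Google Trends guarantees.

```python
# Figures quoted in the post; everything else here is illustrative.
DOGS_MILLIONS = 89.7   # pet dogs in US households, 2017
GIRLS_MILLIONS = 40.2  # girls aged 0 to 19
DAUGHTER_AVG = 0.44    # average volume for "Is my daughter a genius?"


def per_capita(volume, population_millions):
    """Search volume per million subjects (hypothetical helper)."""
    return volume / population_millions


# How many dogs there are for every girl.
population_ratio = DOGS_MILLIONS / GIRLS_MILLIONS

# Dog search volume at which the per-dog rate equals the per-girl rate.
# Assumption: both volumes are measured on the same scale.
break_even = DAUGHTER_AVG * population_ratio

print(f"dogs per girl: {population_ratio:.2f}")
print(f"break-even dog volume: {break_even:.2f}")
```

Under these assumptions, a dog search volume up to a bit more than twice the daughter volume could be explained by population size alone, without anyone caring more about dogs than about daughters.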

All in all, the book raised more questions than it addressed.