Infodemiologists Beware: Recent Changes to the Google Health Trends API Result in Incomparable Data as of 1 January 2022

Pieter Hermanus Myburgh
DOI: https://doi.org/10.3390/ijerph192215396
2022-11-21
Abstract: In an increasingly online world, many Internet users seek information through online search engines such as Google. Access to this search activity gives infodemiologists a glimpse into the collective online mind. Tools such as Google Trends and Google Health Trends (GHT) can be used to gauge search activity in key geographical regions and for specific periods of time. Recently, Google implemented changes to the GHT platform. This paper provides an initial exploration of how this change affected the data obtained from GHT. A comparison of 177 weekly probabilities for short search sessions of 421 Freebase IDs in thirty geographies, extracted from GHT both before and after the implemented change, showed a low correlation between these data for the year 2022 (median of all Spearman ρ = 0.262 [IQR 0.04; 0.53]). In general, the values extracted after the change are higher than those extracted before it. Future research using the GHT API should not interpret increases in GHT data from 1 January 2022 onward as reflecting increased search activity for a specific keyword, but should rather attribute them to the change in the GHT sampling strategy.
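The comparison summarized in the abstract (one Spearman ρ per pair of before/after series, then a median and interquartile range across all pairs) can be illustrated with a minimal sketch. This is not the author's code: the function names, the toy series, and the choice of the "inclusive" quantile method are assumptions made for illustration, and no GHT API calls are shown.

```python
from statistics import mean, median, quantiles


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(before, after):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(before), average_ranks(after)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one series is constant
    return cov / (var_x * var_y) ** 0.5


def summarize(rhos):
    """Return (median, first quartile, third quartile) of the correlations."""
    q1, _, q3 = quantiles(rhos, n=4, method="inclusive")
    return median(rhos), q1, q3


# Hypothetical weekly values for three (topic, geography) pairs, each
# extracted once before and once after the platform change.
pairs = [
    ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0]),  # same ordering
    ([1.0, 2.0, 3.0, 4.0], [5.0, 3.0, 4.0, 1.0]),  # mostly reversed
    ([1.0, 2.0, 3.0, 4.0], [3.0, 1.0, 4.0, 2.0]),  # weak agreement
]
rhos = [spearman_rho(before, after) for before, after in pairs]
med, q1, q3 = summarize(rhos)
print(f"median rho = {med:.3f} [IQR {q1:.2f}; {q3:.2f}]")
```

Because Spearman ρ depends only on rank order, a uniform upward shift in the extracted values would leave it unchanged. A low ρ therefore indicates that the week-to-week ordering itself differs between the two extractions, which is separate from the observation that the later values are generally higher.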