Can Google Trends (GT) be used to predict tourist arrivals?: FB Prophet Machine Learning (ML) for Predicting Tourist Arrivals

Indra Gunawan, Dwi Purnomo Putro, Adhika Pramita Widyassari
DOI: https://doi.org/10.56910/ictmt.v1i1.57
2023-12-31
Abstract: A major problem in tourism is preparing appropriately to serve tourists, so that resources can be saved during the low season and fully and effectively provided during the busy season. Machine learning, a branch of artificial intelligence, can be used for dataset-based forecasting. This research uses a Google Trends (GT) dataset covering 2013-2023, built from several keywords that combine city names and tourist destination names in Yogyakarta, Indonesia, and compares it with a dataset of tourist arrivals in the city of Yogyakarta obtained from the Central Statistics Agency. The machine learning model used is Facebook Prophet, which uses a Bayesian algorithm as its backend. The results show that GT data can be used to predict tourist arrivals after some adjustments to the dataset. However, accurate results require various keyword combinations for the desired destination, and we recommend adding max and mean columns to the dataset so that keywords with insufficient data do not degrade the predictions. We conclude that adding a max column can improve the COERR, MAPE and R2 values. We also found that the GT dataset forecasts best over time periods of under 200 days, and that using the GT dataset alone produces unstable COERR, MAPE and R2 values. Another finding is that a GT dataset using the YouTube filter is only suitable for Indonesia for the period after 2018, because Indonesian people's access to YouTube increased massively from that year onward and was considerably lower before it. The trend shows that searches on YouTube after 2018 have increased drastically, overtaking searches on the Google web.
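The dataset adjustment and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the keyword names and numbers are hypothetical, COERR is assumed here to mean the Pearson correlation coefficient, and the Prophet model itself is left out so that the example runs with the Python standard library alone.

```python
from math import sqrt
from statistics import mean


def add_max_mean(rows):
    """Add row-wise 'max' and 'mean' columns across all keyword scores.

    rows: list of dicts mapping keyword -> GT interest score for one period.
    A keyword with sparse data (many zeros) is then backed by the aggregate
    columns instead of standing alone.
    """
    enriched_rows = []
    for row in rows:
        scores = list(row.values())
        enriched = dict(row)
        enriched["max"] = max(scores)
        enriched["mean"] = mean(scores)
        enriched_rows.append(enriched)
    return enriched_rows


def mape(actual, predicted):
    """Mean absolute percentage error in percent (undefined if an actual is 0)."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def r2(actual, predicted):
    """Coefficient of determination."""
    a_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - a_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient (assumed meaning of COERR)."""
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical GT scores for two keywords over three periods.
gt_rows = [
    {"keyword_a": 40, "keyword_b": 0},
    {"keyword_a": 55, "keyword_b": 5},
    {"keyword_a": 70, "keyword_b": 0},
]
features = add_max_mean(gt_rows)

# Hypothetical actual vs. predicted arrivals, only to exercise the metrics.
actual = [100, 200, 300]
predicted = [110, 180, 310]
print(features[0], mape(actual, predicted), r2(actual, predicted))
```

In a fuller pipeline the `max` and `mean` columns would presumably be passed to the forecasting model as extra regressors, and the three metrics would be computed on held-out arrival figures.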