Integration of Internet search data to predict tourism trends using spatial-temporal XGBoost composite model
Junfeng Kang, Xingyu Guo, Lei Fang, Xiangrong Wang, Zhengqiu Fan
DOI: https://doi.org/10.1080/13658816.2021.1934476
2021-07-05
International Journal of Geographical Information Science
Abstract: Tourism trend prediction supports the estimation of tourism investment and revenue. Studies on tourism prediction have primarily relied on linear models and historical visitor numbers; however, the relationships between tourism trends and their influencing factors may be nonlinear. This study constructed factors from internet search data and predicted tourism trends using a spatiotemporal framework based on the extreme gradient boosting (XGBoost) method. The study first sorted Baidu Index data, which are computed by weighting search frequency. Spatial cluster analysis was then conducted to incorporate spatial characteristics, and principal component analysis was performed to identify factors. Next, variables were derived using the weighted moving average method to reduce the lag effect between tourism-related internet searches and actual behavior. We applied the proposed spatiotemporal XGBoost composite model to predict Beijing's tourism trends. The R² scores of the simple XGBoost model, the autoregressive integrated moving average (ARIMA) model, the spatial XGBoost model, and the spatiotemporal XGBoost composite model were 0.517, 0.625, 0.791, and 0.940, respectively. Of the models compared, the spatiotemporal XGBoost composite model had the best predictive ability. The findings also suggest that machine learning methods may not perform well unless spatial properties, such as spatial autocorrelation and spatial heterogeneity, are considered.
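The weighted moving average step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the window length or the weights used, so the function name `weighted_moving_average`, the weights `[1, 2, 3]`, and the sample values below are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def weighted_moving_average(series, weights):
    """Trailing weighted moving average.

    weights[-1] applies to the current observation and weights[0] to the
    oldest observation in the window. Positions that do not yet have a
    full window are returned as None.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    total = float(sum(weights))
    if total == 0:
        raise ValueError("weights must not sum to zero")
    k = len(weights)
    smoothed = []
    for i in range(len(series)):
        if i + 1 < k:
            smoothed.append(None)  # not enough history for a full window
            continue
        window = series[i + 1 - k : i + 1]
        smoothed.append(sum(w * x for w, x in zip(weights, window)) / total)
    return smoothed


# Hypothetical daily search-index values (illustrative, not data from the study).
search_index = [100, 120, 90, 150, 130]
# Assumed weights: the most recent day counts three times as much as the oldest.
smoothed_index = weighted_moving_average(search_index, weights=[1, 2, 3])
```

Giving larger weights to recent observations is one common choice when a search signal is expected to lead actual visits by a short interval; other weightings would fit the same function.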
geography, physical; computer science, information systems; information science & library science