Forecasting Stock Price Movements with Multiple Data Sources: Evidence from Stock Market in China

Zhongbao Zhou, Meng Gao, Qing Liu, Helu Xiao
DOI: https://doi.org/10.1016/j.physa.2019.123389
IF: 3.778
2019-01-01
Physica A: Statistical Mechanics and its Applications
Abstract: We employ multiple heterogeneous data sources, including historical transaction data, technical indicators, stock posts, news and the Baidu index, to predict the direction of stock price movements. We focus on the distinct prediction patterns of active and inactive stocks, and we examine the predictive power of the support vector machine (SVM) at different levels of activity for a single stock. We construct a total of 14 data source combinations from these 5 heterogeneous data sources and choose three forecasting horizons (1 day, 2 days and 3 days) to investigate how well stock price movements in the China A-share market can be forecast under different data source combinations and forecasting horizons. We find that the optimal data source combinations differ between active and inactive stocks. Active stocks achieve the highest accuracy when multiple non-traditional data sources are combined, while inactive stocks achieve the highest accuracy when traditional data sources are combined with non-traditional ones. We further divide each stock's history into inactive, active and very active periods, and compare the forecasts for the same stock across these periods. We conclude that, for most combinations of data sources, the more active the stock is, the higher the accuracy we achieve, which indicates that our approach is more powerful for predicting the price movements of stocks in active and very active periods.
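The experimental grid described in the abstract (data source combinations crossed with forecasting horizons) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the short source identifiers, the helper names `direction_labels` and `candidate_combinations`, and the labelling rule (up if the close h days ahead exceeds today's close). The abstract does not say which 14 combinations the authors evaluate, so the sketch enumerates all 31 non-empty subsets of the 5 sources as a superset.

```python
from itertools import combinations

# Source names follow the abstract; the short identifiers are illustrative only.
DATA_SOURCES = [
    "transactions",
    "technical_indicators",
    "stock_posts",
    "news",
    "baidu_index",
]
HORIZONS = (1, 2, 3)  # forecasting horizons in trading days, as in the abstract


def direction_labels(closes, horizon):
    """Return 1 where the close `horizon` days ahead is higher than today's, else 0.

    Assumed labelling rule, not taken from the paper. The last `horizon`
    days have no future price to compare against, so they get no label.
    """
    return [
        1 if closes[t + horizon] > closes[t] else 0
        for t in range(len(closes) - horizon)
    ]


def candidate_combinations(sources, min_size=1):
    """Enumerate every subset of `sources` with at least `min_size` members."""
    return [
        combo
        for size in range(min_size, len(sources) + 1)
        for combo in combinations(sources, size)
    ]


if __name__ == "__main__":
    closes = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6]  # made-up closing prices
    for h in HORIZONS:
        print(h, direction_labels(closes, h))
    print(len(candidate_combinations(DATA_SOURCES)))
```

In a full pipeline, each (combination, horizon) pair would supply the feature columns and target labels for one classifier fit, for example an SVM as in the paper; that step is omitted here because the abstract gives no kernel or hyperparameter details.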