Forecasting the movements of Bitcoin prices: an application of machine learning algorithms

Hakan Pabuccu, Serdar Ongan, Ayse Ongan
DOI: https://doi.org/10.3934/QFE.2020031
2023-03-08
Abstract: Cryptocurrencies, such as Bitcoin, are among the most controversial and complex technological innovations in today's financial system. This study aims to forecast the movements of Bitcoin prices with a high degree of accuracy. To this end, four different Machine Learning (ML) algorithms are applied, namely Support Vector Machines (SVM), the Artificial Neural Network (ANN), Naive Bayes (NB) and Random Forest (RF), with logistic regression (LR) as a benchmark model. To test these algorithms, a discrete dataset was created and used alongside the existing continuous dataset. To evaluate algorithm performance, the F statistic, the accuracy statistic, the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE) and the Root Absolute Error (RAE) were used. The t test was used to compare the performance of the SVM, ANN, NB and RF with that of the LR. Empirical findings reveal that the RF has the highest forecasting performance in the continuous dataset and the NB the lowest, whereas in the discrete dataset the ANN has the highest performance and the NB the lowest. Furthermore, the discrete dataset improves the overall forecasting performance of all algorithms (models) estimated.
Computational Finance, Machine Learning
What problem does this paper attempt to address?
The paper aims to forecast the movements of Bitcoin prices with high accuracy. To this end, the study applies four machine learning (ML) algorithms: Support Vector Machines (SVM), Artificial Neural Network (ANN), Naive Bayes (NB), and Random Forest (RF), with Logistic Regression (LR) as the benchmark model. Alongside the existing continuous dataset, the study creates a discrete dataset, and it compares the algorithms using the F statistic, accuracy, MAE, RMSE, and RAE, with a t test against the LR benchmark. Random Forest had the highest forecasting performance on the continuous dataset and the Artificial Neural Network on the discrete dataset, while Naive Bayes had the lowest on both. Additionally, the discrete dataset improved the overall forecasting performance of all estimated models.
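As a rough illustration of how such evaluation metrics can be computed for an up/down price-movement classifier, the following is a minimal sketch, not the authors' code. It assumes that the "F statistic" refers to the F-measure (F1 score), that labels are coded 1 for "up" and 0 for "down", and that MAE and RMSE are taken between the 0/1 label and the predicted probability of "up". The sample labels and probabilities are made up for the example, and RAE is omitted because this text does not define it.

```python
import math

def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mae(y_true, y_prob):
    """Mean absolute error between 0/1 labels and predicted probabilities."""
    return sum(abs(t - p) for t, p in zip(y_true, y_prob)) / len(y_true)

def rmse(y_true, y_prob):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))

# Hypothetical data: 1 = price moved up, 0 = price moved down.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.8, 0.7, 0.3]  # illustrative model outputs
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]     # threshold at 0.5

results = {
    "accuracy": accuracy(y_true, y_pred),
    "f_measure": f_measure(y_true, y_pred),
    "mae": mae(y_true, y_prob),
    "rmse": rmse(y_true, y_prob),
}
print(results)
```

Computing the same metrics for each model on the same test data gives the kind of side-by-side comparison the summary describes, where higher accuracy and F-measure and lower MAE and RMSE indicate better forecasts.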