Assessing the Impact of Technical Indicators on Machine Learning Models for Stock Price Prediction

Akash Deep, Chris Monico, Abootaleb Shirvani, Svetlozar Rachev, Frank J. Fabozzi
2024-12-20
Abstract: This study evaluates the performance of random forest regression models enhanced with technical indicators for high-frequency stock price prediction. Using minute-level SPY data, we assessed 13 models that incorporate technical indicators such as Bollinger bands, exponential moving average, and Fibonacci retracement. While these models improved risk-adjusted performance metrics, they struggled with out-of-sample generalization, highlighting significant overfitting challenges. Feature importance analysis revealed that primary price-based features consistently outperformed technical indicators, suggesting their limited utility in high-frequency trading contexts. These findings challenge the weak form of the efficient market hypothesis, identifying short-lived inefficiencies during volatile periods but with limited persistence across market regimes. The study emphasizes the need for selective feature engineering, adaptive modeling, and a stronger focus on risk-adjusted performance metrics to navigate the complexities of high-frequency trading environments.
Computational Finance, Risk Management
What problem does this paper attempt to address?
This paper evaluates how technical indicators affect the performance of machine learning models in high-frequency stock price prediction. Specifically, the research aims to:

1. **Evaluate the impact of technical indicators on the random forest regression model**: By introducing technical indicators such as Bollinger Bands, Exponential Moving Average (EMA), and Fibonacci Retracement, the study tests whether these indicators improve the model's predictive ability.
2. **Improve risk-adjusted performance**: Beyond traditional prediction accuracy, the research examines the model's risk-management ability, using advanced risk-return measures such as the Rachev Ratio and the Gain-Loss Ratio to evaluate performance.
3. **Address the overfitting challenge**: Although technical indicators improve some risk-adjusted performance measures, the models overfit significantly and generalize poorly out of sample. The research therefore looks at how to avoid overfitting and improve generalization.
4. **Explore the applicability of the Efficient Market Hypothesis (EMH)**: The results challenge the weak-form EMH, revealing market inefficiencies that can be exploited during volatile periods, but these inefficiencies do not persist across market states.
5. **Propose improvement strategies**: Based on these findings, the research emphasizes selective feature engineering, adaptive modeling, and greater attention to risk-adjusted performance measures to handle the complexity of the high-frequency trading environment.
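To make the first aim more concrete, here is a minimal sketch of how a few of the named inputs (log returns, SMA, EMA, Bollinger Bands, and the per-minute risk-free rate) could be computed from a list of closing prices. It follows the standard textbook definitions and is not the authors' code. The function names, the EMA seed (the first close), the use of the population standard deviation, and the sample prices are all assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev


def log_returns(prices):
    """Log return log(P_t / P_{t-1}) for each consecutive pair of prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def sma(closes, n):
    """Simple moving average of the last n closes; None until n values exist."""
    return [
        None if t + 1 < n else fmean(closes[t + 1 - n:t + 1])
        for t in range(len(closes))
    ]


def ema(closes, n):
    """EMA with alpha = 2 / (n + 1), seeded with the first close (assumption)."""
    alpha = 2.0 / (n + 1)
    out = [closes[0]]
    for c in closes[1:]:
        out.append(alpha * c + (1 - alpha) * out[-1])
    return out


def bollinger(closes, n, k=2.0):
    """(lower, upper) bands: SMA_n -/+ k * rolling population std (assumption)."""
    bands = []
    for t in range(len(closes)):
        if t + 1 < n:
            bands.append(None)  # not enough history for a full window yet
            continue
        window = closes[t + 1 - n:t + 1]
        mid, sd = fmean(window), pstdev(window)
        bands.append((mid - k * sd, mid + k * sd))
    return bands


def per_minute_rate(r_daily, minutes_per_day=1440):
    """Convert a daily risk-free rate to a compounding-equivalent per-minute rate."""
    return (1 + r_daily) ** (1 / minutes_per_day) - 1


closes = [100.0, 101.0, 102.0, 101.0, 103.0]  # made-up prices, illustration only
features = {
    "log_return": log_returns(closes),
    "sma_3": sma(closes, 3),
    "ema_3": ema(closes, 3),
    "bb_3": bollinger(closes, 3),
}
```

Each series could then be aligned by timestamp and passed, together with the raw price features, to a regression model. Entries of `None` mark the warm-up period where a full window is not yet available and would normally be dropped before fitting.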
### Formula Summary

- **Logarithmic Return**: \[ \log\mathrm{return}(P_t) = \log\left(\frac{P_t}{P_{t-1}}\right) \]
- **Rolling Z-Score**: \[ \mathrm{volz}(t) = \frac{\mathrm{volume}_t - \mathrm{mean}(\mathrm{volume})}{\mathrm{std}(\mathrm{volume})} \]
- **Per-Minute Risk-Free Interest Rate Conversion**: \[ r_{\text{per-minute}} = (1 + r_{\text{daily}})^{\frac{1}{1440}} - 1 \]
- **Simple Moving Average (SMA)**: \[ SMA_N(t) = \frac{1}{N}\sum_{i=0}^{N-1} C_{t-i} \]
- **Exponential Moving Average (EMA)**: \[ EMA_t = \alpha C_t + (1-\alpha)\,EMA_{t-1}, \quad \alpha = \frac{2}{N+1} \]
- **Relative Strength Index (RSI)**: \[ RSI_t = 100 - \frac{100}{1 + \frac{\mathrm{avggain}_t}{\mathrm{avgloss}_t}} \]
- **Bollinger Bands (BBs)**: \[ UBB_t = SMA_N(t) + 2\sigma_t, \quad LBB_t = SMA_N(t) - 2\sigma_t \]
- **Stochastic Oscillator (SO)**: \[ \%K_t = 100 \times \frac{C_t - L_{14}(t)}{H_{14}(t) - L_{14}(t)} \]
- **Root Mean Square Error (RMSE)**: \[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2} \]
- **Sharpe Ratio**: \[ Sharpe Ratio