High-Frequency Trading Liquidity Analysis | Application of Machine Learning Classification

Sid Bhatia, Sidharth Peri, Sam Friedman, Michelle Malen
2024-08-19
Abstract: This research presents a comprehensive framework for analyzing liquidity in financial markets, particularly in the context of high-frequency trading. By leveraging advanced machine learning classification techniques, including Logistic Regression, Support Vector Machine, and Random Forest, the study aims to predict minute-level price movements using an extensive set of liquidity metrics derived from the Trade and Quote (TAQ) data. The findings reveal that employing a broad spectrum of liquidity measures yields higher predictive accuracy compared to models utilizing a reduced subset of features. Key liquidity metrics, such as Liquidity Ratio, Flow Ratio, and Turnover, consistently emerged as significant predictors across all models, with the Random Forest algorithm demonstrating superior accuracy. This study not only underscores the critical role of liquidity in market stability and transaction costs but also highlights the complexities involved in short-interval market predictions. The research suggests that a comprehensive set of liquidity measures is essential for accurate prediction, and proposes future work to validate these findings across different stock datasets to assess their generalizability.
Trading and Market Microstructure
What problem does this paper attempt to address?
The main objective of this paper is to develop a robust framework for analyzing market liquidity using High-Frequency Trading (HFT) data. Specifically, the research aims to address the following key issues:

1. **Liquidity Risk Identification and Management**: Identify liquidity risks through the analysis of high-frequency trading data and propose effective management strategies.
2. **Statistical Model Construction**: Create statistical models based on liquidity analysis to better understand and predict market dynamics.
3. **Financial Network Evaluation**: Generate new input variables for comprehensive financial network evaluation.

To achieve these goals, the research team applied machine learning classification algorithms (logistic regression, support vector machines, and random forests) and validated them on extensive datasets.

The results show that models using a wide range of liquidity indicators achieve higher predictive accuracy than models using only a limited set of indicators. In particular, the Liquidity Ratio, Flow Ratio, and Turnover were the most influential indicators, which suggests that these metrics are crucial for predicting changes in market direction. The study also suggests extending the analysis to different stock datasets to verify its generalizability under various market conditions.
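To make the prediction setup concrete, the following is a minimal sketch of how minute-level samples for such a classifier might be assembled. It is not the authors' pipeline. The trade tuple layout `(timestamp_seconds, price, volume)` is a hypothetical stand-in for TAQ records, and `turnover` is assumed here to be the sum of price times volume within a minute; the paper's own definitions of Liquidity Ratio, Flow Ratio, and Turnover are not given in this summary and may differ.

```python
from collections import defaultdict


def minute_bars(trades):
    """Aggregate (timestamp_seconds, price, volume) trades into per-minute bars.

    The tuple layout is a hypothetical simplification, not the TAQ schema.
    """
    buckets = defaultdict(list)
    for ts, price, volume in trades:
        buckets[ts // 60].append((ts, price, volume))

    bars = []
    for minute in sorted(buckets):
        rows = sorted(buckets[minute])  # order trades by timestamp
        bars.append({
            "minute": minute,
            "close": rows[-1][1],  # price of the last trade in the minute
            "volume": sum(v for _, _, v in rows),
            # Assumed definition: traded value (price * volume) in the minute.
            "turnover": sum(p * v for _, p, v in rows),
        })
    return bars


def labelled_samples(bars):
    """Pair each minute's features with the direction of the next minute's close."""
    samples = []
    for cur, nxt in zip(bars, bars[1:]):
        features = (cur["volume"], cur["turnover"])
        label = 1 if nxt["close"] > cur["close"] else 0  # 1 = up, 0 = flat or down
        samples.append((features, label))
    return samples


# Toy data for illustration only, not real market data.
trades = [
    (0, 100.0, 10), (30, 100.5, 5),
    (60, 101.0, 20), (90, 100.8, 10),
    (120, 100.2, 8),
]
samples = labelled_samples(minute_bars(trades))
```

Each element of `samples` is a `(features, label)` pair that could be passed to any of the three classifiers named above. Comparing a model trained on the full feature tuple with one trained on a subset of it would mirror the broad-versus-reduced comparison the paper reports.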