Markov-switching decision trees

Timo Adam, Marius Ötting, Rouven Michels
DOI: https://doi.org/10.1007/s10182-024-00501-6
2024-05-30
AStA Advances in Statistical Analysis
Abstract: Decision trees constitute a simple yet powerful and interpretable machine learning tool. While tree-based methods are designed only for cross-sectional data, we propose an approach that combines decision trees with time series modeling and thereby bridges the gap between machine learning and statistics. In particular, we combine decision trees with hidden Markov models where, for any time point, an underlying (hidden) Markov chain selects the tree that generates the corresponding observation. We propose an estimation approach that is based on the expectation-maximisation algorithm and assess its feasibility in simulation experiments. In our real-data application, we use eight seasons of National Football League (NFL) data to predict play calls conditional on covariates, such as the current quarter and the score, where the model's states can be linked to the teams' strategies. R code that implements the proposed method is available on GitHub.
statistics & probability
What problem does this paper attempt to address?
The paper addresses a limitation of decision trees when they are applied to time series data, in particular data that exhibit state-switching and serial correlation. Standard decision trees are designed for cross-sectional data and treat observations as independent of each other, so they cannot capture temporal structure such as trends or cyclical fluctuations. To overcome this, the authors combine decision trees with hidden Markov models (HMMs), yielding Markov-switching decision trees: at each time point, a hidden Markov chain selects the decision tree that generates the observation. According to the summary, this improves model fit, predictive accuracy, and interpretability, since the hidden states can be linked to distinct regimes. The model is estimated with an approach based on the expectation-maximisation algorithm, whose feasibility is assessed in simulation experiments. In a real-data application, the method is applied to eight seasons of National Football League (NFL) data to predict play calls conditional on covariates such as the current quarter and the score, with the model's states linked to the teams' strategies. The authors also provide R code that implements the method.
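The data-generating mechanism described above (a hidden Markov chain that selects, at each time point, which tree produces the observation) can be illustrated with a minimal Python sketch. It is not the authors' R implementation and does not perform the EM-based estimation. The two hand-written trees, the binary pass/run coding of a play call, the covariate names `score_diff` and `quarter`, and all probabilities are hypothetical values chosen for illustration. The sketch simulates from a two-state model and evaluates the log-likelihood with the standard scaled forward algorithm for HMMs, treating the trees and transition probabilities as known.

```python
import math
import random

# Hypothetical state-dependent "trees": each maps covariates to P(y = 1),
# where y = 1 is taken to mean a pass and y = 0 a run (an assumption).
def tree_state_0(x):
    if x["score_diff"] < 0:
        return 0.7 if x["quarter"] >= 4 else 0.5
    return 0.3

def tree_state_1(x):
    if x["score_diff"] < 0:
        return 0.9
    return 0.6

TREES = [tree_state_0, tree_state_1]
GAMMA = [[0.9, 0.1], [0.2, 0.8]]  # assumed transition probability matrix
DELTA = [0.5, 0.5]                # assumed initial state distribution

def simulate(covariates, seed=0):
    """Draw a hidden state path and binary observations."""
    rng = random.Random(seed)
    states, ys = [], []
    state = 0 if rng.random() < DELTA[0] else 1
    for t, x in enumerate(covariates):
        if t > 0:
            # The Markov chain moves first, then the active tree emits y.
            state = 0 if rng.random() < GAMMA[state][0] else 1
        p = TREES[state](x)
        ys.append(1 if rng.random() < p else 0)
        states.append(state)
    return states, ys

def log_likelihood(covariates, ys):
    """Scaled forward algorithm; returns (log-likelihood, filtered state probs)."""
    n_states = len(TREES)
    alpha = DELTA[:]
    ll = 0.0
    for t, (x, y) in enumerate(zip(covariates, ys)):
        if t > 0:
            alpha = [sum(alpha[i] * GAMMA[i][j] for i in range(n_states))
                     for j in range(n_states)]
        # State-dependent Bernoulli probability of the observed y under each tree.
        dens = [TREES[j](x) if y == 1 else 1.0 - TREES[j](x)
                for j in range(n_states)]
        alpha = [alpha[j] * dens[j] for j in range(n_states)]
        scale = sum(alpha)
        ll += math.log(scale)
        alpha = [a / scale for a in alpha]
    return ll, alpha

if __name__ == "__main__":
    covs = [{"score_diff": random.Random(t).choice([-7, 0, 7]),
             "quarter": 1 + t // 25} for t in range(100)]
    states, ys = simulate(covs, seed=1)
    ll, filtered = log_likelihood(covs, ys)
    print("log-likelihood:", ll)
    print("filtered state probabilities at the last time point:", filtered)
```

In an actual fit, the trees and the transition probabilities would be estimated from the data rather than fixed by hand; the forward recursion shown here is only the likelihood-evaluation building block that such estimation procedures typically rely on.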