Predicting play calls in the National Football League using hidden Markov models

Marius Ötting
DOI: https://doi.org/10.48550/arXiv.2003.10791
2020-03-24
Abstract: In recent years, data-driven approaches have become a popular tool in a variety of sports to gain an advantage by, e.g., analysing potential strategies of opponents. Whereas the availability of play-by-play or player tracking data in sports such as basketball and baseball has led to an increase in sports analytics studies, equivalent datasets for the National Football League (NFL) were not freely available for a long time. In this contribution, we consider a comprehensive play-by-play NFL dataset provided by http://www.kaggle.com, comprising 289,191 observations in total, to predict play calls in the NFL using hidden Markov models. The resulting out-of-sample prediction accuracy for the 2018 NFL season is 71.5%, which is substantially higher than that reported in similar studies on play call prediction in the NFL.
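To illustrate how a hidden Markov model can produce a play-call forecast, here is a minimal sketch of the standard forward recursion for a two-state HMM with Bernoulli emissions (1 = pass, 0 = run). It is not the paper's model: the abstract does not give the model specification, so the function name `forecast_pass_probability`, the two-state structure, and all parameter values below are assumptions chosen purely for illustration.

```python
def forecast_pass_probability(plays, delta, gamma, pass_prob):
    """One-step-ahead P(next play is a pass) under a Bernoulli HMM.

    plays     : observed play calls so far, 1 = pass, 0 = run
    delta     : initial state distribution
    gamma     : transition matrix, gamma[i][j] = P(next state j | state i)
    pass_prob : pass_prob[i] = P(pass | state i)
    """
    n = len(delta)
    if not plays:
        # No history yet: forecast the first play from the initial distribution.
        return sum(delta[j] * pass_prob[j] for j in range(n))

    # alpha holds the normalised forward probabilities, i.e. the
    # distribution of the current hidden state given the plays so far.
    alpha = list(delta)
    for t, y in enumerate(plays):
        if t > 0:
            # Propagate the state distribution through the transition matrix.
            alpha = [sum(alpha[i] * gamma[i][j] for i in range(n)) for j in range(n)]
        # Weight each state by the likelihood of the observed play.
        alpha = [
            alpha[j] * (pass_prob[j] if y == 1 else 1.0 - pass_prob[j])
            for j in range(n)
        ]
        total = sum(alpha)
        alpha = [a / total for a in alpha]

    # Predict the next hidden state, then average the emission probabilities.
    next_state = [sum(alpha[i] * gamma[i][j] for i in range(n)) for j in range(n)]
    return sum(next_state[j] * pass_prob[j] for j in range(n))


# Hypothetical parameters: state 0 is "run-heavy", state 1 is "pass-heavy".
delta = [0.5, 0.5]
gamma = [[0.9, 0.1], [0.2, 0.8]]
pass_prob = [0.3, 0.7]

p = forecast_pass_probability([1, 1, 0, 1], delta, gamma, pass_prob)
predicted_call = "pass" if p > 0.5 else "run"
```

Thresholding the forecast probability at 0.5 yields a predicted play call, and comparing such predictions with the calls actually made on held-out plays gives an out-of-sample accuracy of the kind reported above. In practice the parameters would be estimated from data rather than fixed by hand.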
Applications