iWinRNFL: A Simple, Interpretable & Well-Calibrated In-Game Win Probability Model for NFL

Konstantinos Pelechrinis
DOI: https://doi.org/10.48550/arXiv.1704.00197
2017-04-01
Applications
Abstract: During the last few sports seasons, a lot of discussion has been generated about the several high-profile "comebacks" that were observed in almost all sports. The Cavaliers won the championship after being down 3-1 in the 2016 NBA Finals series against the Golden State Warriors, which was also the case for the Chicago Cubs in the World Series. The Patriots won the Super Bowl of the 2016 season even though they were trailing by 25 points late in the third quarter, while FC Barcelona, in the round of 16 of the 2016-17 Champions League, scored 3 goals during the last 7 minutes of the game (including stoppage time) against PSG to advance in the tournament. This has brought the robustness and accuracy of the various probabilistic prediction models under high scrutiny. Many of these models are proprietary, which makes them hard to evaluate. In this paper, we build a simple and open, yet robust and well-calibrated, in-game probability model for predicting the winner of an NFL game (iWinRNFL). In particular, we build a logistic regression model that uses a set of 10 variables to predict the running win probability for the home team. We train our model using detailed play-by-play data from the last 7 NFL seasons, obtained through the league's API. Our results indicate that in 75% of the cases iWinRNFL provides an accurate winner projection, compared to the 63% accuracy of a baseline pre-game win probability model. Most importantly, the probabilities that iWinRNFL provides are well-calibrated. Finally, we have also evaluated more complex, non-linear models using the same set of features, without any significant improvement in performance.
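To make the two central ideas of the abstract concrete (a logistic regression that outputs a running home-team win probability, and a calibration check on those probabilities), the following is a minimal sketch in plain Python. It is not the iWinRNFL model: the abstract does not list its 10 variables, so the three features used here (score differential, fraction of game remaining, home possession), the synthetic data generator and its coefficients, and all function names are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Home-team win probability for one game state x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            err = predict(weights, bias, x) - label
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def calibration_table(probs, outcomes, n_bins=10):
    """Per probability bin: (mean predicted prob, observed win rate, count)."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)
        sums[b][0] += p
        sums[b][1] += o
        sums[b][2] += 1
    return [(s[0] / s[2], s[1] / s[2], s[2]) for s in sums if s[2] > 0]


# Synthetic game states (assumption: NOT real NFL play-by-play data).
# Hypothetical features: [score differential in units of 10 points
# (home minus away), fraction of game remaining, 1 if home has the ball].
rng = random.Random(0)
X, y = [], []
for _ in range(1000):
    x = [rng.uniform(-2, 2), rng.uniform(0, 1), float(rng.random() < 0.5)]
    # Arbitrary "true" coefficients, chosen only to generate labels.
    true_p = sigmoid(2.0 * x[0] - 0.2 * x[1] + 0.4 * x[2])
    X.append(x)
    y.append(1 if rng.random() < true_p else 0)

weights, bias = fit_logistic(X, y)
probs = [predict(weights, bias, x) for x in X]
table = calibration_table(probs, y)

for mean_pred, win_rate, count in table:
    print(f"predicted {mean_pred:.2f}  observed {win_rate:.2f}  n={count}")
```

A model is well-calibrated when, in each row of the printed table, the mean predicted probability is close to the observed win rate. Evaluating on the training data, as done here for brevity, is optimistic; a held-out set of games would give a fairer picture.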