Towards smart-data: Improving predictive accuracy in long-term football team performance

Anthony Constantinou, Norman Fenton
DOI: https://doi.org/10.1016/j.knosys.2017.03.005
2017-05-01
Abstract: Despite recent promising developments with large datasets and machine learning, discovering all key relationships between factors of interest through automation alone remains a challenging task. Indeed, in many real-world domains, experts can often understand and identify key relationships that data alone may fail to discover, no matter how large the dataset. Hence, while pure machine learning provides obvious benefits, these benefits may come at a cost of accuracy. Here we focus on what we call smart-data: a method which supports data engineering and knowledge engineering approaches that put greater emphasis on applying causal knowledge and real-world ‘facts’ to the process of model development, driven by what data are really required for prediction, rather than by what data are available. We demonstrate how we exploited knowledge to develop a model that generates accurate predictions of the evolving performance of football teams based on limited data. The model enables us to predict, before a season starts, the total league points a team is expected to accumulate throughout the season. The results compare favourably against a number of other relevant and different types of models, and are on par with some other models which use far more data. The model results also provide a novel and comprehensive attribution study of the factors most influencing change in team performance, and partly address the cause of the widely accepted favourite-longshot bias observed in bookies' odds.
computer science, artificial intelligence