Exploring CrossFit performance prediction and analysis via extensive data and machine learning

Byunggul Lim, Wook Song
DOI: https://doi.org/10.23736/S0022-4707.24.15786-6
Abstract: Background: The analysis of athletic performance has long attracted great interest from sports scientists. This study used machine learning methods to build predictive models from a comprehensive CrossFit (CF) dataset, aiming to reveal the factors that influence performance and to identify emerging trends. Methods: Random forest (RF) and multiple linear regression (MLR) were employed to predict performance in four key weightlifting exercises within CF: clean and jerk, snatch, back squat, and deadlift. Model performance was evaluated using R-squared (R²) values and mean squared error (MSE). Feature importance analysis was conducted using RF, XGBoost, and AdaBoost models. Results: The RF model performed best in deadlift prediction (R²=0.80), while the MLR model achieved its highest accuracy in clean and jerk prediction (R²=0.93). Across exercises, clean and jerk consistently emerged as a crucial predictor. The feature importance analysis revealed intricate relationships among exercises, with gender significantly affecting deadlift performance. Conclusions: This research advances our understanding of performance prediction in CF through machine learning techniques. It provides actionable insights for practitioners, helps optimize performance, and demonstrates the potential for future advancements in data-driven sports analytics.
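The two evaluation metrics named in the Methods can be illustrated with a minimal sketch. It assumes the standard definitions, MSE as the mean of squared residuals and R² as 1 - SS_res / SS_tot. The function names and the lift values below are hypothetical and are not taken from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical deadlift loads in kg (illustrative only, not data from the study).
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 120.0, 130.0, 160.0]

print("MSE:", mean_squared_error(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

For these made-up values the MSE is 50 (in kg squared) and R² is 0.9. An R² of 0.80 or 0.93, as reported in the Results, means the model accounts for 80% or 93% of the variance in the observed lift.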