Development and Validation of a Machine Learning Wrist-worn Step Detection Algorithm with Deployment in the UK Biobank

Scott R Small, Shing Chan, Rosemary Walmsley, Lennart von Fritsch, Aidan Acquah, Gert Mertes, Benjamin G Feakins, Andrew Creagh, Adam Strange, Charles E Matthews, David A Clifton, Andrew J Price, Sara Khalid, Derrick Bennett, Aiden Doherty
DOI: https://doi.org/10.1101/2023.02.20.23285750
2023-02-24
medRxiv
Abstract: Background: Step count is an intuitive measure of physical activity frequently quantified in a range of health-related studies; however, accurate quantification of step count can be difficult in the free-living environment, with step counting error routinely above 20% in both consumer and research-grade wrist-worn devices. This study aims to describe the development and validation of step count derived from a wrist-worn accelerometer, and to assess its association with cardiovascular and all-cause mortality in a large prospective cohort study. Methods: We developed and externally validated a hybrid step detection model that involves self-supervised machine learning, trained on a new ground-truth-annotated, free-living step count dataset (OxWalk, n=39, aged 19-81) and tested against other open-source step counting algorithms. This model was applied to ascertain daily step counts from raw wrist-worn accelerometer data of 75,493 UK Biobank participants without a prior history of cardiovascular disease (CVD) or cancer. Cox regression was used to obtain hazard ratios and 95% confidence intervals for the association of daily step count with fatal CVD and all-cause mortality after adjustment for potential confounders. Findings: The novel step algorithm demonstrated a mean absolute percent error of 12.5% in free-living validation, detecting 98.7% of true steps and substantially outperforming other recent wrist-worn, open-source algorithms. Our data are indicative of an inverse dose-response association, where, for example, taking 6,596 to 8,474 steps per day was associated with a 39% [24-52%] and 27% [16-36%] lower risk of fatal CVD and all-cause mortality, respectively, compared to those taking fewer steps each day. Interpretation: An accurate measure of step count was ascertained using a machine learning pipeline that demonstrates state-of-the-art accuracy in internal and external validation. The expected associations with CVD and all-cause mortality indicate excellent face validity. This algorithm can be used widely for other studies that have utilised wrist-worn accelerometers and an open-source pipeline is provided to facilitate implementation.
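As an illustration of the headline accuracy metric, the following is a minimal sketch of how a mean absolute percent error could be computed from paired true and predicted step counts. The abstract does not state the exact definition used in the study, so this assumes the error is computed per participant and then averaged; the function name and the example numbers are hypothetical and are not taken from the OxWalk data.

```python
def mean_absolute_percent_error(true_counts, predicted_counts):
    """Average of |predicted - true| / true across participants, in percent.

    Assumption: one (true, predicted) pair per participant, and every
    true count is greater than zero.
    """
    if len(true_counts) != len(predicted_counts):
        raise ValueError("inputs must have the same length")
    if not true_counts:
        raise ValueError("inputs must not be empty")
    if any(t <= 0 for t in true_counts):
        raise ValueError("true counts must be positive")

    percent_errors = [
        abs(p - t) / t * 100.0
        for t, p in zip(true_counts, predicted_counts)
    ]
    return sum(percent_errors) / len(percent_errors)


# Hypothetical annotated and algorithm-derived step counts for three people.
true_steps = [1000, 2000, 4000]
predicted_steps = [1100, 1800, 4000]

mape = mean_absolute_percent_error(true_steps, predicted_steps)
print(f"MAPE: {mape:.2f}%")
```

Because over-counts and under-counts both contribute positively, this metric can be nonzero even when the total number of steps detected across a cohort is close to the true total.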