Using unsupervised machine learning to quantify physical activity from accelerometry in a diverse and rapidly changing population
Christopher B. Thornton, Niina Kolehmainen, Kianoush Nazarpour
DOI: https://doi.org/10.1371/journal.pdig.0000220
2023-04-06
PLOS Digital Health
Abstract: Accelerometers are widely used to measure physical activity behaviour, including in children. The traditional method for processing acceleration data uses cut points to define physical activity intensity, relying on calibration studies that relate the magnitude of acceleration to energy expenditure. However, these relationships do not generalise across diverse populations and hence must be parametrised for each subpopulation (e.g., age groups), which is costly and makes studies across diverse populations and over time difficult. A data-driven approach that allows physical activity intensity states to emerge from the data, without relying on parameters derived from external populations, offers a new perspective on this problem and potentially improved results. We applied an unsupervised machine learning approach, namely a hidden semi-Markov model, to segment and cluster the raw accelerometer data recorded (using a waist-worn ActiGraph GT3X+) from 279 children (9–38 months old) with a diverse range of developmental abilities (measured using the Paediatric Evaluation of Disability Inventory–Computer Adaptive Testing measure, PEDI-CAT). We benchmarked this analysis against the cut points approach, calculated using thresholds from the literature which had been validated using the same device and for a population which most closely matched ours. Time spent active as measured by the unsupervised approach correlated more strongly with PEDI-CAT measures of the child's mobility (R²: 0.51 vs 0.39), social-cognitive capacity (R²: 0.32 vs 0.20), responsibility (R²: 0.21 vs 0.13), daily activity (R²: 0.35 vs 0.24), and age (R²: 0.15 vs 0.1) than that measured using the cut points approach. Unsupervised machine learning offers the potential to provide a more sensitive, appropriate, and cost-effective approach to quantifying physical activity behaviour in diverse populations, compared to the current cut points approach. This, in turn, supports research that is more inclusive of diverse or rapidly changing populations.

Physical activity participation in young children has often been measured using parent reports. Accelerometry provides a more objective measurement, but the traditional methods used to quantify it require calibration and struggle to generalise to diverse or rapidly changing populations such as young children. In recent years, unsupervised machine learning methods have been shown to be able to segment and cluster accelerometry data, allowing categories of activity intensity to emerge from a data-driven process. Here we show that an unsupervised machine learning technique (the hidden semi-Markov model) can be used to estimate categories of activity intensity in accelerometry data recorded from a diverse population of children aged 9–36 months. We also show that this approach better captures the variance of movement abilities in the population than the traditional cut points approach. The hidden semi-Markov model provides a more effective way of processing and analysing accelerometer data in rapidly changing and diverse populations such as young children than the more traditional cut points approach. As it does not require calibration studies to incorporate new populations, it has the potential to facilitate inclusion of unrepresented populations in research, as well as being less resource intensive.
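
To make the benchmark comparison concrete, the following is a minimal Python sketch of a generic cut points classification and of an R² comparison between time spent active and an ability score. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the threshold values, and the use of the squared Pearson correlation as R² are all hypothetical choices made here, and the hidden semi-Markov model itself is not implemented.

```python
import numpy as np


def classify_epochs(counts, light_threshold=100.0, mvpa_threshold=1000.0):
    """Label each epoch by fixed cut points.

    Returns 0 (sedentary), 1 (light) or 2 (moderate-to-vigorous).
    The default thresholds are placeholders, NOT the values used in the study.
    """
    counts = np.asarray(counts, dtype=float)
    # np.digitize returns the index of the bin each value falls into:
    # value < light -> 0, light <= value < mvpa -> 1, value >= mvpa -> 2
    return np.digitize(counts, [light_threshold, mvpa_threshold])


def fraction_active(counts, light_threshold=100.0, mvpa_threshold=1000.0):
    """Fraction of epochs classified as light activity or above."""
    labels = classify_epochs(counts, light_threshold, mvpa_threshold)
    return float(np.mean(labels >= 1))


def r_squared(x, y):
    """Squared Pearson correlation between x and y.

    Assumption: this equals R² of a simple linear regression of y on x,
    which may differ from the exact statistic reported in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)
```

Under these assumptions, one would compute `fraction_active` per child from the cut points labels, compute the corresponding quantity from the states inferred by the unsupervised model, and compare the two `r_squared` values against each PEDI-CAT domain score.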