Challenges in Using mHealth Data From Smartphones and Wearable Devices to Predict Depression Symptom Severity: Retrospective Analysis

Shaoxiong Sun, Amos A. Folarin, Yuezhou Zhang, Nicholas Cummins, Rafael Garcia-Dias, Callum Stewart, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Petroula Laiou, Heet Sankesara, Faith Matcham, Daniel Leightley, Katie M. White, Carolin Oetzmann, Alina Ivan, Femke Lamers, Sara Siddi, Sara Simblett, Raluca Nica, Aki Rintala, David C. Mohr, Inez Myin-Germeys, Til Wykes, Josep Maria Haro, Brenda W. J. H. Penninx, Srinivasan Vairavan, Vaibhav A. Narayan, Peter Annas, Matthew Hotopf, Richard J. B. Dobson
DOI: https://doi.org/10.2196/45233
2023-08-15
Abstract: A number of challenges exist for the analysis of mHealth data: maintaining participant engagement over extended time periods and therefore understanding what constitutes an acceptable threshold of missing data; distinguishing between the cross-sectional and longitudinal relationships for different features to determine their utility in tracking within-individual longitudinal variation or screening individuals at high risk; and understanding the heterogeneity with which depression manifests itself in behavioral patterns quantified by the passive features. From 479 participants with major depressive disorder (MDD), we extracted 21 features capturing mobility, sleep, and smartphone use. We investigated the impact of the number of days of available data on feature quality using the intraclass correlation coefficient and Bland-Altman analysis. We then examined the nature of the correlation between the 8-item Patient Health Questionnaire (PHQ-8) depression scale (measured every 14 days) and the features using the individual-mean correlation, repeated measures correlation, and a linear mixed effects model. Furthermore, we stratified the participants based on their behavioral differences, quantified by the features, between periods of high (depression) and low (no depression) PHQ-8 scores using a Gaussian mixture model. We demonstrated that at least 8 (range 2-12) days were needed for reliable calculation of most of the features in the 14-day time window. We observed that features such as sleep onset time correlated better with PHQ-8 scores cross-sectionally than longitudinally, whereas features such as wakefulness after sleep onset correlated well with PHQ-8 scores longitudinally but worse cross-sectionally. Finally, we found that participants could be separated into 3 distinct clusters according to their behavioral differences between periods of depression and periods of no depression.
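The distinction the abstract draws between cross-sectional and longitudinal relationships can be illustrated with a minimal sketch. It is not the authors' code: the function name `between_within_corr` and the synthetic data are assumptions made for illustration. The between-subject value is the Pearson correlation of per-participant means (an individual-mean correlation), and the within-subject value is the Pearson correlation of participant-centered values, which is closely related to the repeated measures correlation named in the abstract.

```python
import numpy as np


def between_within_corr(subject_ids, x, y):
    """Return (r_between, r_within) for repeated measurements.

    r_between: Pearson correlation of per-subject means (cross-sectional).
    r_within:  Pearson correlation after subtracting each subject's own
               mean from x and y (longitudinal, within-individual).
    """
    subject_ids = np.asarray(subject_ids)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Map each observation to a subject index 0..n_subjects-1.
    _, inverse = np.unique(subject_ids, return_inverse=True)
    counts = np.bincount(inverse)

    # Per-subject means of both variables.
    mean_x = np.bincount(inverse, weights=x) / counts
    mean_y = np.bincount(inverse, weights=y) / counts

    r_between = np.corrcoef(mean_x, mean_y)[0, 1]
    r_within = np.corrcoef(x - mean_x[inverse], y - mean_y[inverse])[0, 1]
    return r_between, r_within


# Synthetic illustration (not study data): 5 participants, 4 windows each.
# Participants with a higher average feature value also have a higher
# average score, but within each participant the two move in opposite
# directions.
ids = np.repeat(np.arange(5), 4)
offsets = np.tile([-1.5, -0.5, 0.5, 1.5], 5)
level = 10.0 * ids
feature = level + offsets   # hypothetical passive feature
score = level - offsets     # hypothetical PHQ-8-like score

r_b, r_w = between_within_corr(ids, feature, score)
print(f"between-subject r = {r_b:.2f}, within-subject r = {r_w:.2f}")
```

In this constructed example the two correlations have opposite signs, which shows why a feature that helps compare individuals may not track change within one individual, and vice versa.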
Quantitative Methods