Abstract: Depressive disorders are highly prevalent but demand nuanced personalized treatment that traditional approaches in psychiatry cannot address. This gap has prompted a surge of interest in leveraging digital technology, particularly smartphones, for remote monitoring to enhance outpatient care. This study utilizes the BRIGHTEN dataset to construct interpretable prediction models for overall depression severity, measured by PHQ-9, and various depression dimensions using a factor modelling approach. Our factor model unveils a three-factor solution encompassing mood, somatic, and concentration/psychomotor-related factors. Machine learning models effectively predict both the PHQ-9 scores and individual factors, with feature importance analyses underscoring the influence of the PHQ-2 scale and communication-related features. These findings are corroborated by models trained on data subsets. Through nested multi-level models, we identify between-subject effects for the PHQ-2 and select communication-related features, along with within-subject effects for these features. In summary, this study underscores the robust predictive capacity of ecological momentary assessments and highlights features of potential relevance for future investigations, such as communication-related features. We advocate for future studies to assess the cost-effectiveness and intervention potential of these models.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to predict different dimensions of depression from smartphone data in order to improve remote monitoring and outpatient care for patients with depression. Specifically, the research objectives include:

1. **Constructing Interpretable Predictive Models**: Using the BRIGHTEN dataset to build models that predict overall depression severity (measured by the PHQ-9 scale) and different dimensions of depression.
2. **Exploring Factor Structure**: Revealing specific domains of depressive symptoms (mood, somatic, and concentration/psychomotor-related symptoms) through factor analysis.
3. **Evaluating Feature Importance**: Analyzing the importance of features through various methods to guide feature selection in future research.
4. **Multilevel Model Analysis**: Identifying between-individual and within-individual effects through nested multilevel models, and assessing the utility of different features in modeling.

### Background and Motivation

Depression is a common mental illness affecting millions of people worldwide and is expected to become the leading source of global disease burden by 2030. Traditional psychiatric methods struggle to meet the need for personalized treatment, which has increased the demand for remote monitoring using digital technologies, especially smartphones. These technologies can provide real-time data collection and analysis, helping doctors better understand patients' conditions and intervene promptly.

### Research Methods

1. **Data Source**: The BRIGHTEN dataset, which includes two studies (BRIGHTEN-V1 and BRIGHTEN-V2), each lasting 12 weeks. Participants provided daily PHQ-2 surveys, passive data, and PHQ-9 surveys administered weekly for the first 4 weeks and bi-weekly afterwards.
2. **Factor Analysis**: Conducting factor analysis on baseline PHQ-9 assessments to determine the dimensions of depressive symptoms.
3. **Predictive Modeling**: Using various machine learning models (such as linear regression, autoregressive integrated moving average, support vector regression, random forest regression, and gradient boosting regression) to predict PHQ-9 total scores and individual factor scores.
4. **Feature Importance Analysis**: Calculating feature importance using SHAP values to understand which features have the greatest impact on model predictions.
5. **Subset Models**: Training predictive models on different data types (such as EMA data, communication data, and activity data) to evaluate their performance.
6. **Inference Modeling**: Using multilevel regression models to analyze the statistical relationship between participants' symptoms and various digital measurements, considering between-individual and within-individual effects.

### Expected Contributions

Through this research, the authors hope to:

1. **Improve Predictive Ability**: Demonstrate the effectiveness of ecological momentary assessment (EMA) and other smartphone data in predicting depression.
2. **Guide Future Research**: Identify the most valuable features for future digital phenotyping research, such as communication-related features.
3. **Promote Clinical Application**: Motivate future studies to evaluate the cost-effectiveness and intervention potential of these models, providing a basis for clinical application.

In summary, this paper combines factor analysis, machine learning, and multilevel modeling to predict different dimensions of depression from smartphone data, with the goal of providing new tools for personalized treatment and remote monitoring of depression.
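
The between-individual and within-individual effects mentioned under Inference Modeling are commonly separated by person-mean centering before a multilevel model is fitted. The sketch below illustrates that general idea only; it is not taken from the paper. The function name `split_between_within`, the participant IDs, and the PHQ-2 values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def split_between_within(records):
    """Split a repeated-measures feature into between- and within-person parts.

    records: iterable of (participant_id, value) pairs, one per observation.
    Returns (participant_id, between, within) tuples in input order, where
    between is the participant's own mean and within is value - between.
    """
    records = list(records)

    # Collect every observation that belongs to the same participant.
    by_person = defaultdict(list)
    for pid, value in records:
        by_person[pid].append(value)

    # Between-person component: one mean per participant.
    person_mean = {pid: mean(values) for pid, values in by_person.items()}

    # Within-person component: deviation of each observation from that mean.
    return [(pid, person_mean[pid], value - person_mean[pid])
            for pid, value in records]


# Hypothetical daily PHQ-2 scores for two made-up participants.
daily_phq2 = [("p1", 2), ("p1", 4), ("p1", 3), ("p2", 0), ("p2", 2)]
for pid, between, within in split_between_within(daily_phq2):
    print(pid, between, within)
```

In a multilevel regression, the two resulting columns would typically enter as separate predictors: the coefficient on the participant mean describes differences between people, and the coefficient on the deviation describes day-to-day change within a person.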

Predicting dimensions of depression from smartphone data
