Early Screening of Adolescents at Risk for Depression: an Efficient Machine-Learning-Based Identification and Subgroup Discovery System

Peng Zhang, Bingdong Li, Xinyang Miao, Xin Meng, Hao Yuan, Wei Yan, Kaiping Peng
DOI: https://doi.org/10.2139/ssrn.4117348
2022-01-01
Abstract:
Background: Early screening of adolescent depression is a high public health priority worldwide. Few existing methods provide targeted estimates of depression risk for subgroups with different characteristics. The current study aimed to identify the characteristics of adolescents at high risk of depression and to define subgroups for which group-specific models can be built and optimized to detect adolescents with a current high risk of depression.
Methods: We developed the Interpretable Search Tree family of algorithms (ISTs), which apply subgroup analysis to predict the absence or presence of depression risk (Patient Health Questionnaire-2 [PHQ-2] score <3 or ≥3, respectively) in a large-scale population of Chinese adolescents (n = 414,188) from the 2021 Tsinghua Adolescence Health Survey (TAHS). We used IST-Prevalence (IST-P) to identify the subgroups with the highest risk of depression, and IST-Learnability (IST-L) to segment the whole adolescent population by meaningful sets of feature-value pairs so that subgroup-specific depression prediction models could be built.
Findings: We identified a large number of subgroups with elevated risk of depression and retrieved the best features for dividing the population into subgroups. We also found that, for most subgroups, it is possible to retain only the top 10% to 30% of predictors (N = 4 and N = 10, respectively) without sacrificing model performance. Models developed for the subgroups improved substantially over the baseline models developed for the whole population.
Interpretation: Our findings indicate that subgroups of adolescents with divergent individual, family and social backgrounds are at remarkably different levels of depression risk, and that developing subgroup-specific models with easily measurable features can reduce the cost of implementing screening while providing more reliable estimates of depression risk in adolescents.
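Illustration: The Methods above label each respondent by the PHQ-2 cutoff (score ≥3 indicates presence of depression risk) and look for subgroups, defined by feature-value pairs, with the highest prevalence of risk. The sketch below shows only that general idea on toy data. It is not the authors' IST-P algorithm, which the abstract does not specify; the function names, the `min_size` parameter, the feature names (`sleep`, `grade`) and the records are hypothetical.

```python
from collections import defaultdict

PHQ2_CUTOFF = 3  # from the abstract: PHQ-2 >= 3 marks presence of depression risk


def at_risk(phq2_score):
    """Return True when a PHQ-2 score meets the risk cutoff."""
    return phq2_score >= PHQ2_CUTOFF


def rank_subgroups_by_prevalence(records, features, min_size=2):
    """Rank single feature-value subgroups by the share of at-risk members.

    A naive one-level enumeration for illustration, not a search tree.
    """
    counts = defaultdict(lambda: [0, 0])  # (feature, value) -> [n_at_risk, n_total]
    for rec in records:
        risk = at_risk(rec["phq2"])
        for feature in features:
            key = (feature, rec[feature])
            counts[key][1] += 1
            if risk:
                counts[key][0] += 1
    ranked = [
        (key, n_risk / n_total, n_total)
        for key, (n_risk, n_total) in counts.items()
        if n_total >= min_size  # skip subgroups too small to estimate
    ]
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked


# Hypothetical toy records; real TAHS variables are not listed in the abstract.
records = [
    {"phq2": 4, "sleep": "short", "grade": "junior"},
    {"phq2": 3, "sleep": "short", "grade": "senior"},
    {"phq2": 1, "sleep": "normal", "grade": "junior"},
    {"phq2": 0, "sleep": "normal", "grade": "senior"},
    {"phq2": 5, "sleep": "short", "grade": "senior"},
    {"phq2": 2, "sleep": "normal", "grade": "junior"},
]

ranked = rank_subgroups_by_prevalence(records, ["sleep", "grade"])
for (feature, value), prevalence, size in ranked:
    print(f"{feature}={value}: prevalence={prevalence:.2f} (n={size})")
```

In this toy data the `sleep=short` subgroup ranks first because all three of its members score at or above the cutoff. A subgroup-specific model, as described in the Methods, would then be fitted separately within each such segment.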
Funding: The study was supported by Tsinghua University Spring Breeze Fund (Project ID: 2020Z99CFG013).
Declaration of Interest: We declare no competing interests.
Ethical Approval: TAHS was approved by the Ethics Committee from the Department of Psychology at Tsinghua University.