Abstract:

BACKGROUND Depression is a significant global public health issue that affects the physical and mental well-being of hundreds of millions of people worldwide. However, many social media users with depression go undiagnosed and struggle to access timely and effective treatment, which is increasingly a major societal health concern.

OBJECTIVE This paper aims to develop an online depression risk detection method based on deep learning to identify individuals at risk of depression on the Chinese social media platform Sina Weibo.

METHODS We first collected 527,333 posts publicly shared over one year by 1600 individuals with depression and 1600 individuals without depression on the Sina Weibo platform. We then developed a hierarchical Transformer network to learn semantic features for each user. This network comprises two levels of Transformer structures, one at the word level and the other at the sentence level. These Transformers extract the textual semantic features of each post, and the features of all of a user's posts are aggregated into user-level semantic features. A classifier is then applied to predict the risk of depression. Finally, we conducted statistical and linguistic analyses of the content of posts from individuals with and without depression using the Chinese LIWC.

RESULTS We divided the original dataset into training, validation, and test sets. The training set consists of 1000 individuals with depression and 1000 individuals without depression. The validation and test sets each include 600 users, with 300 individuals with depression and 300 without depression. Without sampling techniques, our method achieved an accuracy of 84.62%, precision of 84.43%, recall of 84.50%, and F1 score of 84.32% on the test set. After applying our proposed retrieval-based sampling strategy, it achieved an accuracy of 95.46%, precision of 95.30%, recall of 95.70%, and F1 score of 95.43%. These results demonstrate the effectiveness of our proposed depression risk detection model and the benefit of the retrieval-based sampling technique, and they provide new insights for large-scale depression detection through social media. Language behavior analysis shows that individuals with depression are more likely to use negation words (the value of "swear" is 0.001253). This may indicate negative emotions, rejection, doubt, disagreement, or aversion expressed by individuals with depression. We also found that individuals with depression tend to use negative emotional vocabulary (NegEmo: 0.022306, Anx: 0.003829, Anger: 0.004327, Sad: 0.005740), which may reflect their internal negative emotions and psychological state. This frequent use of negative vocabulary could be a way for individuals with depression to express negative feelings towards life, themselves, or their surrounding environment.

CONCLUSIONS The results indicate that deep learning methods are feasible and effective for detecting the risk of depression. This points to the potential for large-scale, automated, and non-invasive prediction of depression among users of online social media.
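
The four evaluation metrics reported above can be illustrated with a minimal sketch. The abstract does not state how precision, recall, and F1 were averaged; macro-averaging over the two classes is assumed here, and the function name `macro_metrics` and the toy labels are hypothetical, not taken from the study.

```python
from typing import Dict, Sequence


def macro_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 for binary labels.

    Labels are assumed to be 1 (depression) and 0 (no depression).
    Macro-averaging is an assumption; the abstract does not specify it.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []

    for cls in (0, 1):
        # Per-class counts, treating `cls` as the positive class.
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / 2,  # mean of per-class precision
        "recall": sum(recalls) / 2,        # mean of per-class recall
        "f1": sum(f1s) / 2,                # mean of per-class F1
    }


# Toy labels for illustration only (not data from the study).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 1, 1]
toy_scores = macro_metrics(toy_true, toy_pred)
print({name: round(value, 4) for name, value in toy_scores.items()})
```

On these toy labels the macro F1 (about 0.619) is lower than both the macro precision (about 0.633) and the macro recall (0.625). A single-class F1 is a harmonic mean and always lies between its precision and recall, so this can only happen under averaging across classes, which is one reason the averaging scheme is worth stating alongside such results.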

Machine learning prediction models for depression symptoms among Chinese health care workers during the COVID-19 outbreak (Preprint)

Machine Learning-Based Prediction Models for Depression Symptoms Among Chinese Healthcare Workers During the Early COVID-19 Outbreak in 2020: A Cross-Sectional Study

A Prediction Model Based on Machine Learning for Diagnosing the Early COVID-19 Patients

Prediction of Mental Health in Medical Workers During COVID-19 Based on Machine Learning

Construction and validation of machine learning algorithm for predicting depression among home-quarantined individuals during the large-scale COVID-19 outbreak: based on Adaboost model

A machine learning analysis of COVID-19 mental health data

Identifying the Predictors of Severe Psychological Distress by Auto-Machine Learning Methods

Risk factors for depression in China based on machine learning algorithms: A cross-sectional survey of 264,557 non-manual workers

Machine learning-enabled mental health risk prediction for youths with stressful life events: A modelling study

Exploring Social Media for Early Detection of Depression in COVID-19 Patients

Machine Learning for Depression Risk Monitoring on Chinese Social Media: A Comprehensive Evaluation and Analysis (Preprint)

Construction of a machine learning-based risk prediction model for depression in middle-aged and elderly hypertensive people in China: a longitudinal study

Machine learning models predict the emergence of depression in Argentinean college students during periods of COVID-19 quarantine

A risk model to predict the mental health of older people in Chinese communities based on machine learning

Predicting Depression and Anxiety of Chinese Population During COVID-19 in Psychological Evaluation Data by XGBoost

Machine Learning–Based Predictive Modeling of Anxiety and Depressive Symptoms During 8 Months of the COVID-19 Global Pandemic: Repeated Cross-sectional Survey Study

Prediction of Online Psychological Help-Seeking Behavior During the COVID-19 Pandemic: An Interpretable Machine Learning Method

Prediction Modeling of Mental Well-Being Using Health Behavior Data of College Students

Machine learning based model for detecting depression during Covid-19 crisis

Interpretable Machine Learning for COVID-19: An Empirical Study on Severity Prediction Task