Sentiment Informed Sentence BERT-Ensemble Algorithm for Depression Detection

Bayode Ogunleye, Hemlata Sharma, Olamilekan Shobayo
DOI: https://doi.org/10.3390/bdcc8090112
2024-09-07
Abstract: The World Health Organisation (WHO) revealed approximately 280 million people in the world suffer from depression. Yet, existing studies on early-stage depression detection using machine learning (ML) techniques are limited. Prior studies have applied a single stand-alone algorithm, which is unable to deal with data complexities, prone to overfitting, and limited in generalization. To this end, our paper examined the performance of several ML algorithms for early-stage depression detection using two benchmark social media datasets (D1 and D2). More specifically, we incorporated sentiment indicators to improve our model performance. Our experimental results showed that sentence bidirectional encoder representations from transformers (SBERT) numerical vectors fitted into the stacking ensemble model achieved comparable F1 scores of 69% in the dataset (D1) and 76% in the dataset (D2). Our findings suggest that utilizing sentiment indicators as an additional feature for depression detection yields an improved model performance, and thus, we recommend the development of a depressive term corpus for future work.
Computation and Language, Machine Learning, Statistics Theory, Applications
What problem does this paper attempt to address?
This paper addresses several key challenges in early-stage depression detection:

1. **Limitations of Existing Methods**: Most existing depression detection research uses a single stand-alone algorithm. Such algorithms cope poorly with data complexity, are prone to overfitting, and generalize poorly, which has led to unsatisfactory results in practical applications.
2. **Lack of Utilization of Emotional Features**: Previous studies have rarely used sentiment analysis as an additional feature to improve depression detection models, even though emotional information is important for understanding an individual's mental state.
3. **Improving Model Performance and Generalization Ability**: To overcome these problems, the paper proposes a stacking ensemble model that combines sentiment analysis with Sentence BERT embeddings, aiming to improve the accuracy and generalization of depression detection by combining multiple algorithms.

### Research Objectives

The main contributions of this paper include:

1. **Experimental Comparison**: Compare a variety of state-of-the-art machine learning algorithms experimentally to evaluate their performance on the depression detection task.
2. **Proposing a New Model**: Demonstrate how the Sentence BERT-Ensemble model can achieve better depression detection results.
3. **Introducing Emotional Features**: Demonstrate the effectiveness of sentiment indicators as additional features in depression detection.

### Conclusions and Recommendations

The authors show experimentally that the stacking ensemble model combining sentiment features with Sentence BERT embeddings achieves F1 scores of 69% on dataset D1 and 76% on dataset D2, two benchmark social media datasets. They therefore recommend developing a depressive term corpus in future work to improve model performance and strengthen early detection of depression. Through this approach, the researchers hope to identify people with depression at an early stage, enabling timely support for prevention, diagnosis, or treatment and improving patients' quality of life.
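
The general idea of feeding sentence embeddings plus a sentiment feature into a stacking ensemble can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the base learners, the meta-learner, the helper names `combine_features` and `build_stacking_model`, and the synthetic data are all hypothetical. In practice the embedding matrix would come from a Sentence BERT encoder (for example, `SentenceTransformer.encode` in the `sentence-transformers` package) and the sentiment column from a sentiment analyzer; random stand-ins are used here so the example runs without downloading a model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def combine_features(embeddings, sentiment):
    """Append one sentiment score per post as an extra feature column."""
    embeddings = np.asarray(embeddings, dtype=float)
    sentiment = np.asarray(sentiment, dtype=float).reshape(-1, 1)
    return np.hstack([embeddings, sentiment])


def build_stacking_model(seed=0):
    """Build a stacking ensemble (learner choices are illustrative assumptions)."""
    base_learners = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("svm", SVC(probability=True, random_state=seed)),
    ]
    # The meta-learner is trained on cross-validated base-learner predictions.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic stand-ins for SBERT vectors and sentiment scores (assumption:
# real inputs would be produced by an encoder and a sentiment analyzer).
rng = np.random.default_rng(0)
n_posts, dim = 200, 16
labels = rng.integers(0, 2, size=n_posts)
embeddings = rng.normal(size=(n_posts, dim)) + 0.5 * labels[:, None]
sentiment = rng.normal(size=n_posts) - 1.0 * labels

X = combine_features(embeddings, sentiment)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

model = build_stacking_model()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
score = f1_score(y_test, predictions)
print(f"F1 on synthetic data: {score:.2f}")
```

The score printed here reflects only the synthetic data and says nothing about the results reported in the paper. Concatenating the sentiment score as one extra column is the simplest way to add it as a feature; other fusion strategies are possible.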