A Depression Detection Method Based on Multi-Modal Feature Fusion Using Cross-Attention

Shengjie Li, Yinhao Xiao
2024-07-02
Abstract: Depression, a prevalent and serious mental health issue, affects approximately 3.8% of the global population. Despite the existence of effective treatments, over 75% of individuals in low- and middle-income countries remain untreated, partly due to the challenge in accurately diagnosing depression in its early stages. This paper introduces a novel method for detecting depression based on multi-modal feature fusion utilizing cross-attention. By employing MacBERT as a pre-training model to extract lexical features from text and incorporating an additional Transformer module to refine task-specific contextual understanding, the model's adaptability to the targeted task is enhanced. Diverging from previous practices of simply concatenating multimodal features, this approach leverages cross-attention for feature integration, significantly improving the accuracy in depression detection and enabling a more comprehensive and precise analysis of user emotions and behaviors. Furthermore, a Multi-Modal Feature Fusion Network based on Cross-Attention (MFFNC) is constructed, demonstrating exceptional performance in the task of depression identification. The experimental results indicate that our method achieves an accuracy of 0.9495 on the test dataset, marking a substantial improvement over existing approaches. Moreover, it outlines a promising methodology for other social media platforms and tasks involving multi-modal processing. Timely identification and intervention for individuals with depression are crucial for saving lives, highlighting the immense potential of technology in facilitating early intervention for mental health issues.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the difficulty of diagnosing depression in its early stages. Specifically:

1. **Depression Detection Method**: The paper proposes a new depression detection method based on multimodal feature fusion, using a cross-attention mechanism to improve the accuracy of depression recognition.
2. **Improvement over Traditional Methods**: Unlike previous methods that simply concatenate multimodal features, this study fuses features with a cross-attention mechanism, which the authors report significantly improves detection accuracy and supports a more comprehensive analysis of users' emotions and behaviors.
3. **Model Construction and Experimental Validation**: A Multi-Modal Feature Fusion Network based on Cross-Attention (MFFNC) was constructed and evaluated experimentally. The method achieved an accuracy of 0.9495 on the test dataset, which the authors describe as a substantial improvement over existing approaches.

With this approach, the researchers hope to identify individuals with depression on social media platforms early enough for intervention, potentially saving lives, and to provide a technical means of early intervention for mental health issues.
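
The contrast between concatenation and cross-attention fusion can be illustrated with a minimal sketch. This is not the authors' MFFNC implementation: the abstract does not specify the second modality, the feature dimensions, or the projection weights, so `other_feats`, `d_model`, and the random matrices below are all hypothetical stand-ins. The sketch only shows the general scaled dot-product cross-attention pattern, in which features from one modality act as queries over features from another.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product cross-attention.

    Queries come from one modality, keys and values from the other,
    so each query position gets a weighted mix of the other modality.
    """
    q = matmul(query_feats, w_q)
    k = matmul(context_feats, w_k)
    v = matmul(context_feats, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
text_feats = random_matrix(4, 8, rng)   # assumed: 4 text tokens, dimension 8
other_feats = random_matrix(3, 6, rng)  # assumed: 3 items from a second modality, dimension 6
d_model = 5                             # assumed shared attention dimension

w_q = random_matrix(8, d_model, rng)
w_k = random_matrix(6, d_model, rng)
w_v = random_matrix(6, d_model, rng)

fused, weights = cross_attention(text_feats, other_feats, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # one fused vector of size d_model per text token
```

Unlike plain concatenation, which joins the two feature sets without modeling how they relate, each fused row here is a mix of the second modality's values weighted by its relevance to a given text position. In a trained model the projection matrices would be learned rather than random.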