A depression detection model based on multimodal graph neural network
Yujing Xia, Lin Liu, Tao Dong, Juan Chen, Yu Cheng, Lin Tang
DOI: https://doi.org/10.1007/s11042-023-18079-7
IF: 2.577
2024-01-13
Multimedia Tools and Applications
Abstract: Depression, especially major depression, is a prevalent mental illness that has a negative impact on individuals and society. In clinical practice, doctors diagnose depression primarily from self-reported scores, which can be highly subjective. Developing a framework for diagnosing and identifying depression is therefore of considerable importance. However, existing studies in this field face two challenges: small sample sizes, owing to the difficulty of obtaining patient data, and multimodal data fusion. To address these challenges, we propose a multimodal graph neural network (GNN)-based model for depression detection. In this model, we address the few-shot learning problem with a GNN, which recursively aggregates and transforms neighboring nodes to refine each node representation and is very effective for few-shot learning. For multimodal fusion in depression recognition, a pre-fusion strategy fuses features from three modalities (audio, text, and video) and feeds them into a Bi-LSTM fusion network, which learns high-level global features of the multimodal information to form a multimodal fusion representation. Finally, we embed the multimodal fusion module into a GNN to predict depression. This study not only addresses the multimodal fusion problem but also effectively improves the generalization performance of few-shot learning. The method achieved an accuracy of 0.861 on the publicly available depression dataset DAIC-WOZ, and the final prediction results far exceeded the baseline level. This shows that our model is highly applicable when dealing with small amounts of multimodal medical data.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
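
The two ideas named in the abstract, pre-fusion of modality features and neighbor aggregation in a GNN, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-dimensional features, the three-node graph, and the choice of plain mean aggregation are all assumptions made for illustration, and the Bi-LSTM fusion network and any learned weights are omitted.

```python
# Illustrative sketch only. All names, feature values, and the graph
# below are hypothetical; the paper's Bi-LSTM and learned transforms
# are deliberately left out.

def early_fuse(audio, text, video):
    """Pre-fusion: concatenate the per-sample feature vectors of each modality."""
    return list(audio) + list(text) + list(video)

def aggregate(features, adjacency):
    """One aggregation step: each node averages itself with its neighbors."""
    updated = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in adjacency[i]]
        dim = len(own)
        updated.append(
            [sum(vec[d] for vec in group) / len(group) for d in range(dim)]
        )
    return updated

# Three hypothetical samples, each with one audio, text, and video feature.
samples = [
    early_fuse([1.0], [0.0], [2.0]),
    early_fuse([3.0], [2.0], [0.0]),
    early_fuse([5.0], [4.0], [4.0]),
]

# Hypothetical chain graph: sample 1 is linked to samples 0 and 2.
adjacency = {0: [1], 1: [0, 2], 2: [1]}

refined = aggregate(samples, adjacency)
print(refined)
```

In a trained model the averaging step would be followed by a learned transformation and repeated over several layers, so that information from labeled samples can propagate to unlabeled ones; this propagation is what makes graph-based approaches attractive when few labeled samples are available.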