Enhancing depression detection: A multimodal approach with text extension and content fusion

Jinyan Chen, Shuxian Liu, Meijia Xu, Peicheng Wang
DOI: https://doi.org/10.1111/exsy.13616
IF: 3.3
2024-06-05
Expert Systems
Abstract: Background: With social media platforms now ubiquitous, people use them to express their thoughts and emotions, which makes social media data valuable for studying and detecting symptoms of depression. Objective: First, we detect depression by leveraging textual, visual, and auxiliary features from the Weibo social media platform. Second, we aim to understand the reasons behind the model's results, which matters particularly in medicine, where trust is crucial. Methods: To address challenges such as varying text lengths and abundant social media data, we employ a text extension technique to standardize text length, improving model robustness and the accuracy of semantic feature learning. We use tree long short-term memory and bidirectional gated recurrent unit models to capture long-term and short-term dependencies in text data, respectively. To extract emotional features from images, we combine optical character recognition (OCR) with an emotion lexicon, which addresses the limited accuracy of OCR on complex or blurred text. In addition, we introduce auxiliary features based on social behaviour. The output features of these modalities are fed into an attention fusion network to produce effective depression indicators. Results: Extensive experiments validate our methodology, showing a precision of 0.987 and a recall of 0.97 in depression detection tasks. Conclusions: By leveraging text, images, and auxiliary features from Weibo, we develop text picture sentiment auxiliary (TPSA), a novel depression detection model. We find that the emotional features extracted from images and text play a pivotal role in depression detection, providing valuable insights for the detection and assessment of this psychological disorder.
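The abstract does not specify how the attention fusion network is built, so the following is only a minimal sketch of the general idea: each modality (text, image, auxiliary) contributes a feature vector, a scaled dot-product score against a query vector is turned into softmax weights, and the fused vector is the weighted sum. The function names `softmax` and `attention_fuse`, the query vector, and the toy feature values are all hypothetical and are not taken from the paper; in a trained model the query would be a learned parameter.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality feature vectors with scaled dot-product attention.

    features: dict mapping a modality name to a vector (all the same length).
    query:    vector of that same length (hypothetical stand-in for a learned one).
    Returns (fused_vector, weights_by_modality).
    """
    names = list(features)
    dim = len(query)
    # One relevance score per modality: dot(feature, query) / sqrt(dim).
    scores = [
        sum(f * q for f, q in zip(features[name], query)) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy example with made-up 2-dimensional features (assumption, not paper data).
toy_features = {
    "text": [1.0, 0.0],
    "image": [0.0, 1.0],
    "auxiliary": [0.0, 0.0],
}
fused, weights = attention_fuse(toy_features, query=[1.0, 0.0])
print(weights)  # per-modality attention weights, summing to 1
```

Because the weights are explicit, a sketch like this also hints at how attention can support the stated interpretability objective: inspecting the weights shows which modality most influenced a given prediction.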
computer science, artificial intelligence, theory & methods