End-to-end multimodal system for depression detection from online recordings

Mateusz Kowalewski, Maciej Stroinski, Kamil Kwarciak, Volodymyr Laptiev, Daria Hemmerling
DOI: https://doi.org/10.1109/EMBC40787.2023.10340782
Abstract: Depression is one of the most common diseases of civilization. In this paper, we propose a new approach for detecting depression through the analysis of social media content using face analysis, emotion recognition neural networks, and speech processing. We utilized audio-visual analysis and acquired more than 605 features in the time domain. These features are fed to machine learning and deep learning models for depression classification. Our approach outperforms other state-of-the-art models, achieving an F1-score of 0.77. The results have the potential to provide valuable insights for mental health professionals, support early detection and intervention, and serve as a resource for individuals seeking help with their mental health. This study enables real-time analysis, represents a significant advance in mental health and technology, and has the potential to impact society.
Clinical relevance: The system aims to provide a fast and accurate way to detect depression in individuals through online recordings. The use of multimodal information (e.g. audio, image) enhances the performance of the non-verbal behavioral analysis. The end-to-end system reduces the need for manual analysis by mental health professionals and increases the efficiency of depression screening. The system can potentially help identify individuals who are at risk for depression, enabling early intervention and treatment. The results from the system can complement traditional assessments and support mental health professionals in making a diagnosis. The system can run in real time, for example during online calls, and provide objective measurements summarizing the overall behavior based on computer vision and audio analysis.
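For readers unfamiliar with the metric reported in the abstract, the F1-score is the harmonic mean of precision and recall for a binary classifier. The sketch below shows the standard formula only; the function name `f1_score` and the confusion-matrix counts are hypothetical illustrations and are not taken from the paper's data or code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true positives, false positives, false negatives.

    F1 = 2 * P * R / (P + R), which simplifies to 2*TP / (2*TP + FP + FN).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No positive labels and no positive predictions: define F1 as 0.
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts for illustration only (NOT results from the paper).
example_score = f1_score(tp=30, fp=10, fn=10)
print(f"F1 = {example_score:.2f}")
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the positive class (here, depressed speakers) is the one of interest or is under-represented.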