A Multi-Frame Rate Network with Attention Mechanism for Depression Severity Estimation.

Ruibin Wang, Jing Guo, Jiashun Wang, Lang He, Yun Yang
DOI: https://doi.org/10.1109/BIBM58861.2023.10385423
2023-01-01
Abstract: The diagnosis of depression mainly depends on clinicians’ experience and questionnaire results, making it a time-consuming and subjective process that demands significant human resources. Numerous automatic depression estimation (ADE) systems based on facial cues have been introduced to estimate severity and assist clinicians in diagnosis. However, traditional methods adopt a single sampling frame rate, which forces a trade-off between losing critical visual information and computational redundancy. In this paper, we propose a Multi-Frame Rate Attention Convolutional Neural Network (MFRA) to effectively mine patients' facial cues for estimating the severity of depression. Specifically, we adopt a two-branch network structure, in which one branch uses a high frame sampling rate to capture more nuances of facial changes, and the other uses a low frame sampling rate to focus more on the spatial information of the video. Furthermore, because manually selected local facial features can introduce noise, we introduce attention modules so that MFRA concentrates on the facial regions related to depression. Finally, the feature vectors extracted from the two branches are aggregated to output the severity of depression. Experimental results on two datasets, AVEC 2013 and AVEC 2014, show that this method can effectively capture spatiotemporal features and that its predictions are superior to those of most video-based depression prediction methods.
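The two-branch sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the stride values, and the use of concatenation to aggregate the two branches are assumptions made for illustration, since the abstract gives neither the actual sampling rates nor the aggregation operator.

```python
def sample_frame_indices(num_frames, stride):
    """Return the indices kept when taking every `stride`-th frame."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(range(0, num_frames, stride))


def two_rate_sampling(num_frames, high_rate_stride=2, low_rate_stride=8):
    """Build the frame index lists for a dense and a sparse branch.

    The default strides are hypothetical; the abstract does not give them.
    """
    # Small stride -> high sampling rate -> more temporal detail.
    dense = sample_frame_indices(num_frames, high_rate_stride)
    # Large stride -> low sampling rate -> fewer frames, less redundancy.
    sparse = sample_frame_indices(num_frames, low_rate_stride)
    return dense, sparse


def aggregate(dense_features, sparse_features):
    """Combine the two branch feature vectors.

    Concatenation is an assumed choice, not confirmed by the abstract.
    """
    return list(dense_features) + list(sparse_features)


dense, sparse = two_rate_sampling(16)
fused = aggregate([0.1, 0.2], [0.3])
```

In this sketch the dense branch sees four times as many frames as the sparse one, which mirrors the stated trade-off between preserving subtle facial changes and avoiding redundant computation.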