Selective Element and Two Orders Vectorization Networks for Automatic Depression Severity Diagnosis Via Facial Changes

Mingyue Niu, Ziping Zhao, Jianhua Tao, Ya Li, Bjoern W. Schuller
DOI: https://doi.org/10.1109/tcsvt.2022.3182658
IF: 5.859
2022-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Physiological studies have shown that healthy and depressed individuals present different facial changes. Thus, many researchers have attempted to use Convolutional Neural Networks (CNNs) to extract high-level facial dynamic representations for predicting depression severity. However, the max-pooling (or average-pooling) layers in a CNN lead to the loss of subtle depression cues. Without pooling layers, on the other hand, a CNN cannot extract multi-scale information and has difficulty vectorizing tensors. To address this, we propose a Selective Element and Two Orders Vectorization (SE-TOV) network. In the SE-TOV network, an SE block is constructed to adaptively select the effective elements from the tensors obtained by receptive fields of different sizes. Moreover, we propose a TOV block for vectorizing a high-dimensional tensor. On the one hand, the TOV block feeds the tensor into a Global Average Pooling layer to obtain the first-order vectorization result. On the other hand, it takes the principal components of the correlation matrix of the channels in the tensor as the second-order vectorization result. Experimental results on the AVEC 2013 (RMSE $= 7.42$, MAE $= 6.09$) and AVEC 2014 (RMSE $= 7.39$, MAE $= 5.87$) depression databases show that our approach outperforms previous works.
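The two vectorization orders described for the TOV block can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(channels, height, width)` layout, the choice of `k`, and the reading of "principal components" as the top-`k` eigenvectors of the channel correlation matrix are all assumptions made for illustration only.

```python
import numpy as np

def first_order(t):
    # Global Average Pooling: one mean per channel, shape (C,).
    return t.mean(axis=(1, 2))

def second_order(t, k=2, eps=1e-8):
    # Assumption: "principal components" = top-k eigenvectors of the
    # channel-by-channel correlation matrix, flattened to a vector.
    c = t.shape[0]
    x = t.reshape(c, -1)                          # (C, H*W)
    x = x - x.mean(axis=1, keepdims=True)         # centre each channel
    x = x / (x.std(axis=1, keepdims=True) + eps)  # scale to unit variance
    corr = (x @ x.T) / x.shape[1]                 # (C, C) correlation matrix
    _, eigvecs = np.linalg.eigh(corr)             # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]                 # k leading eigenvectors
    # Resolve the sign ambiguity of eigenvectors: make the
    # largest-magnitude entry of each vector positive.
    idx = np.abs(top).argmax(axis=0)
    signs = np.sign(top[idx, np.arange(k)])
    signs[signs == 0] = 1.0
    return (top * signs).T.reshape(-1)            # shape (k*C,)

def tov(t, k=2):
    # Hypothetical combination: concatenate both orders into one vector.
    return np.concatenate([first_order(t), second_order(t, k)])

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 5, 5))  # toy feature tensor, C=8
vec = tov(feat, k=2)
print(vec.shape)  # (24,) = 8 first-order + 2*8 second-order entries
```

In this sketch the output length depends only on the channel count and `k`, not on the spatial size, so a tensor can be vectorized without spatial pooling layers earlier in the network. How the paper actually combines or weights the two orders is not stated in the abstract.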