Learning Content-Adaptive Feature Pooling for Facial Depression Recognition in Videos

Xiuzhuang Zhou, Peng Huang, Haoming Liu, Sihua Niu
DOI: https://doi.org/10.1049/el.2019.0443
2019-01-01
Electronics Letters
Abstract: Recently, a deep representation of facial depression built on convolutional neural networks has shown impressive performance in video-based depression recognition. However, most existing approaches either fix the weights or use certain heuristics to integrate the frame-level facial features, resulting in suboptimal feature aggregation that fails to encode the helpful information while discarding the noisy information in videos. To address this issue, the authors introduce a memory attention mechanism into a regression network to learn a deep discriminative depression representation, where the residual network module learns frame-level deep features, while the attention module acts as a pooling layer by adaptively learning weights that emphasise or suppress face images with varying poses and imaging conditions. They empirically evaluate the proposed approach on a benchmark depression dataset, and the results demonstrate the superiority of their approach over state-of-the-art alternatives.
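The idea of an attention module acting as a pooling layer can be illustrated with a minimal sketch. This is a generic softmax-weighted pooling over frame features, not the authors' architecture: the abstract does not specify the network details, so the function name `attention_pool`, the `query` vector (standing in for a learnable kernel, fixed here for illustration), and the feature dimensions are all assumptions.

```python
import numpy as np

def attention_pool(frame_features, query):
    """Aggregate frame-level features into a single video-level vector.

    frame_features: (num_frames, dim) array, e.g. per-frame CNN outputs.
    query: (dim,) array; a hypothetical kernel that would be learned in
           training, passed in as a fixed vector for this illustration.
    """
    scores = frame_features @ query        # one relevance score per frame
    scores = scores - scores.max()         # shift for a numerically stable softmax
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()  # non-negative weights summing to 1
    pooled = weights @ frame_features      # weighted average over frames
    return pooled, weights

# Illustrative random data: 5 frames with 8-dimensional features (assumed sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))
query = rng.normal(size=8)
pooled, weights = attention_pool(features, query)
```

With an all-zero `query`, every frame receives the same weight and the result reduces to plain average pooling, which corresponds to the fixed-weight aggregation the abstract contrasts against. A non-zero query lets some frames count more than others.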