End-to-End Continuous Emotion Recognition from Video Using 3D ConvLSTM Networks

Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Jiangyan Yi
DOI: https://doi.org/10.1109/icassp.2018.8461963
2018-01-01
Abstract: Conventional continuous emotion recognition consists of a feature extraction step followed by a regression step. However, because the two steps are carried out separately, their objectives are not consistent. Moreover, there is still no consensus on which emotional features are appropriate. In this study, we propose an end-to-end continuous emotion recognition framework that merges feature extraction and regression into a unified system. We employ 3D convolutional networks with convolutional Long Short-Term Memory neural networks (ConvLSTM) to handle spatiotemporal information for continuous emotion recognition. The model is evaluated on the AVEC 2017 database. The experimental results show that the ConvLSTM model improves performance, outperforming the baseline for arousal (0.583 vs. 0.525) and for valence (0.h54 vs. 0.507).
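To make the ConvLSTM idea mentioned in the abstract more concrete, the following is a minimal pure-Python sketch of a single-channel ConvLSTM cell that reads a sequence of frames and emits one continuous value per frame. It is not the authors' implementation. It assumes a common ConvLSTM formulation without peephole connections, and it omits the 3D convolutional front end, multiple channels, and training entirely. All names (`conv2d_same`, `convlstm_step`, `predict_sequence`, `constant_kernel`), the mean-pooling readout, and the toy weights and frames are hypothetical choices made for illustration.

```python
import math


def conv2d_same(image, kernel):
    """Cross-correlate a 2D map with an odd-sized kernel, zero-padded to keep its shape."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    y, x = r + a - pad_r, c + b - pad_c
                    if 0 <= y < rows and 0 <= x < cols:
                        total += image[y][x] * kernel[a][b]
            out[r][c] = total
    return out


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def convlstm_step(x, h_prev, c_prev, params):
    """One ConvLSTM update. params maps each gate name ('i', 'f', 'o', 'g')
    to a tuple (input kernel, hidden-state kernel, scalar bias)."""
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate, (w_x, w_h, bias) in params.items():
        # Unlike a plain LSTM, the gate pre-activations come from convolutions,
        # so the hidden and cell states keep the spatial layout of the frame.
        from_x = conv2d_same(x, w_x)
        from_h = conv2d_same(h_prev, w_h)
        pre[gate] = [[from_x[r][c] + from_h[r][c] + bias for c in range(cols)]
                     for r in range(rows)]
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i_gate = sigmoid(pre["i"][r][c])
            f_gate = sigmoid(pre["f"][r][c])
            o_gate = sigmoid(pre["o"][r][c])
            candidate = math.tanh(pre["g"][r][c])
            c_new[r][c] = f_gate * c_prev[r][c] + i_gate * candidate
            h_new[r][c] = o_gate * math.tanh(c_new[r][c])
    return h_new, c_new


def predict_sequence(frames, params, readout_weight=1.0, readout_bias=0.0):
    """Run the cell over all frames and emit one value in [-1, 1] per frame.
    The readout (mean-pool the hidden map, then tanh) is an assumption."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    outputs = []
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
        pooled = sum(sum(row) for row in h) / (rows * cols)
        outputs.append(math.tanh(readout_weight * pooled + readout_bias))
    return outputs


def constant_kernel(value, size=3):
    return [[value] * size for _ in range(size)]


# Toy data only: five 4x4 "frames" and untrained constant weights.
params = {gate: (constant_kernel(0.1), constant_kernel(0.1), 0.0) for gate in "ifog"}
frames = [[[0.1 * t] * 4 for _ in range(4)] for t in range(5)]
predictions = predict_sequence(frames, params)
```

In an end-to-end system of the kind the abstract describes, the kernels and the readout would be learned jointly from the regression loss rather than fixed by hand, which is what removes the separate feature extraction step.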