Video-based Emotion Recognition Using Multi-dichotomy RNN-DNN

Taorui Ren, Huabin Ruan, Wenjing Han, Tao Yang, Dongmei Jiang
DOI: https://doi.org/10.1109/ACIIAsia.2018.8470349
2018-01-01
Abstract: This paper presents work on the video-based emotion recognition task introduced in the Multimodal Emotion Recognition Challenge 2017. Encouraged by the widely used convolutional neural network based feature extraction methods in computer vision tasks, we leverage a fine-tuned VGGFace-16 network to generate features for each face image. Then, we explore a multi-dichotomy Recurrent Neural Network-Deep Neural Network (RNN-DNN) based framework for emotion classification. This framework first aggregates the VGGFace-based face features from the same video into a global feature representation via its RNN layer, and then maps the global feature representation to an emotion category using its dichotomy DNN layers. Experimental results on the challenge database demonstrate the effectiveness of our proposed system compared to the baseline. Specifically, our best results reach macro average precisions of 52.3% and 42.7% on the validation and test data, respectively.
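The two-stage flow described in the abstract (per-frame features aggregated by an RNN into one video-level vector, then scored by dichotomy DNN heads) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "multi-dichotomy" means one binary (emotion-vs-rest) head per class with the highest-scoring head winning, uses a plain tanh RNN whose last hidden state serves as the global representation, and replaces VGGFace features with small random vectors. All function names, layer sizes, weights, and the three emotion labels are hypothetical placeholders.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rnn_aggregate(frames, Wx, Wh, b):
    """Run a plain tanh RNN over per-frame features.

    Assumption: the last hidden state is used as the video-level
    (global) feature representation.
    """
    h = [0.0] * len(b)
    for x in frames:
        a = matvec(Wx, x)
        r = matvec(Wh, h)
        h = [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, b)]
    return h


def dichotomy_score(h, W1, b1, w2, b2):
    """One binary head: a ReLU hidden layer followed by a sigmoid.

    Assumption: the output is read as P(this emotion vs. the rest).
    """
    hidden = [max(0.0, z + bi) for z, bi in zip(matvec(W1, h), b1)]
    logit = sum(w * v for w, v in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def classify(frames, rnn_params, heads):
    """Aggregate the frames, score every dichotomy head, pick the best."""
    h = rnn_aggregate(frames, *rnn_params)
    scores = {label: dichotomy_score(h, *p) for label, p in heads.items()}
    return max(scores, key=scores.get), scores


def rand_matrix(rows, cols, rng):
    """Random placeholder weights (a real system would train these)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes and labels, chosen only to keep the demo small.
rng = random.Random(0)
FEAT, HID, DNN = 8, 4, 3
labels = ["happy", "sad", "angry"]

rnn_params = (rand_matrix(HID, FEAT, rng), rand_matrix(HID, HID, rng), [0.0] * HID)
heads = {
    lab: (
        rand_matrix(DNN, HID, rng),
        [0.0] * DNN,
        [rng.uniform(-0.5, 0.5) for _ in range(DNN)],
        0.0,
    )
    for lab in labels
}

# Five stand-in "face feature" vectors for one video.
frames = [[rng.uniform(-1.0, 1.0) for _ in range(FEAT)] for _ in range(5)]

label, scores = classify(frames, rnn_params, heads)
print(label, scores)
```

With untrained random weights the predicted label carries no meaning. The sketch only shows how a variable number of frame vectors collapses into one fixed-size vector before the per-emotion binary heads are compared.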