Combining Multimodal Features Within A Fusion Network For Emotion Recognition In The Wild

Bo Sun, Liandong Li, Guoyan Zhou, Xuewen Wu, Jun He, Lejun Yu, Dongxue Li, Qinglan Wei
DOI: https://doi.org/10.1145/2818346.2830586
2015-01-01
Abstract: In this paper, we describe our work in the third Emotion Recognition in the Wild (EmotiW 2015) Challenge. For each video clip, we extract MSDF, LBP-TOP, HOG, LPQ-TOP and acoustic features to recognize the emotions of film characters. For static facial expression recognition on video frames, we extract MSDF, DCNN and RCNN features. We train linear SVM classifiers for each kind of feature on the AFEW and SFEW datasets, and we propose a novel fusion network to combine all the extracted features at the decision level. Our final recognition rates are 51.02% on the AFEW test set and 51.08% on the SFEW test set, well above the baseline recognition rates of 39.33% and 39.13%, respectively.
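To illustrate what combining features "at decision level" can mean, here is a minimal sketch of generic late fusion: each per-feature classifier produces class scores, and the scores are merged by a weighted average. This is not the fusion network proposed in the paper, whose details the abstract does not give. The function names `softmax` and `fuse_decisions`, the weights, and the score values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax: turns raw decision scores into pseudo-probabilities."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def fuse_decisions(score_list, weights=None):
    """Weighted average of per-feature class scores (hypothetical helper).

    score_list: one (n_samples, n_classes) array per feature type.
    weights:    one non-negative weight per feature type; uniform if omitted.
    Returns the fused (n_samples, n_classes) scores and the predicted labels.
    """
    probs = np.stack([softmax(np.asarray(s, dtype=float)) for s in score_list])
    if weights is None:
        weights = np.ones(len(score_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so fused rows sum to 1
    fused = np.tensordot(weights, probs, axes=1)  # sum over the classifier axis
    return fused, fused.argmax(axis=1)

# Made-up scores from two classifiers for 2 clips and 3 emotion classes.
scores_a = [[2.0, 0.1, -1.0], [0.0, 0.2, 0.1]]
scores_b = [[1.5, 0.0, -0.5], [-1.0, 0.0, 3.0]]
fused, labels = fuse_decisions([scores_a, scores_b], weights=[0.5, 0.5])
print(labels)
```

In this toy input the second clip is nearly a tie for the first classifier, so the confident second classifier decides the fused label. A learned fusion stage would replace the fixed weights with trained parameters.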