Combining feature-level and decision-level fusion in a hierarchical classifier for emotion recognition in the wild

Bo Sun, Liandong Li, Xuewen Wu, Tian Zuo, Ying Chen, Guoyan Zhou, Jun He, Xiaoming Zhu
DOI: https://doi.org/10.1007/s12193-015-0203-6
2015-01-01
Journal on Multimodal User Interfaces
Abstract: Emotion recognition in the wild is a very challenging task. In this paper, we investigate a variety of multimodal features (acoustic and visual) extracted from video clips to evaluate their discriminative ability in human emotion analysis. For each clip, we extract MSDF BoW, LBP-TOP, PHOG, LPQ-TOP and Audio features. We train different classifiers for each type of feature on the AFEW dataset from the ICMI 2014 EmotiW Challenge, and we propose a novel hierarchical classification framework that combines feature-level and decision-level fusion strategies for all of the extracted multimodal features. The final recognition rate we achieve on the AFEW test set is 47.17%, which is considerably better than the best baseline recognition rate of 33.7%. Among all of the teams participating in the ICMI 2014 EmotiW Challenge, our method won the first runner-up award. Furthermore, we test our method on the FERA and CK datasets, where the experimental results also show good performance.
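The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' hierarchical framework, whose details the abstract does not give. It only assumes the common definitions: feature-level fusion concatenates per-modality feature vectors before classification, and decision-level fusion combines per-classifier class probabilities (here by a weighted average). The function names, weights, and toy values are all hypothetical.

```python
import numpy as np

def feature_level_fusion(feature_vectors):
    """Concatenate per-modality feature vectors into one vector (assumed definition)."""
    return np.concatenate([np.asarray(v, dtype=float).ravel() for v in feature_vectors])

def decision_level_fusion(prob_vectors, weights=None):
    """Weighted average of per-classifier class-probability vectors (assumed rule)."""
    probs = np.asarray(prob_vectors, dtype=float)   # shape: (n_classifiers, n_classes)
    if weights is None:
        weights = np.ones(len(probs))               # equal weights by default
    weights = np.asarray(weights, dtype=float)
    return weights @ probs / weights.sum()

# Toy inputs: made-up visual and audio descriptors for one clip.
visual = np.array([0.2, 0.7])
audio = np.array([0.5])
fused_features = feature_level_fusion([visual, audio])   # one 3-dimensional vector

# Toy class probabilities from two hypothetical classifiers over three emotions.
p_a = [0.6, 0.3, 0.1]
p_b = [0.1, 0.6, 0.3]
fused_probs = decision_level_fusion([p_a, p_b])
predicted = int(np.argmax(fused_probs))                   # index of the winning class
```

In a hierarchical scheme, the two steps could plausibly be chained: some features are concatenated and classified together, and the resulting scores are then merged with those of other classifiers at the decision level.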