Emotion Recognition Using Multimodal Features

Jinming Zhao, Shizhe Chen, Shuai Wang, Qin Jin
DOI: https://doi.org/10.1109/aciiasia.2018.8470385
2018-01-01
Abstract: In this paper, we present our solutions for the 2017 Multimodal Emotion Recognition Challenge (MEC 2017). The challenge task is to recognize the emotional state of short video segments extracted from Chinese films, TV plays and talk shows. There are eight target emotional states: angry, anxious, disgust, surprise, worried, sad, happy and neutral. We extract a range of features from multiple modalities, including audio, text, facial expression and visual context, and explore both traditional hand-crafted features and learned deep features for the different modalities. We also apply temporal models to capture the temporal information in facial expressions. On the test set, our proposed methods for the visual and multimodal sub-challenges outperform the official MEC 2017 baseline system by 19.06% and 8.93% in MAP, respectively, which shows the effectiveness of our solutions.