Two-Stage Fuzzy Fusion Based-Convolution Neural Network for Dynamic Emotion Recognition

Min Wu, Wanjuan Su, Luefeng Chen, Witold Pedrycz, Kaoru Hirota
DOI: https://doi.org/10.1109/TAFFC.2020.2966440
IF: 13.99
2022-01-01
IEEE Transactions on Affective Computing
Abstract: A two-stage fuzzy fusion based-convolution neural network is proposed for dynamic emotion recognition using both facial expression and speech modalities. It extracts discriminative emotion features that contain spatio-temporal information and fuses the two modalities effectively. It can also handle cases where the contributions of the individual modalities to emotion recognition are highly imbalanced. First, local binary patterns from three orthogonal planes and the spectrogram are used to extract low-level dynamic emotion features, so that the spatio-temporal information of both modalities is captured. To reveal more discriminative features, two deep convolution neural networks are then constructed to extract high-level emotion semantic features. Finally, a two-stage fuzzy fusion strategy is developed by integrating canonical correlation analysis and a fuzzy broad learning system, which accounts for both the correlation and the difference between the features of the different modalities and handles the ambiguity of emotional state information. Experimental results on the SAVEE, eNTERFACE'05, and AFEW benchmark databases show that the proposed method achieves higher accuracy than existing methods, such as the hybrid deep model and the rule-based and machine learning method.
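The low-level facial feature named in the abstract, local binary patterns from three orthogonal planes, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a grayscale clip stored as nested lists indexed `volume[t][y][x]`, a radius-1 ring of 8 neighbours with no interpolation, and no block partitioning of the face region. The names `RING`, `lbp_code`, and `lbp_top_histogram` are hypothetical and chosen only for this example.

```python
from typing import List

Volume = List[List[List[int]]]  # assumed layout: volume[t][y][x]

# Eight neighbours on a radius-1 ring, as (du, dv) offsets within one plane.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(center: int, neighbours: List[int]) -> int:
    """Set bit i when the i-th neighbour is at least as bright as the center."""
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code


def lbp_top_histogram(volume: Volume) -> List[float]:
    """Concatenate normalized 256-bin LBP histograms from the XY, XT and YT planes."""
    n_t, n_y, n_x = len(volume), len(volume[0]), len(volume[0][0])
    if min(n_t, n_y, n_x) < 3:
        raise ValueError("need at least 3 samples along every axis")

    hists = [[0] * 256 for _ in range(3)]  # one histogram per plane: XY, XT, YT
    for t in range(1, n_t - 1):
        for y in range(1, n_y - 1):
            for x in range(1, n_x - 1):
                center = volume[t][y][x]
                # XY plane: spatial texture within a single frame.
                xy = [volume[t][y + du][x + dv] for du, dv in RING]
                # XT plane: horizontal change over time at a fixed row.
                xt = [volume[t + du][y][x + dv] for du, dv in RING]
                # YT plane: vertical change over time at a fixed column.
                yt = [volume[t + du][y + dv][x] for du, dv in RING]
                for plane, ring in enumerate((xy, xt, yt)):
                    hists[plane][lbp_code(center, ring)] += 1

    total = (n_t - 2) * (n_y - 2) * (n_x - 2)  # number of interior voxels
    return [count / total for hist in hists for count in hist]
```

The result is a 768-dimensional descriptor in which the first 256 bins describe appearance and the remaining 512 describe motion. In a pipeline like the one the abstract outlines, a descriptor of this kind would be passed to a deep network for high-level feature extraction; that step is not shown here.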