An efficient model-level fusion approach for continuous affect recognition from audiovisual signals

Ercheng Pei, Dongmei Jiang, Hichem Sahli
DOI: https://doi.org/10.1016/j.neucom.2019.09.037
IF: 6
2020-01-01
Neurocomputing
Abstract: Continuous affect recognition has great potential in human-computer interaction applications. A key issue is how to efficiently fuse speech and facial information to infer a person's affective state from data captured in real-world conditions. Currently, late fusion is commonly used in multi-modal continuous affect recognition to improve system performance. However, late fusion ignores the complementarity and redundancy between the streams from different modalities. In this work, we propose an efficient model-level fusion approach for audiovisual continuous affect recognition.