Recognizing human emotion from audiovisual information

Yongjin Wang, L. Guan
DOI: https://doi.org/10.1109/ICASSP.2005.1415607
2005-03-18
Abstract: In this paper, we present an emotion recognition system to classify human emotional state from audiovisual signals. We extract prosodic, mel-frequency cepstral coefficient (MFCC), and formant frequency features to represent the audio characteristics of the emotional speech. A face detection scheme, based on the HSV color model, is used to detect the face from the background. The facial expressions are represented by Gabor wavelet features. We perform feature selection by using a stepwise method based on Mahalanobis distance. A classification scheme involving the analysis of individual classes and combinations of different classes is proposed. Our emotion recognition system is tested over a language- and race-independent database, and an overall recognition accuracy of 82.14% is achieved.
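To make the face detection step more concrete, the following is a minimal sketch of skin-color thresholding in HSV space, one common way an HSV-based scheme can separate a face from the background. The abstract does not give the actual thresholds or procedure, so the bounds `HUE_MAX`, `SAT_MIN`, `SAT_MAX`, and `VAL_MIN` and the helper names `is_skin` and `face_bounding_box` are illustrative assumptions, not the authors' method.

```python
import colorsys

# Hypothetical skin-tone bounds in HSV (all values in [0, 1]).
# These are illustrative assumptions, not values from the paper.
HUE_MAX = 50 / 360
SAT_MIN, SAT_MAX = 0.15, 0.75
VAL_MIN = 0.35


def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed skin range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= HUE_MAX and SAT_MIN <= s <= SAT_MAX and v >= VAL_MIN


def face_bounding_box(image):
    """Return (top, left, bottom, right) of all skin pixels, or None.

    `image` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    coords = [
        (y, x)
        for y, row in enumerate(image)
        for x, (r, g, b) in enumerate(row)
        if is_skin(r, g, b)
    ]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)
```

Given a small image of blue background pixels with a patch of skin-toned pixels, `face_bounding_box` returns the rectangle enclosing that patch. A practical detector would likely add noise filtering and shape checks on top of this mask.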