Speech Emotion Recognition Based on Acoustic Segment Model

Siyuan Zheng, Jun Du, Hengshun Zhou, Xue Bai, Chin-Hui Lee, Shipeng Li
DOI: https://doi.org/10.1109/iscslp49672.2021.9362119
2021-01-01
Abstract: Accurate detection of emotion from speech is a challenging task due to the variability in speech and emotion. In this paper, we propose a speech emotion recognition (SER) method based on the acoustic segment model (ASM) to deal with this issue. Specifically, speech with different emotions is segmented more finely by the ASM. Each acoustic segment is modeled by a hidden Markov model (HMM), and the speech is decoded into ASM sequences in an unsupervised way. Feature vectors are then obtained from these sequences by latent semantic analysis (LSA). Finally, these feature vectors are fed to a classifier. Validated on the IEMOCAP corpus, the proposed method outperforms the state-of-the-art methods, with a weighted accuracy of 73.9% and an unweighted accuracy of 70.8%.
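The LSA step in the abstract can be illustrated with a minimal sketch. It assumes each utterance has already been decoded into a sequence of integer ASM unit indices, uses plain unigram term frequencies, and applies a truncated SVD. The function name `lsa_features`, the parameters `n_units` and `rank`, and the choice of unigram counts are illustrative assumptions; the abstract does not specify the paper's actual counting or weighting scheme.

```python
import numpy as np

def lsa_features(sequences, n_units, rank):
    """Map decoded ASM unit sequences to low-rank LSA vectors (illustrative only)."""
    # Utterance-by-unit count matrix: one row per utterance, one column per ASM unit.
    counts = np.zeros((len(sequences), n_units))
    for row, seq in enumerate(sequences):
        for unit in seq:
            counts[row, unit] += 1
    # Normalize each row to term frequencies (assumption: no further weighting).
    totals = np.maximum(counts.sum(axis=1, keepdims=True), 1)
    tf = counts / totals
    # Truncated SVD: keep the top `rank` singular directions.
    u, s, _ = np.linalg.svd(tf, full_matrices=False)
    return u[:, :rank] * s[:rank]

# Hypothetical decoded sequences over 4 ASM units (not data from the paper).
sequences = [[0, 1, 0, 1], [1, 0, 1, 0], [2, 3, 3, 2], [2, 2, 3, 3]]
features = lsa_features(sequences, n_units=4, rank=2)
```

The resulting rows of `features` would then be passed to whichever classifier is used; utterances with the same unit histogram receive the same vector under this simplified scheme.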