Speech Emotion Recognition using Channel Attention Mechanism
Ruifeng Zhu, Caixia Sun, Xiaopeng Wei, Lasheng Zhao
DOI: https://doi.org/10.1109/ICCEA58433.2023.10135192
2023-01-01
Abstract: To improve the accuracy of speech emotion recognition, this paper proposes a speech emotion recognition method based on the channel attention mechanism. First, Mel-frequency cepstral coefficients (MFCCs), speech spectrograms, and spectral envelopes are selected as the initial input features. Next, multiple deep network models extract feature maps in parallel, each from a different perspective. The channel attention mechanism then assigns weights to the feature maps output by each sub-network and fuses them. Finally, the fused feature maps are used to predict the emotion category. Experimental results on the CASIA, Emo-DB, and SAVEE emotion datasets show that the method achieves recognition accuracies of 88.3%, 85.1%, and 64.5%, respectively, outperforming the recent models from the literature used for comparison.
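The weighting-and-fusion step can be illustrated with a minimal sketch. The abstract does not specify the attention design, so this example assumes a squeeze-and-excitation style channel attention applied to the concatenated branch outputs. The function name `channel_attention_fuse`, the reduction ratio `r`, the tensor shapes, and the random weights `w1` and `w2` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def channel_attention_fuse(feature_maps, w1, w2):
    """Fuse branch feature maps with per-channel attention weights.

    Assumed design (not confirmed by the paper): squeeze-and-excitation.
    feature_maps: list of arrays, each shaped (C_i, H, W).
    w1: (C // r, C) bottleneck weights; w2: (C, C // r) expansion weights.
    """
    # Concatenate the branch outputs along the channel axis: (C, H, W).
    stacked = np.concatenate(feature_maps, axis=0)
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = stacked.mean(axis=(1, 2))
    # Excitation: bottleneck with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(0.0, w1 @ squeezed)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Rescale every channel by its attention weight.
    fused = stacked * weights[:, None, None]
    return fused, weights


# Hypothetical setup: three branches (e.g. one per input feature type),
# each producing 4 channels of 8 x 8 feature maps.
rng = np.random.default_rng(0)
branches = [rng.standard_normal((4, 8, 8)) for _ in range(3)]
C, r = 12, 4
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

fused, weights = channel_attention_fuse(branches, w1, w2)
print(fused.shape, weights.shape)
```

In a trained model, `w1` and `w2` would be learned jointly with the sub-networks, and the fused maps would be passed to a classifier that predicts the emotion category.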