Application of Machine Learning Algorithms in Speech Emotion Recognition

Junyi Cao
DOI: https://doi.org/10.1109/CONF-SPML54095.2021.00031
2021-01-01
Abstract: Speech emotion recognition has been widely used in recent years and has become a hot research topic. This paper compares and analyzes the performance of several models: a convolutional neural network (CNN) that takes spectrograms as input, and CNN-LSTM models based on feature vectors, the original speech signal, and Log-mel spectrograms. The study finds that the models share some common problems in classification performance. The features and algorithms currently in use can effectively distinguish emotions with different levels of “arousal”, but have difficulty identifying emotions with similar arousal. Among the models, the CNN-LSTM model with Log-mel spectrograms as input achieved the highest accuracy.
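To make the Log-mel spectrogram input mentioned in the abstract concrete, the following is a minimal NumPy sketch of how such a feature can be computed from a raw waveform. It is an illustration only, not the paper's pipeline: the function names and every parameter value (16 kHz sampling rate, 512-point FFT, hop of 160 samples, 40 mel bands, Hann window, the 2595/700 mel formula) are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel formula (an assumed choice for this sketch).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    # Split the signal into overlapping, Hann-windowed frames.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Power spectrum per frame: shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto mel bands, then take the log (small offset avoids log(0)).
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10)

# Usage: one second of a synthetic 440 Hz tone stands in for a speech clip.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
features = log_mel_spectrogram(tone, sr=sr)
print(features.shape)  # (frames, mel bands)
```

The resulting two-dimensional array (time frames by mel bands) is the kind of image-like input a CNN front end can convolve over before an LSTM models the frame sequence.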