Music emotion recognition using deep convolutional neural networks

Ting Li
DOI: https://doi.org/10.3233/jcm-247551
2024-08-18
Journal of Computational Methods in Sciences and Engineering
Abstract: Traditional music emotion recognition (MER) suffers from a lack of contextual information, inaccurate recognition of music emotions, and difficulty in handling nonlinear relationships. This article first used long short-term memory (LSTM) networks to capture the global information and contextual relationships of music. Subsequently, a deep convolutional neural network (DCNN) was chosen to process sequence data and capture global dependencies to improve the accuracy of MER. Finally, a MER model was constructed based on the DCNN to recognize and classify music emotions. By adjusting training-related hyperparameters, this article examined the impact of different parameter values on model training iterations. The optimal values for the learning rate μ, momentum coefficient α, weight attenuation (weight decay) coefficient γ, and Dropout coefficient were 0.01, 0.7, 0.0003, and 0.5, respectively. The DCNN was trained iteratively on audio spectrograms alongside recurrent neural networks, convolutional recurrent neural networks, and transform domain neural networks, and the results were compared. The experimental findings indicated that the spectral recognition accuracy of the DCNN was stable at 95.68%, far higher than that of the other three networks. The results showed that the DCNN method used in this article could more accurately distinguish different negative and positive emotions.
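To make the roles of the reported hyperparameters concrete, the following is a minimal sketch of one common form of stochastic gradient descent with momentum and L2 weight decay, using the values quoted in the abstract. The abstract does not state which optimizer or update rule the paper uses, so the update form, the function name `sgd_momentum_step`, and the toy objective are assumptions for illustration only.

```python
# Hyperparameter values reported in the abstract.
LEARNING_RATE = 0.01    # mu
MOMENTUM = 0.7          # alpha
WEIGHT_DECAY = 0.0003   # gamma
DROPOUT = 0.5           # quoted for completeness; not used in the update below


def sgd_momentum_step(weights, grads, velocity,
                      lr=LEARNING_RATE, momentum=MOMENTUM,
                      weight_decay=WEIGHT_DECAY):
    """One SGD step with momentum and L2 weight decay (assumed form).

    v <- momentum * v - lr * (g + weight_decay * w)
    w <- w + v
    """
    new_velocity = [momentum * v - lr * (g + weight_decay * w)
                    for w, g, v in zip(weights, grads, velocity)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy problem (hypothetical): minimize f(w) = w^2, whose gradient is 2w.
w, v = [1.0], [0.0]
for _ in range(200):
    w, v = sgd_momentum_step(w, [2.0 * w[0]], v)
```

In this form, the momentum coefficient controls how much of the previous update carries over, while the weight decay term pulls each weight slightly toward zero at every step. The Dropout coefficient applies to the network's layers during training rather than to the optimizer, which is why it does not appear in the update.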