Music Emotion Recognition Based on a Neural Network with an Inception-GRU Residual Structure

Xiao Han,Fuyang Chen,Junrong Ban
DOI: https://doi.org/10.3390/electronics12040978
IF: 2.9
2023-02-16
Electronics
Abstract:As a key field in music information retrieval, music emotion recognition is indeed a challenging task. To enhance the accuracy of music emotion classification and recognition, this paper uses the idea of inception structure to use different receptive fields to extract features of different dimensions and perform compression, expansion, and recompression operations to mine more effective features and connect the timing signals in the residual network to the GRU module to extract timing features. A one-dimensional (1D) residual Convolutional Neural Network (CNN) with an improved Inception module and Gate Recurrent Unit (GRU) was presented and tested on the Soundtrack dataset. Fast Fourier Transform (FFT) was used to process the samples experimentally and determine their spectral characteristics. Compared with the shallow learning methods such as support vector machine and random forest and the deep learning method based on Visual Geometry Group (VGG) CNN proposed by Sarkar et al., the proposed deep learning method of the 1D CNN with the Inception-GRU residual structure demonstrated better performance in music emotion recognition and classification tasks, achieving an accuracy of 84%.
engineering, electrical & electronic,computer science, information systems,physics, applied
What problem does this paper attempt to address?