Abstract: The volume of digital music available online has grown substantially with the rapid development of digital multimedia technology. Managing these massive music resources is a difficult problem for large music platforms, and music genre classification plays an important role in solving it: a good genre classifier is indispensable for research on and applications of music resources, such as efficient organization, retrieval, and recommendation. Because of the strong feature extraction capability of convolutional networks, a growing number of researchers are working on music genre classification models based on convolutional neural networks (CNNs). However, many models do not take the characteristics of music signals into account when designing the convolutional structure, which results in a relatively simple convolutional component with weak local feature extraction ability. To address this problem, our group proposes a model that uses a 1D res-gated CNN, rather than a traditional CNN architecture, to extract local information from audio sequences. To aggregate the global information of audio feature sequences, our group also applies the Transformer to the music genre classification model and modifies the Transformer's decoder structure to suit the task. Experiments are conducted on the benchmark datasets GTZAN and Extended Ballroom. Comparative experiments show that our model outperforms most previous approaches and improves music genre classification performance.
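The abstract does not spell out the res-gated block, so the following is only a minimal sketch of one common reading of "residual gated 1D convolution": a feature convolution is multiplied element-wise by a sigmoid gate computed from a second convolution, and the input is added back through a skip connection. The function names (`conv1d_same`, `res_gated_conv1d`), the single-channel setting, the zero padding, and the exact form `y = x + conv(x) * sigmoid(gate_conv(x))` are assumptions made for illustration and may differ from the authors' model.

```python
import math


def conv1d_same(x, kernel, bias=0.0):
    """Single-channel 1D cross-correlation with zero padding.

    Assumes an odd kernel length so the output has the same
    length as the input sequence.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for 'same' padding")
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k)) + bias
        for i in range(len(x))
    ]


def sigmoid(v):
    """Logistic function used for the gate."""
    return 1.0 / (1.0 + math.exp(-v))


def res_gated_conv1d(x, feat_kernel, gate_kernel):
    """Hypothetical residual gated block: y = x + conv(x) * sigmoid(gate_conv(x)).

    The gate decides, per time step, how much of the local feature
    is added on top of the unchanged input (the residual path).
    """
    feat = conv1d_same(x, feat_kernel)
    gate = [sigmoid(g) for g in conv1d_same(x, gate_kernel)]
    return [xi + f * g for xi, f, g in zip(x, feat, gate)]


# Usage: a toy "audio feature" sequence with made-up kernels.
sequence = [1.0, 2.0, 3.0, 4.0]
output = res_gated_conv1d(sequence, [0.0, 1.0, 0.0], [0.0, 0.0, 0.0])
```

With an all-zero gate kernel the gate is 0.5 everywhere, so this toy call adds half of the feature response to the input. In a full model the kernels would be learned, the block would operate on multi-channel features, and its output would presumably be passed to the Transformer part for global aggregation.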

Music Genre Classification Based on Fusing Audio and Lyric Information

Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective

Convolution channel separation and frequency sub-bands aggregation for music genre classification

Content-Based Information Fusion for Semi-Supervised Music Genre Classification

Automatic Music Emotion Classification Using a New Classification Algorithm

Multimodal Music Mood Classification by Fusion of Audio and Lyrics

Music Genres Classification Using Text Categorization Method

Music Genre Classification: A Comparative Analysis of CNN and XGBoost Approaches with Mel-frequency cepstral coefficients and Mel Spectrograms

A Hybrid Parallel Computing Architecture Based on CNN and Transformer for Music Genre Classification

Genre Classification Empowered by Knowledge-Embedded Music Representation

Statistical and Visual Analysis of Audio, Text, and Image Features for Multi-Modal Music Genre Recognition

Boosting for Multi-Modal Music Emotion Classification

Music Genre Classification Based on Chroma Features and Deep Learning

Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

Music genre classification based on MPEG-7 audio features

Knowledge-Graph Augmented Music Representation for Genre Classification

Automatic Music Mood Classification by Learning Cross-Media Relevance Between Audio and Lyrics

Course genres classification of music e-learning platform based on deep learning big data intelligent processing algorithm

Long Short-Term Memory Recurrent Neural Network Based Segment Features for Music Genre Classification

Research on Music Emotional Expression Based on Reinforcement Learning and Multimodal Information

Music Genre Classification Based on Res-Gated CNN and Attention Mechanism