Rage Music Classification and Analysis using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting

Akul Kumar
2024-08-20
Abstract: We classify rage music (a subgenre of rap well-known for disagreements on whether a particular song is part of the genre) with an extensive feature set through algorithms including Random Forest, Support Vector Machine, K-nearest Neighbour, Gradient Boosting, and Convolutional Neural Networks. We compare methods of classification in the application of audio analysis with machine learning and identify optimal models. We then analyze the significant audio features present in and most effective in categorizing rage music, while also identifying key audio features as well as broader separating sonic variations and trends.
Sound, Audio and Speech Processing
What problem does this paper attempt to address?
The problem that this paper attempts to solve is **how to classify "rage music" through machine-learning algorithms**. Specifically, the author focuses on:

1. **Music classification problem**: Classifying a specific song as "rage music" or non-"rage music". This task is challenging because there are disputes about whether some songs belong to "rage music" or not.
2. **Feature extraction and analysis**: Identifying and analyzing the most effective audio features in distinguishing "rage music" from other music types. These features include but are not limited to:
   - Tempo
   - Beat strength
   - Onset rate
   - RMS energy
   - Spectral centroid
   - Spectral rolloff
   - Spectral flatness
   - Zero-crossing rate
   - Mel-Frequency Cepstral Coefficients (MFCCs)
   - Chroma mean
   - Chroma standard deviation
   - Pitch features
3. **Model comparison and selection**: Comparing the performance of multiple machine-learning algorithms in audio classification tasks, including K-Nearest Neighbour (KNN), Random Forest, Support Vector Machine (SVM), Convolutional Neural Networks (CNNs) and Gradient Boosting. The aim is to find the optimal classification model.
4. **Key feature identification**: Determining which audio features are the most important for classifying "rage music" and analyzing the specific manifestations of these features in music. For example, several important features mentioned in the paper include:
   - Song length
   - Harmonic ratio
   - Percussive ratio
   - Chroma mean
   - Mel-Frequency Cepstral Coefficient 3 (MFCC3)
5. **Classification threshold and model calibration**: Exploring the threshold effects of different features in the classification process and the probability output calibration of the model. For example, the paper points out that some features such as beat strength are more indicative of "rage music" than tempo, and some features show obvious threshold effects.
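To make a few of the listed features concrete, here is a minimal sketch of how RMS energy, zero-crossing rate, spectral centroid and spectral flatness could be computed for a single audio frame. It is not the paper's extraction code, which is not shown in this summary. The function names, the 8 kHz sample rate, the 256-sample frame and the two synthetic test signals (a 1 kHz tone and white noise) are all assumptions chosen for illustration, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math
import random


def rms_energy(frame):
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mags)) / total


def spectral_flatness(mags, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0 = tonal)."""
    power = [m * m + eps for m in mags]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def extract_features(frame, sample_rate):
    """Collect the four illustrative features into one dictionary."""
    mags = magnitude_spectrum(frame)
    return {
        "rms_energy": rms_energy(frame),
        "zero_crossing_rate": zero_crossing_rate(frame),
        "spectral_centroid_hz": spectral_centroid(mags, sample_rate, len(frame)),
        "spectral_flatness": spectral_flatness(mags),
    }


# Synthetic inputs (assumed values, not data from the paper).
SAMPLE_RATE = 8000
FRAME_LEN = 256
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE + 0.3) for t in range(FRAME_LEN)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

tone_features = extract_features(tone, SAMPLE_RATE)
noise_features = extract_features(noise, SAMPLE_RATE)
print(tone_features)
print(noise_features)
```

The tone and the noise frame should differ most clearly in spectral flatness and zero-crossing rate, which is the kind of separation a classifier can exploit. In practice, an audio library would likely be used for these and for the remaining features (tempo, MFCCs, chroma), with per-frame values aggregated over the whole song.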
By solving these problems, the paper aims to provide a systematic method for automatically identifying and classifying "rage music", thereby providing valuable insights for musicology research and machine-learning applications.
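The model comparison and selection step described above can be illustrated with a small sketch that scores candidate classifiers on held-out data and keeps the best one. Only K-Nearest Neighbour is implemented here, with different values of `k` standing in for the competing models. The paper's other models (Random Forest, SVM, CNN, Gradient Boosting) would be slotted into the same loop, most likely through a library. The two-feature synthetic dataset, the cluster centres and the leave-one-out scheme are assumptions for illustration and do not reproduce the paper's data or evaluation protocol.

```python
import math
import random


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes_for_rage = sum(label for _, label in nearest)
    return 1 if votes_for_rage * 2 > k else 0


def leave_one_out_accuracy(data, k):
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)


def make_synthetic_data(n_per_class=40, seed=0):
    """Two Gaussian clusters; label 1 stands in for 'rage', 0 for 'not rage'.

    The two coordinates are hypothetical stand-ins for features such as
    beat strength and percussive ratio, not values from the paper.
    """
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.3), (1, 0.7)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 0.05), rng.gauss(centre, 0.05))
            data.append((point, label))
    return data


data = make_synthetic_data()
# Odd values of k avoid tied votes in a two-class problem.
scores = {k: leave_one_out_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

Because the synthetic clusters are well separated, every candidate scores highly here. On real songs, where genre membership is disputed, the gap between models would be the quantity of interest.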