Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network

Raul de Araújo Lima, Rômulo César Costa de Sousa, Simone Diniz Junqueira Barbosa, Hélio Cortês Vieira Lopes
DOI: https://doi.org/10.48550/arXiv.2003.05377
2020-03-06
Abstract: Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed across 14 genres. We apply SVM, Random Forest, and a Bidirectional Long Short-Term Memory (BLSTM) network combined with different word embedding techniques to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an average F1-score of 0.48. Some genres, such as "gospel", "funk-carioca", and "sertanejo", which obtained F1-scores of 0.89, 0.70, and 0.69, respectively, can be regarded as the most distinct and easiest to classify in the context of Brazilian musical genres.
Computation and Language, Information Retrieval, Machine Learning
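
The abstract reports both per-genre F1-scores and an average F1-score over the 14 genres. As a minimal sketch of how such figures relate, the Python below computes a per-genre F1 and its unweighted (macro) mean from predicted labels. The abstract does not say which averaging scheme the authors used, so the macro average is an assumption here, and the function names and toy labels are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def per_genre_f1(y_true, y_pred):
    """Return a dict mapping each genre label to its F1-score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this genre
        else:
            fp[pred] += 1    # predicted genre was wrong
            fn[truth] += 1   # true genre was missed
    scores = {}
    for genre in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[genre] + fp[genre] + fn[genre]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        scores[genre] = 2 * tp[genre] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-genre F1 (assumed averaging scheme)."""
    scores = per_genre_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the paper.
    y_true = ["gospel", "gospel", "funk-carioca",
              "sertanejo", "sertanejo", "sertanejo"]
    y_pred = ["gospel", "gospel", "funk-carioca",
              "funk-carioca", "sertanejo", "gospel"]
    for genre, score in per_genre_f1(y_true, y_pred).items():
        print(f"{genre}: {score:.2f}")
    print(f"macro F1: {macro_f1(y_true, y_pred):.2f}")
```

Under a macro average, every genre counts equally regardless of how many lyrics it contains, which is one way a few easy genres can score far above the overall mean.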