Cross-Dataset and Cross-Cultural Music Mood Prediction: A Case on Western and Chinese Pop Songs

Xiao Hu, Yi-Hsuan Yang
DOI: https://doi.org/10.1109/taffc.2016.2523503
IF: 13.99
2017-04-01
IEEE Transactions on Affective Computing
Abstract: In music mood prediction, regression models are built to predict values on several mood-representing dimensions such as valence (level of pleasure) and arousal (level of energy). Many studies have shown that music mood is generally predictable based on music acoustic features, but these experiments were mostly conducted on datasets with homogeneous music. Little research has been done to explore the generalizability of mood regression models across datasets, especially those with music from different cultures. In the increasingly global market of music listening, generalizable models are highly desirable for automated processing, searching, and managing of music collections with heterogeneous characteristics. In this study, we evaluated mood regression models built on fifteen acoustic features in five mood-related musical aspects, with a focus on cross-dataset generalizability. Specifically, three distinct datasets were involved in a series of five experiments to examine the effects of dataset size, reliability of annotations, and cultural backgrounds of music and annotators on mood regression performance and model generalizability. The results reveal that the size of the training dataset and the annotation reliability of the testing dataset affect mood regression performance. When both factors are controlled, regression models are generalizable between datasets sharing a common cultural background of music or annotators.
computer science, cybernetics, artificial intelligence