Automatic Music Mood Classification by Learning Cross-Media Relevance Between Audio and Lyrics

Yu Xiong, Feng Su, Qianqian Wang
DOI: https://doi.org/10.1109/icme.2017.8019341
2017-01-01
Abstract: Automatic analysis of the mood of a piece of music is of great value in music search, understanding, recommendation, and other music-related applications. Unlike most previous methods, which adopt a discriminative mood classification scheme, this paper proposes a generative multimodal method that classifies the mood of a piece of music by learning the relevance (i.e., the joint distribution) between its audio and lyrics modalities. We present algorithms for computing the word-to-audio and word-to-word relations in the music, as well as the prior probability of lyrics words; together these form a multimodal joint distribution that captures the intrinsic characteristics of a specific music mood. A piece of music is then assigned to the mood category that maximizes this joint probability over its modalities. Experimental results demonstrate the effectiveness of the proposed method.
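The decision rule in the abstract (assign the mood whose joint probability is highest) can be illustrated with a minimal sketch. The abstract does not give the actual factorization, so everything here is an assumption made for illustration: the additive log-score over a word prior, a word-to-audio term, and a word-to-word term; the names `classify_mood` and `TOY_MODELS`; the smoothing floor; the use of a single discrete audio label per song; and all probability values, which are invented toy numbers rather than results from the paper.

```python
import math

# Hypothetical per-mood tables of log-probabilities (toy values, not from the paper).
TOY_MODELS = {
    "happy": {
        "word_prior": {"sun": math.log(0.5), "tears": math.log(0.1)},
        "word_audio": {("sun", "bright"): math.log(0.6),
                       ("tears", "dark"): math.log(0.1)},
        "word_word": {("sun", "smile"): math.log(0.4)},
    },
    "sad": {
        "word_prior": {"sun": math.log(0.1), "tears": math.log(0.5)},
        "word_audio": {("sun", "bright"): math.log(0.1),
                       ("tears", "dark"): math.log(0.6)},
        "word_word": {("tears", "alone"): math.log(0.4)},
    },
}

# Assumed smoothing floor for entries missing from a table.
LOG_FLOOR = math.log(1e-6)


def classify_mood(lyric_words, audio_label, mood_models):
    """Return (best_mood, scores) under an assumed additive log-score.

    For each mood, the score sums a word prior term, a word-to-audio term,
    and a word-to-word term over consecutive lyric words. This factorization
    is an illustrative assumption, not the paper's model.
    """
    scores = {}
    for mood, model in mood_models.items():
        score = 0.0
        for i, word in enumerate(lyric_words):
            score += model["word_prior"].get(word, LOG_FLOOR)
            score += model["word_audio"].get((word, audio_label), LOG_FLOOR)
            if i > 0:
                pair = (lyric_words[i - 1], word)
                score += model["word_word"].get(pair, LOG_FLOOR)
        scores[mood] = score
    # Generative decision rule: pick the mood with the highest joint score.
    best_mood = max(scores, key=scores.get)
    return best_mood, scores
```

Calling `classify_mood(["sun", "smile"], "bright", TOY_MODELS)` returns the highest-scoring mood together with the per-mood scores. Working in log space keeps the product of many small probabilities numerically stable, which is a common choice for generative classifiers of this kind.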