Fusing Similarity Functions for Cover Song Identification

Ning Chen, Wei Li, Haidong Xiao
DOI: https://doi.org/10.1007/s11042-017-4456-9
IF: 2.577
2017-01-01
Multimedia Tools and Applications
Abstract: Cover Song Identification (CSI) refers to the process of identifying an alternative version, performance, rendition, or recording of a previously recorded musical composition by measuring and modeling the musical similarity between them quantitatively and objectively. However, a single similarity function cannot describe the similarity between tracks comprehensively and reliably. In this paper, the Similarity Network Fusion (SNF) technique, which was originally proposed for combining different kernels for predicting drug-target interactions, is adopted to fuse similarities that are computed from the same descriptor with different similarity functions. First, the Harmonic Pitch Class Profile (HPCP) is extracted from each track. Next, the similarity between the HPCP descriptors of every pair of tracks is calculated with the Qmax measure and, separately, with the Dmax measure. Then, two track-by-track similarity networks, one based on Qmax and one based on Dmax, are constructed and fused into a single network by SNF. Finally, the similarities taken from the fused network are used to train a classifier, which can then decide whether two input tracks form a reference/cover pair or a reference/non-cover pair. Experimental results on Covers80 (http://labrosa.ee.columbia.edu/projects/coversongs/covers80/), a subset of SecondHandSongs (SHS) (http://labrosa.ee.columbia.edu/millionsong/secondhand), and the Mixed Collection and Mazurka Cover Collection provided by MIREX (http://www.music-ir.org/mirex/wiki/2016:Audio_Cover_Song_Identification) demonstrate that the proposed scheme performs comparably with, or even better than, state-of-the-art CSI schemes.
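The fusion step described in the abstract can be illustrated with a minimal sketch of a two-view SNF iteration. This is not the authors' implementation: it assumes the Qmax-based and Dmax-based similarities are already available as symmetric, non-negative track-by-track matrices, and the function names (`full_kernel`, `local_kernel`, `snf_two_views`) and parameters (`k`, `n_iter`) are hypothetical choices made for illustration. HPCP extraction, the Qmax and Dmax computations, and the classifier are not shown.

```python
import numpy as np


def full_kernel(W):
    """Row-normalise the off-diagonal entries to sum to 1/2 and set the diagonal to 1/2."""
    W = np.asarray(W, dtype=float)
    off = W - np.diag(np.diag(W))           # drop self-similarity
    row_sums = off.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0           # guard against empty rows
    P = off / (2.0 * row_sums)
    np.fill_diagonal(P, 0.5)
    return P


def local_kernel(W, k):
    """Keep only each track's k strongest neighbours, then row-normalise."""
    W = np.asarray(W, dtype=float)
    off = W - np.diag(np.diag(W))
    idx = np.argsort(-off, axis=1)[:, :k]   # k largest similarities per row
    rows = np.arange(off.shape[0])[:, None]
    S = np.zeros_like(off)
    S[rows, idx] = off[rows, idx]
    sums = S.sum(axis=1, keepdims=True)
    sums[sums == 0] = 1.0
    return S / sums


def snf_two_views(W_a, W_b, k=3, n_iter=20):
    """Fuse two similarity matrices by cross-diffusion (simplified SNF sketch)."""
    P_a, P_b = full_kernel(W_a), full_kernel(W_b)
    S_a, S_b = local_kernel(W_a, k), local_kernel(W_b, k)
    for _ in range(n_iter):
        # Each view is updated by diffusing the other view's status
        # through its own local neighbourhood structure.
        next_a = S_a @ P_b @ S_a.T
        next_b = S_b @ P_a @ S_b.T
        # Re-normalise each round to keep values in a stable range.
        P_a, P_b = full_kernel(next_a), full_kernel(next_b)
    fused = (P_a + P_b) / 2.0
    return (fused + fused.T) / 2.0          # enforce symmetry


if __name__ == "__main__":
    # Toy stand-ins for the Qmax and Dmax networks (random, not real data).
    rng = np.random.default_rng(0)
    A, B = rng.random((6, 6)), rng.random((6, 6))
    W_qmax, W_dmax = (A + A.T) / 2, (B + B.T) / 2
    print(snf_two_views(W_qmax, W_dmax, k=3, n_iter=10).round(3))
```

In this sketch, each row of the fused matrix could serve as the feature from which a classifier separates reference/cover from reference/non-cover pairs. How the paper actually derives classifier inputs from the fused network is not specified in the abstract.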