Fully Fused Cover Song Identification Model via Feature Fusing and Clustering.

Qiang Yuan, Shibiao Xu, Li Guo
DOI: https://doi.org/10.1145/3571662.3571672
2022-01-01
Abstract: In recent years, Cover Song Identification (CSI) based on Siamese networks and music representation learning has achieved good performance. However, several problems remain, such as limited feature fusion, a missing decision threshold, and a single type of data label. In this paper, we propose a novel fully fused cover song identification model based on feature fusing and clustering. The proposed model consists of a fusion feature extraction structure, a channel separation decision structure, and a music feature clustering structure. First, we combine the pre-processed features of the two inputs along the channel dimension to achieve full feature fusion and increase the degree of fusion between the two songs during feature extraction. Second, we introduce channel separation to calculate multi-channel cross-features, which improves the model's ability to learn the differences between feature channels; combined with a binary decision network, this avoids the lack of a decision threshold in music representation learning. Finally, feature clustering generates invisible feature labels, which enriches the types of cover data labels and reduces the difficulty of training. The model is trained in stages to optimize the clustering loss and the classification loss for cover and non-cover pairs, respectively. The model is validated on three public datasets, and experiments show that it achieves competitive results.
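The channel-dimension fusion and channel separation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the (channels, pitch bins, frames) layout, and the feature sizes are assumptions chosen only for illustration, and the learned layers between fusion and separation are omitted.

```python
import numpy as np

def fuse_along_channels(feat_a, feat_b):
    """Stack two (channels, bins, frames) feature maps along the channel axis."""
    if feat_a.shape[1:] != feat_b.shape[1:]:
        raise ValueError("both inputs must share the same (bins, frames) shape")
    return np.concatenate([feat_a, feat_b], axis=0)

def separate_channels(fused, n_groups):
    """Split a fused map into equally sized channel groups (hypothetical helper)."""
    return np.split(fused, n_groups, axis=0)

# Hypothetical pre-processed features: 1 channel, 12 pitch bins, 100 frames.
rng = np.random.default_rng(0)
song_a = rng.random((1, 12, 100))
song_b = rng.random((1, 12, 100))

fused = fuse_along_channels(song_a, song_b)   # shape (2, 12, 100)
groups = separate_channels(fused, 2)          # two arrays of shape (1, 12, 100)
```

In a full model, a feature extractor would presumably process `fused` as one multi-channel input, so both songs influence every layer, before any per-channel comparison takes place.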