Perceptual Similarity Between Audio Clips and Feature Selection for Its Measurement

Qinghua Wu, Xiaolei Zhang, Ping Lv, Ji Wu
DOI: https://doi.org/10.1109/iscslp.2012.6423476
2012-01-01
Abstract: In this paper, we explore the retrieval of perceptually similar audio, which focuses on finding sounds according to human perception. Such retrieval is therefore more "human-centered" [1] than previous audio retrieval approaches, which aim to find homologous sounds. We make comprehensive use of various acoustic features to measure perceptual similarity. Since some acoustic features may be redundant or even detrimental to the similarity measurement, we propose to find a complementary and effective combination of acoustic features using the SFFS (Sequential Floating Forward Selection) method. Experimental results show that LSP, MFCC, and PLP are the three most effective acoustic features. Moreover, the optimal combination of features improves the accuracy of similarity classification by about 2% compared with the best single acoustic feature.
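
The SFFS procedure named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name sffs, the score callable (which in the paper's setting would presumably be similarity-classification accuracy for a feature subset), the feature names a, b, c, and every value in toy_scores are hypothetical and chosen only to show the floating step. The sketch uses one common variant of SFFS: after each forward addition it tries conditional removals, and a removal is kept only if it strictly beats the best subset seen so far at that smaller size.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Subset = FrozenSet[str]


def sffs(features: List[str],
         score: Callable[[Subset], float],
         k: int) -> Dict[int, Tuple[Subset, float]]:
    """Return the best subset found for each size from 1 to k."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    best: Dict[int, Tuple[Subset, float]] = {}
    current: Subset = frozenset()
    while len(current) < k:
        # Forward step: add the single feature that maximizes the score.
        candidates = [f for f in features if f not in current]
        add = max(candidates, key=lambda f: score(current | {f}))
        current = current | {add}
        cur_score = score(current)
        size = len(current)
        if size in best and best[size][1] >= cur_score:
            # An earlier subset of this size was at least as good; resume from it.
            current, cur_score = best[size]
        else:
            best[size] = (current, cur_score)
        # Floating (backward) steps: drop features while that strictly
        # improves on the best known subset of the smaller size.
        while len(current) > 2:
            members = [f for f in features if f in current]
            drop = max(members, key=lambda f: score(current - {f}))
            reduced = current - {drop}
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][1]:
                current = reduced
                best[len(reduced)] = (reduced, reduced_score)
            else:
                break
    return best


# Made-up scores for three hypothetical features (not results from the paper).
toy_scores = {
    frozenset("a"): 0.60, frozenset("b"): 0.50, frozenset("c"): 0.40,
    frozenset("ab"): 0.65, frozenset("ac"): 0.62, frozenset("bc"): 0.80,
    frozenset("abc"): 0.85,
}


def toy_score(subset: Subset) -> float:
    return toy_scores[subset]


result = sffs(["a", "b", "c"], toy_score, k=3)
```

In this toy setup, plain greedy forward selection would settle on {a, b} as its two-feature subset, whereas the floating step revisits that choice after the third feature is added and records {b, c} as the better pair.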