A novel audio fingerprinting method robust to time scale modification and pitch shifting.

Bilei Zhu, Wei Li, Zhurong Wang, Xiangyang Xue
DOI: https://doi.org/10.1145/1873951.1874130
2010-01-01
Abstract: A novel audio fingerprinting method that is highly robust to Time Scale Modification (TSM) and pitch shifting is proposed. Instead of simply employing spectral or tempo-related features, our system is based on computer-vision techniques. We transform each 1-D audio signal into a 2-D image and treat TSM and pitch shifting of the audio signal as stretch and translation of the corresponding image. Robust local descriptors are extracted from the image and matched against those of the reference audio signals. Experimental results show that our system is highly robust to various audio distortions, including the challenging TSM and pitch shifting.
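The image view of audio described in the abstract can be illustrated with a minimal sketch. The abstract does not say which 2-D representation or which local descriptors the authors use, so everything here is an assumption made for illustration: the image is a log-frequency magnitude spectrogram, the helper names `log_spectrogram` and `tone` are hypothetical, and all parameter values are arbitrary. With a log-spaced frequency axis, scaling every frequency by a constant factor (a pitch shift) should move content along the frequency axis by a constant number of rows, and lengthening the signal should add columns along the time axis.

```python
import numpy as np

def log_spectrogram(signal, sr, n_fft=4096, hop=1024,
                    fmin=100.0, bins_per_octave=48, n_bins=192):
    """Return an (n_bins, n_frames) magnitude image with a log-spaced frequency axis.

    Illustrative only: this representation is an assumption, not the paper's.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))          # (n_frames, n_fft // 2 + 1)
    lin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)     # linear frequency grid in Hz
    log_freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    # Resample each frame from the linear grid onto the log-spaced grid.
    return np.stack([np.interp(log_freqs, lin_freqs, m) for m in mag], axis=1)

def tone(freq, sr, seconds):
    """Pure sine tone, used here as a stand-in for real audio."""
    t = np.arange(int(sr * seconds)) / sr
    return np.sin(2 * np.pi * freq * t)

sr = 16000
original = log_spectrogram(tone(440.0, sr, 1.0), sr)
# Pitch shift by +3 semitones: every frequency is multiplied by 2 ** (3 / 12).
shifted = log_spectrogram(tone(440.0 * 2 ** (3 / 12), sr, 1.0), sr)
# For a stationary tone, slowing playback without changing pitch is just a longer tone.
slower = log_spectrogram(tone(440.0, sr, 1.5), sr)

row_orig = int(original.mean(axis=1).argmax())
row_shift = int(shifted.mean(axis=1).argmax())
print("row offset after pitch shift:", row_shift - row_orig)
print("columns before/after stretch:", original.shape[1], slower.shape[1])
```

At 48 bins per octave, a semitone spans 4 rows, so a 3-semitone shift is expected to appear as a vertical offset of roughly 12 rows, while the longer signal yields a wider image. A matcher built on descriptors that tolerate such translation and stretch could then, in principle, recognize the modified audio; how the paper actually achieves this is not specified in the abstract.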