Abstract: Audio identification via fingerprinting has been an active research field with wide applications for years. Many technical papers have been published, and commercial software systems have been deployed. However, most previously reported methods work on raw (uncompressed) audio, even though compressed audio, especially MP3 music, has become the dominant format for storing music on personal computers and transmitting it over the Internet. It would be useful if an unknown compressed audio fragment could be recognized directly from the database, without the cumbersome and time-consuming decompression-identification-recompression procedure. So far, very few music information retrieval algorithms run directly in the compressed domain, and most of them rely on MDCT coefficients or derived energy-type features. As a first attempt, we propose using compressed-domain spectral entropy as the audio feature in a novel audio fingerprinting algorithm. The compressed songs stored in a music database and the possibly distorted compressed query excerpts are first partially decompressed to obtain the MDCT coefficients as an intermediate result. We then group granules into longer blocks, remap the MDCT coefficients into 192 new frequency lines to unify the frequency distribution of long and short windows, and define 9 new subbands that cover the main frequency bandwidth of popular songs in accordance with the scale-factor bands of short windows. Finally, we calculate the spectral entropy of all consecutive blocks and derive the final fingerprint sequence by means of magnitude relationship modeling. Experiments show that these fingerprints are strongly robust against various audio signal distortions such as recompression, noise interference, echo addition, equalization, band-pass filtering, pitch shifting, and slight time-scale modification. For 5-s query excerpts, which may be severely degraded, an average top-five retrieval precision of more than 90% is obtained on our test data set of 1822 popular songs.
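
The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the subband edges, the block length, or the exact form of the magnitude relationship model. The sketch therefore assumes equal-width subbands over the remapped frequency lines, one Shannon entropy value per block computed from the normalized subband energies, and one fingerprint bit per pair of adjacent blocks (1 if the entropy rises, 0 otherwise). The function names `subband_energies`, `spectral_entropy`, and `fingerprint_bits` are hypothetical, and the input is assumed to be already-remapped MDCT lines, since the partial MP3 decoding step is outside the scope of the example.

```python
import math


def subband_energies(lines, n_subbands=9):
    """Sum squared coefficients over equal-width subbands.

    Assumption: equal-width bands. The paper aligns its 9 subbands with the
    scale-factor bands of short windows, whose edges the abstract omits.
    Lines beyond n_subbands * width are ignored.
    """
    width = len(lines) // n_subbands
    return [
        sum(c * c for c in lines[b * width:(b + 1) * width])
        for b in range(n_subbands)
    ]


def spectral_entropy(energies):
    """Shannon entropy (in bits) of the normalized energy distribution."""
    total = sum(energies)
    if total == 0:
        return 0.0  # silent block: define the entropy as zero
    entropy = 0.0
    for e in energies:
        if e > 0:
            p = e / total
            entropy -= p * math.log2(p)
    return entropy


def fingerprint_bits(blocks, n_subbands=9):
    """One bit per pair of adjacent blocks.

    Assumption: the bit is 1 when the entropy increases from one block to
    the next; this is only one plausible reading of "magnitude relationship
    modeling".
    """
    entropies = [
        spectral_entropy(subband_energies(block, n_subbands))
        for block in blocks
    ]
    return [1 if cur > prev else 0 for prev, cur in zip(entropies, entropies[1:])]


if __name__ == "__main__":
    flat = [1.0] * 192             # energy spread evenly: high entropy
    peaked = [1.0] + [0.0] * 191   # energy in a single line: zero entropy
    print(fingerprint_bits([flat, peaked, flat]))
```

Encoding only the relative order of neighboring entropy values, rather than the values themselves, is what would make such a fingerprint tolerant of distortions that scale or shift the spectrum uniformly, which is consistent with the robustness the abstract reports.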

Robust Audio Fingerprinting Based On Local Spectral Luminance Maxima Scheme

Daubechies Wavelets Based Robust Audio Fingerprinting for Content-Based Audio Retrieval

A novel audio fingerprinting method robust to time scale modification and pitch shifting.

Temporal Coding of Local Spectrogram Features for Robust Sound Recognition

SIFT-based local spectrogram image descriptor: a novel feature for robust music identification

Robust Audio Identification For MP3 Popular Music

Robust and lightweight audio fingerprint for Automatic Content Recognition

Effective Audio Fingerprint Retrieval Based on the Spectral Sub-Band Centroid Feature

A robust audio watermarking algorithm based on DCT and vector quantization

A robust audio fingerprinting algorithm in MP3 compressed domain

A Low-Frequency Construction Watermarking Based on Histogram

Robust Audio Watermark Detection Based On Dempster-Shafer Theory Of Evidence

A Robust Compressed-Domain Music Fingerprinting Technique Based on MDCT Spectral Entropy

A Robust Feature Extraction Algorithm for Audio Fingerprinting

Robust Music Identification Based On Low-Order Zernike Moment In The Compressed Domain

Generalized Time-Series Active Search with Kullback–Leibler Distance for Audio Fingerprinting

Realization of audio fingerprint based on power spectrum feature

A Multipurpose Audio Watermarking Algorithm Based on Vector Quantization in DCT Domain

Low-order Auditory Zernike Moment: a Novel Approach for Robust Music Identification in the Compressed Domain

DCT-Domain Global Feature And DWT-Domain Least-Squares Line Fitting Based Local Feature For Robust Image Hashing

Robust online music identification using spectral entropy in the compressed domain