Audio hash function based on non-negative matrix factorisation of mel-frequency cepstral coefficients.

Ning Chen, He-D. Xiao, Wanggen Wan
DOI: https://doi.org/10.1049/iet-ifs.2010.0097
2011-01-01
IET Information Security
Abstract: A robust audio hash function defines a feature vector that characterises the audio signal independently of content-preserving manipulations such as MP3 compression, amplitude boosting/cutting and low-pass filtering. In this study, the authors propose a new audio hash function based on the non-negative matrix factorisation (NMF) of mel-frequency cepstral coefficients (MFCCs). Their work is motivated by the fact that the orthogonality constraints in the singular value decomposition (SVD) cause audio signals with distinct local differences to share the same low-rank singular vectors. Thus, the available hash function based on the SVD of MFCCs cannot achieve satisfactory discrimination. In contrast, the non-negative constraints of NMF result in a basis that captures the local features of the audio, thereby significantly reducing misclassification. Experimental results over large audio databases demonstrate that the proposed scheme achieves better performance, in terms of perceptual robustness and discrimination, than the available SVD-MFCCs-based hash function. © 2011 The Institution of Engineering and Technology.
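The general idea of hashing audio through an NMF of its MFCC matrix can be illustrated with a minimal sketch. The abstract does not give the construction details, so everything specific here is an assumption made for illustration: the MFCC matrix is taken as precomputed (coefficients by frames), it is shifted by its minimum to satisfy the non-negativity requirement, the factorisation uses the standard Lee-Seung multiplicative updates, and the bits come from thresholding the activation matrix at its median. The names nmf, nmf_hash and hamming_distance are hypothetical and are not the authors' implementation.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0):
    """Factorise non-negative V (n x m) as W (n x rank) @ H (rank x m).

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive initialisation keeps every update well defined.
    W = rng.random((n, rank)) + 1e-3
    H = rng.random((rank, m)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def nmf_hash(mfcc, rank=2, seed=0):
    """Binary hash from an MFCC matrix (coefficients x frames).

    Assumption: MFCCs may be negative, so the matrix is shifted by its
    minimum before factorisation. Assumption: bits are obtained by
    comparing each activation in H with the median activation.
    """
    V = mfcc - mfcc.min()
    _, H = nmf(V, rank, seed=seed)
    feats = H.ravel()
    return (feats > np.median(feats)).astype(np.uint8)


def hamming_distance(a, b):
    """Fraction of bit positions in which two equal-length hashes differ."""
    return np.count_nonzero(a != b) / a.size


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mfcc = rng.normal(size=(13, 40))  # stand-in for real MFCC features
    h = nmf_hash(mfcc)
    print(h.size, hamming_distance(h, nmf_hash(mfcc)))
```

In a sketch like this, two clips would be compared by the normalised Hamming distance between their hashes, with a small distance suggesting perceptually similar content. The rank, iteration count and thresholding rule are free parameters of the illustration, not values taken from the paper.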