Improve Audio Representation by Using Feature Structure Patterns

R Cai, L Lu, HJ Zhang, LH Cai
DOI: https://doi.org/10.1109/icassp.2004.1326834
2004-01-01
Abstract: Although statistical characteristics of audio features are widely used for audio representation in most current audio analysis systems and have proven effective, they capture only the average feature variations over time and can therefore lead to ambiguities in some cases. Structure patterns, which describe the representative structural characteristics of both temporal and spectral features, are proposed to improve audio representation. In this paper, three structure patterns, namely the energy envelope pattern, the sub-band spectral shape pattern, and the harmonicity prominence pattern, are proposed or refined as a continuation of our previous work. Evaluations on a content-based audio retrieval system with more than 1500 clips showed very encouraging results.
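The ambiguity mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's energy envelope pattern: the envelope values, the `summary_stats` helper, and the `slope_signs` descriptor are all hypothetical, chosen only to show that clip-level statistics ignore the temporal order of frames while an order-aware descriptor does not.

```python
from statistics import mean, pstdev

# Hypothetical frame-energy envelopes (illustrative values, not from the paper).
attack_decay = [0.9, 0.7, 0.5, 0.3, 0.1]   # loud onset followed by a decay
fade_in = list(reversed(attack_decay))      # the same frames in reverse order


def summary_stats(envelope):
    """Return (mean, standard deviation), typical clip-level statistics."""
    return mean(envelope), pstdev(envelope)


def slope_signs(envelope):
    """Return the sign of each frame-to-frame change (a crude order-aware descriptor)."""
    return [1 if b > a else -1 if b < a else 0
            for a, b in zip(envelope, envelope[1:])]


# Both envelopes contain the same values, so their statistics coincide.
same_stats = summary_stats(attack_decay) == summary_stats(fade_in)

# Their temporal structure differs: one only falls, the other only rises.
same_shape = slope_signs(attack_decay) == slope_signs(fade_in)

print("identical statistics:", same_stats)
print("identical temporal shape:", same_shape)
```

Under these assumptions the two clips are indistinguishable by mean and standard deviation alone, yet they would sound very different, which is the kind of case a structure-based description is meant to separate.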