Audio Segmentation in AAC Domain for Content Analysis

Rong Zhu, Haojun Ai, Ruimin Hu
DOI: https://doi.org/10.1109/wicom.2009.5301778
2009-01-01
Abstract: We focus on audio scene segmentation in the AAC domain for audio-based multimedia indexing and retrieval applications. In particular, we propose an MFCC extraction method that adapts to window switching in the AAC encoding process and is independent of the audio sampling frequency. We discuss how to fuse MFCC features derived from different window types so as to balance frequency and temporal resolution. A series of experiments based on the probability distribution of the MFCCs was carried out to test the method's effectiveness in audio scene segmentation. The experimental results show that this compressed-domain approach comes close to the performance of a system based on PCM audio while decreasing the CPU load dramatically, which is meaningful for real-time analysis of audio content.
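The abstract gives no formulas, so the following is only a minimal sketch of how MFCC-like features might be computed directly from decoded AAC MDCT coefficients. It is not the authors' implementation. The function names, the 23-filter mel bank, the 8 kHz upper limit, and the averaging of the eight short-window vectors as a fusion rule are all assumptions made for illustration. The only AAC facts relied on are the standard frame layouts: one long window of 1024 MDCT bins, or eight short windows of 128 bins each.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bins, sample_rate, n_filters=23, f_max=8000.0):
    """Triangular mel filters over n_bins MDCT bins covering 0..sample_rate/2."""
    f_max = min(f_max, sample_rate / 2.0)
    # Filter edges are fixed in Hz (assumption), so for sample rates of at
    # least 2 * f_max they do not move when the sampling frequency changes.
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(f_max), n_filters + 2))
    # Centre frequency of MDCT bin k is (k + 0.5) * fs / (2 * n_bins).
    bin_hz = (np.arange(n_bins) + 0.5) * sample_rate / (2.0 * n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc_from_mdct(mdct, sample_rate, n_filters=23, n_ceps=13):
    """MFCC-like vector from the MDCT coefficients of one window."""
    mdct = np.asarray(mdct, dtype=float)
    fb = mel_filterbank(mdct.shape[-1], sample_rate, n_filters)
    # Squared MDCT coefficients stand in for the power spectrum.
    log_energy = np.log(fb @ (mdct ** 2) + 1e-10)
    # DCT-II of the log mel energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    return basis @ log_energy

def frame_mfcc(frame_mdct, sample_rate):
    """frame_mdct: shape (1024,) for a long window, (8, 128) for short windows."""
    frame_mdct = np.asarray(frame_mdct, dtype=float)
    if frame_mdct.ndim == 1:
        return mfcc_from_mdct(frame_mdct, sample_rate)
    # Assumed fusion rule: average the eight per-window vectors so that every
    # AAC frame yields one feature vector regardless of window type.
    return np.mean([mfcc_from_mdct(w, sample_rate) for w in frame_mdct], axis=0)
```

With this layout, a segmentation stage would receive one 13-dimensional vector per AAC frame whether the encoder chose long or short windows. At 128 bins the lowest mel filters may cover very few MDCT bins, which is one reason a real system would need a more careful fusion rule than plain averaging.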