Robust Polyphonic Sound Event Detection by Using Multi Frame Size Denoising Autoencoder

Jianchao Zhou, Xiaoou Chen, Deshun Yang
DOI: https://doi.org/10.1109/MMSP.2018.8547060
2018-01-01
Abstract: Polyphonic sound event detection has received considerable research attention over the past few years. A major problem is that detection performance degrades sharply in the presence of noise. Because denoising autoencoders reportedly offer superior performance in noisy environments, this paper proposes using a denoising autoencoder, trained on multi-frame-size information from audio signals, to extract robust features for polyphonic sound event detection under noisy conditions. The extracted feature is evaluated in polyphonic sound event detection experiments at different noise levels and compared with baseline features, including mel-band energy (Mel), log mel-band energy (Logmel), and mel-frequency cepstral coefficients (MFCC). The experimental results show that the proposed feature is the most robust of all the features tested and achieves the best detection performance under noisy conditions.