Salient environmental sound detection framework for machine awareness.

Jingyu Wang, Ke Zhang, Kurosh Madani, Christophe Sabourin
DOI: https://doi.org/10.1016/j.neucom.2014.09.046
IF: 6
2015-01-01
Neurocomputing
Abstract: Auditory perception is an essential part of environment perception, and saliency detection is both its fundamental basis and an efficient way of achieving it. For artificial machines, an intelligent approach to sound perception is required to provide awareness as the initial step toward artificial consciousness. In this paper, a novel salient environmental sound detection framework for machine awareness is proposed. The framework is based on heterogeneous saliency features from both image and acoustic channels. To improve the efficiency of the proposed framework, (1) a global informative saliency estimation approach based on short-term Shannon entropy is first proposed; (2) a series of auditory saliency detection methods is presented to obtain the spectral and temporal saliency features from the power spectral density and mel-frequency cepstral coefficients, respectively; (3) a computational bio-inspired inhibition of return model is proposed for saliency verification to improve detection accuracy; (4) a heterogeneous saliency feature fusion approach is introduced to form the final auditory saliency map by combining the acoustic and image saliency features. Environmental sounds collected from the real world are used to verify the superiority of the proposed framework. The results show that the proposed framework is more effective at detecting overlapped salient sounds and more robust to background noise than the conventional approach.
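As an illustration of step (1), the following is a minimal sketch of a short-term Shannon entropy computed over sliding frames of a signal. The abstract does not give the paper's exact formulation, so everything here is an assumption: the amplitude-histogram estimate of the probabilities, the function names frame_entropy and short_term_entropy, and the frame length, hop size and bin count. Samples are assumed to be normalized to [-1, 1].

```python
import math
from collections import Counter
from typing import List, Sequence


def frame_entropy(frame: Sequence[float], n_bins: int = 16,
                  lo: float = -1.0, hi: float = 1.0) -> float:
    """Shannon entropy (in bits) of the amplitude histogram of one frame.

    Assumption: probabilities are estimated from a fixed-width histogram
    of sample amplitudes; the paper may define them differently.
    """
    if not frame:
        return 0.0
    width = (hi - lo) / n_bins
    counts: Counter = Counter()
    for x in frame:
        # Clamp the sample to [lo, hi], then map it to a bin in 0..n_bins-1.
        idx = int((min(max(x, lo), hi) - lo) / width)
        counts[min(idx, n_bins - 1)] += 1
    total = len(frame)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def short_term_entropy(signal: Sequence[float], frame_len: int = 256,
                       hop: int = 128, n_bins: int = 16) -> List[float]:
    """Entropy of each full-length sliding frame of the signal."""
    return [frame_entropy(signal[i:i + frame_len], n_bins)
            for i in range(0, len(signal) - frame_len + 1, hop)]


if __name__ == "__main__":
    # Hypothetical test signal: silence followed by an alternating tone.
    demo = [0.0] * 256 + [-0.5, 0.5] * 128
    print(short_term_entropy(demo))
```

In this sketch a silent frame has zero entropy, while frames that mix several amplitude levels score higher, which is the kind of per-frame cue a global saliency estimate could threshold or rank.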