Video-Guided Sound Source Separation

Junfeng Zhou, Feng Wang, Di Guo, Huaping Liu, Fuchun Sun
DOI: https://doi.org/10.1007/978-3-030-27526-6_36
2019-01-01
Abstract: A major aim of sound source separation is to extract the sound of interest from a mixture, such as the sound of objects on screen. In this paper, we put forward a method that incorporates sound-indicated object detection and uses the detection result to separate on-screen sounds from off-screen ones. After training, the object detection network can recognize which object is sounding, much as a human learns which object makes which sound. Then, using the temporal information of sounds in a video segment, we separate out the sound of the object that is not shown in the video. Finally, experiments are carried out on data from AudioSet, and we demonstrate that the method works well in the given scenarios.