Multi-Scale Convolutional Recurrent Neural Network with Ensemble Method for Weakly Labeled Sound Event Detection

Yingmei Guo, Mingxing Xu, Zhiyong Wu, Jianming Wu, Bin Su
DOI: https://doi.org/10.1109/aciiw.2019.8925176
2019-01-01
Abstract: In this paper, we describe our contributions to the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. We propose the multi-scale convolutional recurrent neural network (Multi-scale CRNN), a novel weakly supervised learning framework for sound event detection. By integrating information from different time resolutions, the multi-scale method captures both fine-grained and coarse-grained features of sound events and models both fine-grained and long-term temporal dependencies. Furthermore, the ensemble method proposed in the paper uses clip-level classification results to reduce frame-level prediction errors. The proposed method achieves an event-based F1-score of 29.2% and an event-based error rate of 1.40 on the development set of DCASE 2018 Task 4, compared with the baseline's F1-score of 14.1% and error rate of 1.54 [1].
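The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network or the fusion rule, so the function names (`pool_time`, `multi_scale_fuse`, `gate_with_clip`), the averaging fusion, the scales, and the 0.5 thresholds are all assumptions made for illustration. The sketch averages frame-level probabilities over several time resolutions, then silences any class that a clip-level classifier rejects.

```python
from typing import List, Sequence

Frames = List[List[float]]  # shape: [time][class], values are probabilities


def pool_time(frames: Frames, scale: int) -> Frames:
    """Average-pool along time in windows of `scale` frames, then repeat each
    pooled frame so the output has the same length as the input."""
    out: Frames = []
    for start in range(0, len(frames), scale):
        chunk = frames[start:start + scale]
        mean = [sum(col) / len(chunk) for col in zip(*chunk)]
        out.extend([list(mean) for _ in chunk])
    return out


def multi_scale_fuse(frames: Frames, scales: Sequence[int] = (1, 2)) -> Frames:
    """Assumed fusion rule: plain average of the views at each time scale."""
    views = [pool_time(frames, s) for s in scales]
    n_classes = len(frames[0])
    return [
        [sum(v[t][c] for v in views) / len(views) for c in range(n_classes)]
        for t in range(len(frames))
    ]


def gate_with_clip(frame_probs: Frames, clip_probs: Sequence[float],
                   clip_threshold: float = 0.5,
                   frame_threshold: float = 0.5) -> List[List[int]]:
    """Binary frame decisions; a class rejected at clip level is set to 0
    in every frame (hypothetical ensemble rule)."""
    return [
        [int(p >= frame_threshold and clip_probs[c] >= clip_threshold)
         for c, p in enumerate(row)]
        for row in frame_probs
    ]


# Toy input: 4 frames, 2 event classes (made-up numbers, not from the paper).
frame_probs = [[0.9, 0.6], [0.2, 0.7], [0.8, 0.1], [0.7, 0.6]]
clip_probs = [0.8, 0.3]  # the clip-level classifier rejects class 1

fused = multi_scale_fuse(frame_probs)
decisions = gate_with_clip(fused, clip_probs)
print(decisions)
```

In this toy case the second class is active in several frames but is removed everywhere because the clip-level score falls below the threshold, which is one way classification results could correct frame-level errors.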