MULTI-SCALE CONVOLUTION BASED ATTENTION NETWORK FOR SEMI-SUPERVISED SOUND EVENT DETECTION Technical Report

Xiujuan Zhu, Xinghao Sun, Ying Hu, Yadong Chen, Wenbo Qiu, Yu Tang, Liang He, Minqiang Xu
2021-01-01
Abstract: Deep convolutional recurrent neural networks (CRNNs) have drawn great attention in sound event detection (SED). Because the duration of acoustic events varies widely, it is critically important to design an operator that extracts multi-scale features efficiently for SED. However, most CRNN-based models lack the ability to discriminate between different types of acoustic events and treat them all equally, which limits the models' representational capacity. Motivated by this, we propose a Multi-Scale Convolution based Attention Network (MSCA). Multi-scale convolution yields a more effective feature representation: it naturally learns coarse-to-fine multi-scale features that help the model recognize different sound events. In addition, a channel-wise attention module is designed that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels.
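The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no layer sizes or kernel widths, so everything below is an assumption made for illustration. Fixed averaging kernels of widths 3, 5 and 7 stand in for learned multi-scale filters, and the channel attention is written as a squeeze-and-excitation style gate (global average pooling, a small bottleneck, sigmoid weights), which is one common way to realize the channel-wise recalibration the abstract describes. The names `multi_scale_conv`, `channel_attention`, `w1` and `w2` are hypothetical.

```python
import numpy as np


def conv1d_same(x, kernel):
    """Convolve each channel of x (channels, frames) along time, keeping length."""
    return np.stack([np.convolve(row, kernel, mode="same") for row in x])


def multi_scale_conv(x, kernel_sizes=(3, 5, 7)):
    """Run parallel branches with different temporal kernel widths.

    Assumption: averaging kernels stand in for learned filters. Short kernels
    respond to brief events, long kernels to sustained ones.
    """
    branches = [conv1d_same(x, np.ones(k) / k) for k in kernel_sizes]
    # Stack branch outputs along the channel axis: (len(kernel_sizes) * C, T).
    return np.concatenate(branches, axis=0)


def channel_attention(feats, w1, w2):
    """Squeeze-and-excitation style channel gating (assumed form).

    feats: (C, T) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    squeezed = feats.mean(axis=1)                  # squeeze: one value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)        # bottleneck with ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid weights in (0, 1)
    return feats * gates[:, None], gates           # rescale each channel


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 100))        # toy input: 4 channels, 100 frames
feats = multi_scale_conv(x)              # shape (12, 100)
num_channels = feats.shape[0]
reduction = 4                            # hypothetical bottleneck ratio
w1 = 0.1 * rng.standard_normal((num_channels // reduction, num_channels))
w2 = 0.1 * rng.standard_normal((num_channels, num_channels // reduction))
out, gates = channel_attention(feats, w1, w2)
print(feats.shape, out.shape)            # (12, 100) (12, 100)
```

In a trained model the kernels and the weights `w1` and `w2` would be learned, so the gates could emphasize the channels, and hence the temporal scales, that matter for a given event class.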