Acoustic Scene Classification Using Deep Convolutional Neural Network and Multiple Spectrograms Fusion.

Weiping Zheng, Yi Jiantao, Xiaotao Xing, Xiangtao Liu, Shao-Hu Peng
2017-01-01
Abstract: Making sense of the environment through sound is an important research topic in the machine learning community. In this work, a Deep Convolutional Neural Network (DCNN) model is presented to classify acoustic scenes, along with a multiple-spectrogram fusion method. First, the generation of the standard spectrogram and the constant-Q transform (CQT) spectrogram is introduced. Corresponding features are then extracted by feeding these spectrograms into the proposed DCNN model. To fuse the features from the multiple spectrograms, two fusion mechanisms are designed: voting and an SVM-based method. By fusing the DCNN features of the standard and CQT spectrograms, accuracy is significantly improved in our experiments compared with the single-spectrogram schemes. This demonstrates the effectiveness of the proposed multi-spectrogram fusion method.
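The voting idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so the code below assumes a simple majority vote: each audio segment, as scored by each of the two models (standard-spectrogram and CQT-spectrogram), casts one vote for its most probable class. The function name `fuse_by_voting` and the input layout (one row of class probabilities per segment) are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def fuse_by_voting(probs_std, probs_cqt):
    """Majority-vote fusion of two models' per-segment class probabilities.

    probs_std, probs_cqt: arrays of shape (n_segments, n_classes), assumed to
    come from a model fed standard spectrograms and one fed CQT spectrograms.
    Returns the index of the class with the most votes (ties go to the
    lowest class index, which is an arbitrary choice made for this sketch).
    """
    # Pool the segment-level predictions of both models.
    stacked = np.concatenate([probs_std, probs_cqt], axis=0)
    # Each row votes for its highest-probability class.
    votes = np.argmax(stacked, axis=1)
    # Count votes per class and return the winner.
    counts = np.bincount(votes, minlength=stacked.shape[1])
    return int(np.argmax(counts))

# Toy example with 2 segments per model and 3 scene classes (made-up numbers).
probs_std = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3]])
probs_cqt = np.array([[0.1, 0.7, 0.2],
                      [0.3, 0.4, 0.3]])
predicted_class = fuse_by_voting(probs_std, probs_cqt)
```

In this toy input, three of the four segment-level predictions favor class 1, so the fused decision is class 1 even though the first standard-spectrogram segment alone would have chosen class 0.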