Improved Bird Sound Classification Based on Deep Cascade Feature

Jie Xie, Kai Hu, Mingying Zhu
DOI: https://doi.org/10.1109/icicn56848.2022.10006599
2022-01-01
Abstract: In this paper, we propose classifying bird sounds using a deep cascade feature. We first construct a multi-view spectrogram. The first view is the log-Mel spectrogram; the second and third views are the harmonic and percussive components obtained by harmonic-percussive source separation. We then use VGG16 as the feature extractor, and apply mix-up to further improve classification performance. Experimental results on 43 bird species show that the proposed method improves bird classification performance, with the best model achieving a balanced accuracy of 90.56%.
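The abstract names two techniques that are easy to state precisely: mix-up augmentation and the balanced accuracy metric. The following is a minimal sketch of both using only the Python standard library, not the authors' implementation. The function names, the flat-list representation of features, the one-hot labels, and the default `alpha=0.2` are all assumptions made for illustration; the abstract does not say at which stage mix-up is applied or which Beta parameter is used.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels (hypothetical helper).

    The mixing weight lam is drawn from Beta(alpha, alpha), as in the
    standard mix-up formulation. alpha=0.2 is an assumed default, not a
    value taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes that occur in y_true."""
    total = {}
    correct = {}
    for t, p in zip(y_true, y_pred):
        total[t] = total.get(t, 0) + 1
        if t == p:
            correct[t] = correct.get(t, 0) + 1
    recalls = [correct.get(c, 0) / n for c, n in total.items()]
    return sum(recalls) / len(recalls)
```

Balanced accuracy matters when species are unevenly represented. For example, with true labels `[0, 0, 0, 1]` and a classifier that always predicts class 0, plain accuracy is 0.75, while `balanced_accuracy([0, 0, 0, 1], [0, 0, 0, 0])` averages a recall of 1 for class 0 and 0 for class 1, giving 0.5.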