Deep Neural Network Derived Bottleneck Features For Accurate Audio Classification

Bihong Zhang, Lei Xie, Yougen Yuan, Huaiping Ming, Dongyan Huang, Mingli Song
DOI: https://doi.org/10.1109/ICMEW.2016.7574769
2016-01-01
Abstract: In this paper, we propose using a deep neural network (DNN) as an effective tool for audio feature extraction. The DNN-derived features can then be used in a subsequent classifier (an SVM in this study) for audio classification. Specifically, we learn bottleneck features from a multi-layer perceptron (MLP) in which Mel filter bank features are used as the network input and one of the hidden layers has far fewer units than the other hidden layers. This narrow hidden layer serves as a bottleneck layer: it creates a constriction in the network that forces the information pertinent to classification into a compact feature representation. We study both unsupervised and supervised bottleneck feature extraction methods and demonstrate that the supervised bottleneck features outperform conventional hand-crafted features and achieve state-of-the-art performance in audio classification.
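To make the bottleneck idea concrete, the following is a minimal sketch of how bottleneck features might be read out of an MLP. It is not the authors' implementation. The layer sizes, the sigmoid activation, and the helper names (`init_mlp`, `bottleneck_features`) are assumptions chosen for illustration, and the weights are random rather than trained. In practice the network would first be trained (with or without labels, as the abstract describes), and the resulting features would then be passed to a separate classifier such as an SVM.

```python
import math
import random

# Hypothetical layer sizes (not taken from the paper): 40 Mel filter bank
# inputs, wide hidden layers of 64 units, a narrow 8-unit bottleneck,
# and 5 output classes.
LAYER_SIZES = [40, 64, 8, 64, 5]
BOTTLENECK_INDEX = 2  # position of the bottleneck layer in LAYER_SIZES


def init_mlp(layer_sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bottleneck_features(params, frame, bottleneck_index=BOTTLENECK_INDEX):
    """Run a forward pass and stop at the bottleneck layer's activations."""
    activations = list(frame)
    # Only the layers up to and including the bottleneck are applied;
    # the layers after it are needed for training, not for extraction.
    for weights, biases in params[:bottleneck_index]:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


params = init_mlp(LAYER_SIZES)
rng = random.Random(1)
# Stand-in for one frame of Mel filter bank features (random values here).
frame = [rng.gauss(0.0, 1.0) for _ in range(LAYER_SIZES[0])]
features = bottleneck_features(params, frame)
print(len(features))  # 8: one value per bottleneck unit
```

The point of the sketch is the shape of the computation: a 40-dimensional input frame is compressed to an 8-dimensional vector at the narrow layer, and that vector, not the network's final output, is what a downstream classifier would consume.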