A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models.

Xue Bai, Jun Du, Zi-Rui Wang, Chin-Hui Lee
DOI: https://doi.org/10.21437/interspeech.2019-2171
2019-01-01
Abstract: In acoustic scene classification, the main challenge is distinguishing similar acoustic segments that occur in different scenes. Many deep-learning-based approaches have been proposed to address this problem, but without considering how different acoustic scenes relate to one another. In this paper, we propose a novel acoustic segment model (ASM) for acoustic scene classification. The ASM aims to give a finer segmentation and to cover all acoustic scenes by searching for the underlying phoneme-like acoustic units. The acoustic segments are modeled by hidden Markov models (HMMs), and each audio recording is decoded into an ASM sequence without prior linguistic knowledge. Analogous to the term vector of a text document, these ASM sequences are converted into feature vectors of co-occurrence statistics, and an SVM or DNN is used as the classifier back end. Validated on the DCASE 2018 task, the proposed approach achieves competitive performance with a single model and no data augmentation. Through visualization analysis, we reveal potentially similar units hidden in auditory perception.
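The conversion from a decoded ASM sequence to a co-occurrence feature vector can be illustrated with a minimal sketch. The abstract does not specify the n-gram order or any weighting, so the choice of normalized unigram and bigram counts here is an assumption, as are the function name `cooccurrence_vector` and the unit labels `s1`, `s2`, `s3`; this is not the authors' implementation.

```python
from collections import Counter
from itertools import product


def cooccurrence_vector(sequence, units):
    """Map a decoded unit sequence to normalized unigram and bigram counts.

    Assumption: unigram + bigram relative frequencies stand in for the
    paper's co-occurrence statistics, which the abstract does not define.
    """
    unigrams = Counter(sequence)
    bigrams = Counter(zip(sequence, sequence[1:]))
    n_uni = max(len(sequence), 1)      # avoid division by zero on empty input
    n_bi = max(len(sequence) - 1, 1)
    # Fixed ordering over the unit inventory gives every recording
    # a vector of the same length: |units| + |units|^2.
    vec = [unigrams[u] / n_uni for u in units]
    vec += [bigrams[pair] / n_bi for pair in product(units, repeat=2)]
    return vec


# Hypothetical inventory of three acoustic units and one decoded recording.
units = ["s1", "s2", "s3"]
decoded = ["s1", "s2", "s2", "s3", "s1", "s2"]
features = cooccurrence_vector(decoded, units)
```

Vectors built this way have a fixed dimension regardless of recording length, which is what allows a standard classifier such as an SVM or DNN to consume them.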