Audio Signals Encoding for Cough Classification Using Convolutional Neural Networks: A Comparative Study

Hui-Hui Wang, Jia-Ming Liu, Mingyu You, Guo-Zheng Li
DOI: https://doi.org/10.1109/bibm.2015.7359724
2015-01-01
Abstract: Cough detection has considerable clinical value because it provides an objective basis for assessing and diagnosing respiratory diseases. Motivated by the recent success of convolutional neural networks (CNNs), we encoded audio signals as images in 5 different ways and used the images as CNN input, so that image processing techniques could be applied to audio analysis. To identify the best encoding method, we performed comparative experiments on a medical dataset containing 70000 audio segments from 26 patients. The results show that the RASTA-PLP spectrum is the best way to encode audio signals as images for the cough classification task, giving an average accuracy of 0.9965 over 200 iterations on test batches and an F1-score of 0.9768 on samples re-sampled from the test set. The image processing based approach is therefore a promising choice for processing audio signals.
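To make the idea of "encoding audio signals as images" concrete, here is a minimal sketch that turns a 1-D signal into a 2-D grayscale array. It uses a plain log-magnitude short-time Fourier transform, not the RASTA-PLP spectrum the abstract reports as best, and it is not the authors' implementation. The function name `spectrogram_image`, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram_image(signal, frame_len=256, hop=128):
    """Encode a 1-D audio signal as a 2-D uint8 image (log-magnitude STFT).

    Hypothetical helper for illustration; parameters are not from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Zero-pad very short inputs so at least one frame exists.
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice overlapping frames and apply a Hann window to each one.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    # Magnitude spectrum per frame, followed by log compression.
    log_mag = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))
    # Min-max scale to 0..255 so the array can be treated as a grayscale image.
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / span if span > 0 else np.zeros_like(log_mag)
    # Rows are frequency bins, columns are time frames.
    return (scaled * 255).astype(np.uint8).T

# Synthetic 0.5 s tone at 440 Hz sampled at 8 kHz (placeholder, not paper data).
sr = 8000
t = np.arange(sr // 2) / sr
image = spectrogram_image(np.sin(2 * np.pi * 440 * t))
```

The resulting array has one row per frequency bin and one column per time frame, so it can be passed to any image classifier that accepts single-channel input. Other encodings would replace only the spectrum step while keeping the same image-shaped output.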