Environmental Sound Classification Based on CAR-Transformer Neural Network Model

Huaicheng Li, Aibin Chen, Jizheng Yi, Wenjie Chen, Daowu Yang, Guoxiong Zhou, Weixiong Peng
DOI: https://doi.org/10.1007/s00034-023-02339-w
2023-04-28
Abstract: Environmental Sound Classification (ESC) has been a challenging task in the audio field because of the wide variety of ambient sounds involved. In this paper, we propose a method for ESC tasks based on the CAR-Transformer neural network model. The method comprises three stages: sound sample pre-processing, deep learning-based feature extraction, and classification. We convert the one-dimensional audio signal into two-dimensional Mel Frequency Cepstral Coefficients (MFCC) and use them as the feature map of the audio. The CAR-Transformer model performs feature extraction, and after dimensionality reduction of the extracted feature map, a fully connected layer classifies it to produce the final result. The method achieves a classification accuracy of 96.91% on the UrbanSound8K dataset with only 0.16 M model parameters. We compare these results with other state-of-the-art research.
engineering, electrical & electronic
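The three stages named in the abstract (pre-processing into MFCC, feature extraction, fully connected classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the internals of the CAR-Transformer, so the feature extractor is replaced here by plain average pooling over time, and every parameter (sample rate, `n_mfcc`, `n_mels`, `n_fft`, `hop`, the random weights, the synthetic input) is an assumption chosen only for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc(signal, sr, n_mfcc=40, n_mels=64, n_fft=1024, hop=512):
    """Turn a 1-D signal into a 2-D MFCC map of shape (n_mfcc, n_frames)."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)            # windowed frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # power spectrum per frame
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)                # small offset avoids log(0)
    # Unnormalized DCT-II over the mel axis gives the cepstral coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return (log_mel @ dct_basis.T).T

def classify(features, weights, bias):
    """Stand-in for the model: average over time, then one fully connected layer."""
    pooled = features.mean(axis=1)                      # crude dimensionality reduction
    logits = weights @ pooled + bias
    exp = np.exp(logits - logits.max())                 # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
sr = 22050                                              # assumed sample rate
audio = rng.standard_normal(sr * 4)                     # synthetic noise, not real data
feats = mfcc(audio, sr)
n_classes = 10                                          # UrbanSound8K has 10 classes
probs = classify(feats, rng.standard_normal((n_classes, 40)) * 0.01, np.zeros(n_classes))
print(feats.shape, int(probs.argmax()))
```

With untrained random weights the predicted class is meaningless; the sketch only shows how the data changes shape from a waveform to a feature map to a class distribution.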