Environmental Sound Classification Based on Continual Learning

Yadong Sun, Xinyi Chang, Guixun Xu, Yanjiang Wang
DOI: https://doi.org/10.1109/NTCI60157.2023.10403679
2023-01-01
Abstract: With the rapid development of industry, new types of environmental sounds keep emerging, so learning and recognizing a growing variety of sounds is increasingly important. However, when a neural network learns new knowledge sequentially, as humans do, it forgets most of what it learned before, a phenomenon known as catastrophic forgetting. Continual learning methods based on generative replay can alleviate this problem, but their performance falls short on large, complex datasets. In response, and inspired by the interaction between the hippocampus and the neocortex, this paper constructs a feature generative replay model for environmental sound classification (FGR-ES). Specifically, during learning, sound features are extracted with a feature extractor, and a W-GAN framework is then trained on the extracted features to simulate old data. When a new task appears, the encoder extracts the new task's features, which are combined with the generated features to fuse new-task and old-task information. Finally, this fused information is used to update the classifier and perform environmental sound classification. Experimental results on ESC10, ESC50, and UrbanSound8K show that the proposed method achieves competitive results.
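The replay step described in the abstract (mixing features of the new task with generated features that stand in for old data) can be illustrated with a minimal sketch. This is not the authors' implementation: the W-GAN generator is replaced here by a hypothetical per-class Gaussian sampler so the example stays small, and the names `GaussianFeatureGenerator` and `build_replay_batch`, the feature dimension, and the batch sizes are all assumptions made for illustration.

```python
import numpy as np


class GaussianFeatureGenerator:
    """Hypothetical stand-in for the W-GAN generator (assumption, not from the paper):
    models each old class as a diagonal Gaussian in feature space."""

    def __init__(self):
        self.stats = {}  # class label -> (mean, std) of that class's features

    def fit(self, feats, labels):
        # Record per-class feature statistics from the old task.
        for c in np.unique(labels):
            x = feats[labels == c]
            self.stats[int(c)] = (x.mean(axis=0), x.std(axis=0) + 1e-6)

    def sample(self, n, rng):
        # Draw n pseudo-features (n >= 1) with labels chosen uniformly over old classes.
        classes = sorted(self.stats)
        labels = rng.choice(classes, size=n)
        feats = np.stack([rng.normal(*self.stats[int(c)]) for c in labels])
        return feats, labels


def build_replay_batch(new_feats, new_labels, generator, n_replay, rng):
    """Fuse real new-task features with generated old-task features into one batch."""
    old_feats, old_labels = generator.sample(n_replay, rng)
    feats = np.concatenate([new_feats, old_feats], axis=0)
    labels = np.concatenate([new_labels, old_labels], axis=0)
    perm = rng.permutation(len(labels))  # shuffle so old and new samples are interleaved
    return feats[perm], labels[perm]


# Toy usage: random vectors play the role of extracted sound features.
rng = np.random.default_rng(0)
old_feats = rng.normal(size=(40, 8))
old_labels = np.repeat([0, 1], 20)          # old task: classes 0 and 1
gen = GaussianFeatureGenerator()
gen.fit(old_feats, old_labels)

new_feats = rng.normal(size=(16, 8))
new_labels = np.full(16, 2)                 # new task: class 2
x, y = build_replay_batch(new_feats, new_labels, gen, n_replay=16, rng=rng)
```

The mixed batch `(x, y)` is what a classifier would be updated on, so that gradients for the new class are balanced by pseudo-samples of the old classes. Replaying features rather than raw audio keeps the generator's target low-dimensional, which is presumably the motivation for the feature-level design.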