Unsupervised and Semi-Supervised Few-Shot Acoustic Event Classification

Hsin-Ping Huang, Krishna C. Puvvada, Ming Sun, Chao Wang
DOI: https://doi.org/10.1109/icassp39728.2021.9414546
2021-01-01
Abstract: Few-shot Acoustic Event Classification (AEC) aims to learn a model that recognizes novel acoustic events from very limited labeled data. Previous work relies on supervised pre-training and meta-learning approaches, both of which depend heavily on labeled data. Here, we study unsupervised and semi-supervised learning approaches for few-shot AEC. Our work builds upon recent advances in unsupervised representation learning introduced for speech recognition and language modeling. We learn audio representations from a large amount of unlabeled data and use the resulting representations for few-shot AEC. We further extend our model in a semi-supervised fashion. Our unsupervised representation learning approach outperforms supervised pre-training methods, and our semi-supervised learning approach outperforms meta-learning methods for few-shot AEC. We also show that our approach is more robust under domain mismatch.