Metric Learning Based Data Augmentation for Environmental Sound Classification.

Rui Lu, Zhiyao Duan, Changshui Zhang
DOI: https://doi.org/10.1109/waspaa.2017.8169983
2017-01-01
Abstract: Deep neural networks have been widely applied to environmental sound classification. However, because carefully labeled data are scarce, their training suffers from over-fitting. Data augmentation alleviates this issue by adding to the training set synthetic data created by modifying some parameters of the real data. However, not all kinds of augmentation are helpful, and some are in fact harmful for the recognition of certain sound concepts. Matching the appropriate augmentations to the appropriate training data is thus an interesting question. In this paper, we propose a framework for data augmentation through metric learning. The idea is to first learn a metric from the original training data, and then use it to filter out augmented samples that are far from the original ones in the same class. Experiments on a widely used dataset show that our framework achieves the same performance as other augmentation strategies while reducing the amount of training data by a large margin.
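The filtering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which metric learning method the authors use, so this example substitutes a simple stand-in: a Mahalanobis metric built from the pooled within-class covariance of the original features. The function names `learn_metric` and `filter_augmented`, the `reg` and `threshold` parameters, and the nearest-neighbor distance rule are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def learn_metric(X, y, reg=1e-3):
    """Stand-in metric (an assumption, not the paper's method):
    inverse of the pooled within-class covariance, regularized."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        Xc = Xc - Xc.mean(axis=0)      # center each class on its own mean
        Sw += Xc.T @ Xc
    Sw /= len(X)
    return np.linalg.inv(Sw + reg * np.eye(d))


def filter_augmented(X, y, X_aug, y_aug, M, threshold):
    """Return a boolean mask that keeps an augmented sample only if its
    distance under M to the nearest original of the same class is at most
    `threshold` (a hypothetical tuning parameter).
    Assumes every class in y_aug also appears in y."""
    keep = np.zeros(len(X_aug), dtype=bool)
    for i, (x, c) in enumerate(zip(X_aug, y_aug)):
        diff = X[y == c] - x
        # squared Mahalanobis distance to each same-class original
        d2 = np.einsum("nd,de,ne->n", diff, M, diff)
        keep[i] = np.sqrt(d2.min()) <= threshold
    return keep


# Toy 2-D features: two well-separated classes of "original" samples.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [10, 10], [11, 10], [10, 11], [11, 11]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Two augmented samples labeled class 0: one close to the originals, one far away.
X_aug = np.array([[0.1, 0.0], [50.0, 50.0]])
y_aug = np.array([0, 0])

M = learn_metric(X, y)
keep = filter_augmented(X, y, X_aug, y_aug, M, threshold=1.0)
X_kept, y_kept = X_aug[keep], y_aug[keep]
```

In this toy setup the distant augmented sample is discarded and the nearby one is kept, so only `X_kept` would be added to the training set. In practice the features would be learned or hand-crafted audio representations, and the threshold would need to be tuned on held-out data.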