Multiscale CNN based Deep Metric Learning for Bioacoustic Classification: Overcoming Training Data Scarcity Using Dynamic Triplet Loss

Padmanabhan Rajan,Daksh Thapar,A. Nigam,Anshul Thakur
DOI: https://doi.org/10.1121/1.5118245
2019-03-26
Abstract:Bioacoustic classification often suffers from the lack of labeled data. This hinders the effective utilization of state-of-the-art deep learning models in bioacoustics. To overcome this problem, the authors propose a deep metric learning-based framework that provides effective classification, even when only a small number of per-class training examples are available. The proposed framework utilizes a multiscale convolutional neural network and the proposed dynamic variant of the triplet loss to learn a transformation space where intra-class separation is minimized and inter-class separation is maximized by a dynamically increasing margin. The process of learning this transformation is known as deep metric learning. The triplet loss analyzes three examples (referred to as a triplet) at a time to perform deep metric learning. The number of possible triplets increases cubically with the dataset size, making triplet loss more suitable than the cross-entropy loss in data-scarce conditions. Experiments on three different publicly available datasets show that the proposed framework performs better than existing bioacoustic classification methods. Experimental results also demonstrate the superiority of dynamic triplet loss over cross-entropy loss in data-scarce conditions. Furthermore, unlike existing bioacoustic classification methods, the proposed framework has been extended to provide open-set classification.
Medicine,Computer Science,Engineering
What problem does this paper attempt to address?