Abstract: Although the Prototypical Network (ProtoNet) has demonstrated effectiveness in few-shot biological event detection, two persistent issues remain. Firstly, it is difficult to construct a representative negative prototype because explicitly annotated negative samples are absent. Secondly, the durations of the target biological vocalisations vary across tasks, making it challenging for the model to consistently yield optimal results across all tasks. To address these issues, we propose a novel adaptive learning framework with an adaptive learning loss to guide classifier updates. Additionally, we propose a negative selection strategy to construct a more representative negative prototype for ProtoNet. All experiments were performed on the DCASE 2023 TASK5 few-shot bioacoustic event detection dataset. The results show that our proposed method achieves an F-measure of 0.703, an improvement of 12.84%.

What problem does this paper attempt to address?

This paper addresses two main problems:

1. **Difficulty in constructing a representative negative prototype**: In Few-Shot Bioacoustic Event Detection (FSBED), negative samples are not explicitly labeled, so it is difficult to construct a representative negative prototype. This can weaken the model's ability to distinguish positive samples from negative ones.
2. **Variation in the duration of target biological sounds across tasks**: The duration of the target biological sounds varies greatly between tasks, making it difficult for the model to obtain optimal results consistently in all tasks. For example, the target sound segments may be short in some tasks and long in others.

To solve these problems, the authors propose a new adaptive learning framework with the following contributions:

- **Negative Selection Strategy**: More representative negative samples are selected from the query set, which makes the negative prototype more representative.
- **Adaptive Learning Framework**: A teacher model and a student model are combined, and the classifier parameters are adjusted dynamically to better handle tasks with different durations.
- **Adaptive Learning Loss Function**: The degree of knowledge transfer is adjusted according to the duration of the task, thereby optimizing model performance.

Together, these methods reach an F-measure of 0.703 on the DCASE 2023 TASK5 dataset, a 12.84% improvement over the baseline method.

### Formula Summary

- **Prototype Calculation Formula**:
  \[ w_k=\frac{1}{|S_k|}\sum_{x_i\in S_k}\frac{f_\varphi(x_i)}{\|f_\varphi(x_i)\|_2} \]
  where \( w_k \) is the prototype of the \( k \)-th class, \( S_k \) is the support set of that class, \( f_\varphi(x_i) \) is the embedded feature output by the feature extractor, and \( \| \cdot \|_2 \) is the L2 norm.
- **Prediction Probability Distribution Formula**:
  \[ p_{ik}=\frac{\exp(-d(w_k,z_i))}{\sum_{c = 1}^K\exp(-d(w_c,z_i))} \]
  where \( d(x,y) \) is the Euclidean distance; the negative sign gives nearer prototypes a higher probability.
- **Negative Selection Strategy Formula**:
  \[ Z'_n=\left\{z_j\mid d(z_j,w_1)>d(w_1,w_2)\ \land\ d(z_j,w_1)-d(z_j,w_2)>\frac{d(w_1,w_2)}{2},\ 0 < j\leq|Q|\right\} \]
  where \( Z'_n \) is the set of selected negative embeddings and \( |Q| \) is the size of the query set.
- **Adaptive Learning Loss Function**:
  \[ \text{Loss}=\frac{\text{seglen}}{T}\times(D_{KL}-\lambda\times I(X_{st};Y_{st})) \]
  where \( D_{KL} \) is the Kullback-Leibler divergence, \( I(X_{st};Y_{st}) \) is the mutual information, and \( \lambda \) and \( T \) are hyperparameters.

Through these methods, the paper addresses the two key problems in FSBED and improves the performance of the model.
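The four formulas in the summary above can be sketched in plain Python, as a minimal illustration rather than the authors' implementation. Several points are assumptions and not taken from the paper: all function names are hypothetical; \( w_1 \) is treated as the positive prototype and \( w_2 \) as the initial negative prototype; the softmax is taken over negated Euclidean distances, as in standard prototypical networks, so that nearer prototypes receive higher probability; `seglen` is read as the task's segment length; and the KL divergence and mutual information are passed in as precomputed numbers, because the summary does not say how they are estimated.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def euclidean(a, b):
    """Euclidean distance d(a, b) between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prototype(embeddings):
    """w_k: mean of the L2-normalised embeddings of one class's support set."""
    normed = [l2_normalize(e) for e in embeddings]
    return [sum(col) / len(normed) for col in zip(*normed)]


def class_probabilities(z, prototypes):
    """p_ik: softmax over negated distances (assumption: standard ProtoNet form)."""
    logits = [-euclidean(w, z) for w in prototypes]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_negatives(queries, w1, w2):
    """Z'_n: query embeddings that satisfy both conditions of the selection rule.

    Assumption: w1 is the positive prototype, w2 the initial negative prototype.
    """
    gap = euclidean(w1, w2)
    selected = []
    for z in queries:
        d1, d2 = euclidean(z, w1), euclidean(z, w2)
        if d1 > gap and d1 - d2 > gap / 2:
            selected.append(z)
    return selected


def adaptive_loss(kl, mutual_info, seglen, T, lam):
    """Loss = seglen / T * (D_KL - lambda * I); kl and mutual_info are precomputed."""
    return (seglen / T) * (kl - lam * mutual_info)


# Toy 2-D embeddings, invented purely for illustration.
w1 = prototype([[1.0, 0.0], [2.0, 0.0]])  # positive support embeddings
w2 = prototype([[0.0, 3.0]])              # initial negative embedding
queries = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]

print(class_probabilities(queries[0], [w1, w2]))
print(select_negatives(queries, w1, w2))
print(adaptive_loss(kl=0.5, mutual_info=0.2, seglen=10, T=20, lam=0.5))
```

In this toy setting only the query that is both far from `w1` and clearly closer to `w2` passes the selection rule. The `seglen / T` factor scales the whole loss, so tasks with longer segments contribute a larger update.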

Adaptive Learning via a Negative Selection Strategy for Few-Shot Bioacoustic Event Detection
