Patch-level Contrastive Embedding Learning for Respiratory Sound Classification

Wenjie Song, Jiqing Han
DOI: https://doi.org/10.1016/j.bspc.2022.104338
IF: 5.1
2023-01-01
Biomedical Signal Processing and Control
Abstract: Nowadays, due to the difficulty of data acquisition and expensive manual annotation, respiratory sound classification suffers from limited training samples, which restrains the performance improvement of existing methods. To learn more information from the limited samples, we previously proposed a method of contrastive embedding learning to incorporate additional out-of-class information into the model. However, since the method mapped each entire sample to a deep embedding vector and modelled the distribution of the embeddings, it hardly learned the detailed information within the samples. In fact, a sample is a finite combination of various components, and the classification task essentially is to detect the presence of components that contain adventitious sounds, where detailed component-wise information is crucial. To this end, a method of patch-level contrastive embedding learning based on finer-grained patches is further proposed in this paper. It divides each sample into multiple patches and maps the patches to the embedding space. The patches are split into different subclasses, according to the type of adventitious sounds contained in each patch. Considering that there might be no patch-level labels provided in most cases, a Multi-Instance Learning (MIL) based approach is designed to estimate the labels. Then, by modelling intra- and inter-subclass distances between the patch-level embeddings, the method learns the detailed information about the difference between patches, which benefits the identification task. The results following random and official splitting on the ICBHI dataset show that our method achieves the performance of 79.99% and 52.95%, exceeding the previous one by 1.81% and 1.58%, respectively.
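The pipeline outlined in the abstract (split a sample into patches, estimate patch labels with a MIL-style rule, then pull same-subclass embeddings together and push different-subclass embeddings apart) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the thresholding rule for pseudo-labels, the margin-based pairwise loss, and the convention that label 0 means "no adventitious sound" are all assumptions chosen for illustration, and a real system would use a trained network to produce the patch scores and embeddings.

```python
import math
from itertools import combinations


def split_into_patches(frames, patch_len, hop):
    """Cut a sequence of frames into fixed-length, possibly overlapping patches."""
    return [frames[i:i + patch_len]
            for i in range(0, len(frames) - patch_len + 1, hop)]


def estimate_patch_labels(patch_scores, sample_label, threshold=0.5):
    """Assumed MIL-style pseudo-labelling (hypothetical rule, not from the paper).

    A sample labelled 0 (no adventitious sound) yields only 0-labelled patches.
    Otherwise, patches scoring at or above the threshold inherit the sample
    label, and the top-scoring patch always inherits it, so that at least one
    patch per positive sample is positive.
    """
    if sample_label == 0:
        return [0] * len(patch_scores)
    labels = [sample_label if s >= threshold else 0 for s in patch_scores]
    top = max(range(len(patch_scores)), key=patch_scores.__getitem__)
    labels[top] = sample_label
    return labels


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Classic margin-based contrastive loss over all patch pairs (an assumption).

    Same-subclass pairs are penalised by their squared distance;
    different-subclass pairs are penalised only when closer than the margin.
    """
    total, count = 0.0, 0
    for (ea, la), (eb, lb) in combinations(zip(embeddings, labels), 2):
        dist = math.dist(ea, eb)
        if la == lb:
            total += dist ** 2
        else:
            total += max(0.0, margin - dist) ** 2
        count += 1
    return total / count if count else 0.0


# Toy usage: 10 frames, patches of 4 frames with a hop of 2.
patches = split_into_patches(list(range(10)), patch_len=4, hop=2)
pseudo = estimate_patch_labels([0.1, 0.9, 0.3, 0.2], sample_label=2)
loss = pairwise_contrastive_loss(
    [(0.0, 0.0), (0.0, 0.5), (0.0, 0.1), (0.1, 0.0)], pseudo)
```

Reusing the sample label for every positive patch is a simplification; a sample containing more than one type of adventitious sound would need a per-type score for each patch.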