Intelligent Stethoscope Using Full Self-Attention Mechanism for Abnormal Respiratory Sound Recognition

Changyi Wu, Dongmin Huang, Xiaoting Tao, Kun Qiao, Hongzhou Lu, Wenjin Wang
DOI: https://doi.org/10.1109/bhi58575.2023.10313454
2023-01-01
Abstract: Machine learning can automate the recognition of abnormal respiratory sounds and pulmonary diseases for wireless stethoscopes. However, most learning-based methods perform unevenly, with low sensitivity (SEN) but high specificity (SPE). Recently, the Transformer, which is based on a full self-attention mechanism, has made significant progress in various medical tasks, but its role in respiratory sound recognition remains unknown. Self-attention can extract contextual information from segments of arbitrary length in a signal, especially when long-range dependencies are present. This makes it particularly suitable for mining the patterns of temporally continuous pathological respiratory sounds, including stridor, wheezes, and rhonchi. In this paper, we therefore explore the feasibility of using the full self-attention mechanism of the Audio Spectrogram Transformer (AST) to improve respiratory sound recognition, benchmarking FNN, CNN, and AST on the ICBHI 2017 dataset. In our proposed framework, input samples are generated by a new respiratory cycle-based segmentation that preserves the consistency of the input representation, and a dual-input AST model is designed to enhance robustness to disturbances by extracting the complementary information between spectrograms and log Mel spectrograms. Extensive experiments show that AST outperforms the other methods in respiratory sound recognition. Moreover, the proposed respiratory cycle-based segmentation improves SEN by almost 10%.
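The abstract reports results in terms of sensitivity (SEN) and specificity (SPE). As a minimal sketch of how these two metrics relate, the snippet below computes them for a simplified binary case (1 = abnormal, 0 = normal). The function name `sensitivity_specificity` and the toy labels are illustrative assumptions; this is not necessarily the multi-class scoring used in the paper or in the ICBHI 2017 challenge.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (SEN, SPE) for binary labels: 1 = abnormal, 0 = normal.

    Simplified binary formulation, assumed here for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # abnormal, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # abnormal, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, accepted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, flagged

    # SEN = TP / (TP + FN): fraction of abnormal cycles that are caught.
    sen = tp / (tp + fn) if (tp + fn) else float("nan")
    # SPE = TN / (TN + FP): fraction of normal cycles that are left alone.
    spe = tn / (tn + fp) if (tn + fp) else float("nan")
    return sen, spe


# Toy labels (made up): a classifier that misses half of the abnormal
# cycles but rarely flags normal ones shows low SEN alongside high SPE.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
sen, spe = sensitivity_specificity(y_true, y_pred)
print(f"SEN={sen:.3f}, SPE={spe:.3f}")
```

In this toy case half of the abnormal cycles are missed while only one normal cycle is flagged, which is the kind of imbalance between SEN and SPE that the abstract describes.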