Abstract: Deep learning techniques have shown promising results in the automatic classification of respiratory sounds. However, accurately distinguishing these sounds in real-world noisy conditions poses challenges for clinical deployment. Additionally, predicting signals with only background noise could undermine user trust in the system. This paper aims to investigate the feasibility and effectiveness of incorporating a deep learning-based audio enhancement preprocessing step into automatic respiratory sound classification systems to improve robustness and clinical applicability. Multiple experiments were conducted using different audio enhancement model structures and classification models. The classification performance was compared to the baseline method of noise injection data augmentation. Experiments were performed on two datasets: the ICBHI respiratory sound dataset, which includes 5.5 hours of recordings, and the Formosa Archive of Breath Sounds (FABS) dataset, comprising 14.6 hours of recordings. Additionally, a physician validation study was conducted by 7 senior physicians to assess the clinical utility of the system. The integration of the audio enhancement pipeline resulted in a 21.88% increase in the ICBHI classification score on the ICBHI dataset and a 4.10% improvement on the FABS dataset in multi-class noisy scenarios. Quantitative analysis from the physician validation study revealed improvements in efficiency, diagnostic confidence, and trust during model-assisted diagnosis, with workflows integrating enhanced audio leading to an 11.61% increase in diagnostic sensitivity and facilitating high-confidence diagnoses.
Incorporating an audio enhancement algorithm significantly enhances the robustness and clinical utility of automatic respiratory sound classification systems, improving performance in noisy environments and fostering greater trust among medical professionals.
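
To make the baseline concrete, the following is a minimal sketch of noise injection data augmentation: a noise recording is scaled and added to a clean signal so that the mixture reaches a chosen signal-to-noise ratio (SNR). The abstract does not give the noise types or SNR range used, so the function names (`mix_at_snr`, `measured_snr_db`), the 5 dB target, and the synthetic tone and Gaussian noise are all assumptions for illustration, not the authors' configuration.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; the paper's own augmentation
    settings (noise types, SNR range) are not given in the abstract.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(p_clean / (gain^2 * p_noise))  =>  solve for gain.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of a mixture, treating (noisy - clean) as the noise component."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


rng = random.Random(0)
# Synthetic stand-ins (assumed): a 200 Hz tone sampled at 4 kHz, Gaussian noise.
clean = [math.sin(2 * math.pi * 200 * t / 4000) for t in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

In an augmentation setting, a classifier would be trained on mixtures like `noisy` at several SNR levels; the enhancement approach studied here instead denoises the input before classification.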

Optimizing medical personnel speech recognition models using speech synthesis and reinforcement learning

The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models

Improving Dysarthric Speech Segmentation With Emulated and Synthetic Augmentation

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation

High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR

Toward a Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency

Using RLHF to align speech enhancement approaches to mean-opinion quality scores

Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions

Performant ASR Models for Medical Entities in Accented Speech

Speech recognition for medical conversations

Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility

Exploiting Pre-Trained ASR Models for Alzheimer's Disease Recognition Through Spontaneous Speech

Sequence-to-Sequence ASR Optimization via Reinforcement Learning

Enhancing AAC Software for Dysarthric Speakers in e-Health Settings: An Evaluation Using TORGO

MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues

Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model

Assessing the Effectiveness of Automatic Speech Recognition Technology in Emergency Medicine Settings: A Comparative Study of Four AI-powered Engines

Improving Medical Speech-to-Text Accuracy using Vision-Language Pre-training Models

On-Device Personalization of Automatic Speech Recognition Models for Disordered Speech

Improving Robustness and Clinical Applicability of Automatic Respiratory Sound Classification Using Deep Learning-Based Audio Enhancement: Algorithm Development and Validation Study