Pre-processing techniques to enhance the classification of lung sounds based on deep learning

Alessandra Fava, Behnood Dianat, Alessandro Bertacchini, Andreina Manfredi, Marco Sebastiani, Marco Modena, Fabrizio Pancaldi
DOI: https://doi.org/10.1016/j.bspc.2024.106009
IF: 5.1
2024-02-09
Biomedical Signal Processing and Control
Abstract: Deep learning has recently shown great potential in the classification of lung sounds. Most studies rely on publicly available data sets that are usually well cleaned and annotated by expert physicians. Annotation is subjective by definition and, above all, large public data sets are not collected within the scope of a specific clinical investigation. Other works rely on private, purpose-collected data sets that may or may not stem from clinical studies. The main issue in these cases is the reliability and noisiness of the auscultations. This paper examines the significant impact of quantitative, systematic and reproducible cleaning of lung sound data sets. By "cleaning a data set" we mean discarding the records that carry mostly noise and interfering signals, since machine learning can be significantly impaired by outliers. The developed pre-processing techniques are tested on several data sets of lung sounds. We designed a deep neural network (DNN) for the diagnosis of interstitial lung diseases (ILD) in patients affected by connective tissue diseases (CTD). On the clean data set, the devised DNN achieves an accuracy, F1-score, and F2-score of 97% with respect to high-resolution computed tomography. Considering that the screening of ILD in patients affected by chronic autoimmune diseases is still an open issue, the proposed pipeline represents an enabling technology for the early, safe, reliable and cheap diagnosis of CTD-ILD.
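The abstract reports accuracy, F1-score, and F2-score without defining them. As a reading aid, the sketch below computes these metrics from confusion-matrix counts using the standard F-beta definition. The function names and the example counts are hypothetical illustrations and are not taken from the paper; the F2-score weights recall more heavily than precision, which is a common choice for screening tasks where a missed case is costlier than a false alarm.

```python
def fbeta_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta from confusion counts; beta > 1 weights recall above precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid division by zero.
    return (1 + b2) * tp / denom if denom else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of records classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    tp, tn, fp, fn = 45, 50, 2, 3
    print("accuracy:", accuracy(tp, tn, fp, fn))
    print("F1:", fbeta_score(tp, fp, fn, beta=1.0))
    print("F2:", fbeta_score(tp, fp, fn, beta=2.0))
```

With `beta=1.0` the function reduces to the harmonic mean of precision and recall; with `beta=2.0` a false negative costs four times as much as a false positive in the denominator.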
engineering, biomedical