Performance of using Mel-Frequency Cepstrum Based Features in Nonlinear Classifiers for Phonocardiography Recordings

Ibrahim Ozkan, Atila Yilmaz
DOI: https://doi.org/10.23919/EUSIPCO58844.2023.10289832
2024-02-20
Abstract: Cardiovascular system diseases can be identified through a specialized diagnostic process that uses a digital stethoscope. Besides filtering and amplifying heart sounds, digital stethoscopes provide phonocardiography (PCG) recordings for further inspection. In this paper, a framework for developing feature extraction and classification of PCG recordings is presented. This framework is built upon a previously proposed segmentation algorithm that processes a feature vector produced by the combined application of the Mel-frequency cepstrum and the discrete wavelet transform (DWT). The performance of the segmentation algorithm is also tested on a new data set and compared to the previously reported results. After identifying the fundamental heart sounds and segmenting the PCG recordings, five principal features are extracted from the time-domain signal and the Mel-frequency cepstral coefficients (MFCC) of each cardiac cycle. Classification outcomes are reported for three nonlinear models, namely k-nearest neighbor (k-NN), support vector machine (SVM), and multilayer perceptron (MLP) classifiers, in comparison with a linear approach, namely the Mahalanobis distance linear classifier. The results underline that although neural networks and the linear classifier show comparable performance in basic classification problems, their performance varies significantly as the nonlinearity of the classification problem increases.
Signal Processing
What problem does this paper attempt to address?
This paper examines how Mel-frequency cepstral coefficient (MFCC) features perform in nonlinear classifiers for heart sound recordings. Its main objective is a framework for feature extraction and classification of phonocardiography (PCG) signals. The framework builds on a previously proposed segmentation algorithm that combines the Mel-frequency cepstrum with the discrete wavelet transform (DWT). The paper tests that algorithm on a new dataset and compares the results with those reported earlier.

The paper begins with the diagnosis of cardiovascular diseases using digital stethoscopes, whose PCG recordings support further analysis. The authors then identify the fundamental heart sounds, segment the recordings, and extract five principal features from the time-domain signal and the MFCCs of each cardiac cycle. Classification uses three nonlinear models (k-nearest neighbor, support vector machine, and multilayer perceptron) alongside a linear Mahalanobis distance classifier.

In the experimental section, the paper uses a new dataset containing normal and abnormal heart sounds to evaluate the segmentation algorithm and compares the results with those from the previous dataset. In the first-level classification, all classifiers perform similarly, but in the second-level classification (distinguishing types of abnormality), the neural network outperforms the linear classifier. Adding MFCC features also improves classification performance.

In conclusion, the paper addresses how to use MFCC and DWT features effectively to improve the segmentation and classification of heart sound recordings, particularly when distinguishing different types of cardiac abnormality, and it demonstrates the advantage of nonlinear models in complex classification problems.
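
The contrast between a linear Mahalanobis distance classifier and a nonlinear model can be illustrated with a small sketch. It is not the authors' implementation and uses no PCG data: it assumes made-up 2-D points, a pooled covariance matrix for the Mahalanobis classifier (which makes its decision boundary linear), and a plain k-NN majority vote as the nonlinear stand-in. All function and variable names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def cluster(cx, cy, r=0.5):
    """Four synthetic 2-D points on a square around (cx, cy)."""
    return [(cx + dx, cy + dy) for dx in (-r, r) for dy in (-r, r)]


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def mahalanobis_classifier(classes):
    """Nearest class mean under a pooled 2x2 covariance (linear boundary)."""
    means = {label: mean(pts) for label, pts in classes.items()}
    sxx = sxy = syy = 0.0
    for label, pts in classes.items():
        mx, my = means[label]
        for x, y in pts:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = sum(len(pts) for pts in classes.values()) - len(classes)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    # Inverse of the symmetric 2x2 covariance matrix.
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det

    def predict(p):
        def d2(label):
            dx, dy = p[0] - means[label][0], p[1] - means[label][1]
            return ixx * dx * dx + 2 * ixy * dx * dy + iyy * dy * dy
        return min(means, key=d2)

    return predict


def knn_classifier(classes, k=3):
    """Majority vote among the k nearest training points (nonlinear boundary)."""
    train = [(p, label) for label, pts in classes.items() for p in pts]

    def predict(p):
        nearest = sorted(train, key=lambda t: dist(p, t[0]))[:k]
        votes = [label for _, label in nearest]
        return max(set(votes), key=votes.count)

    return predict


def accuracy(predict, samples):
    return sum(predict(p) == label for p, label in samples) / len(samples)


# "Easy" problem: one well-separated cluster per class.
easy = {"A": cluster(0, 0), "B": cluster(4, 4)}
easy_test = [((0.2, -0.1), "A"), ((3.9, 4.2), "B")]

# "Hard" problem: XOR-like layout, so both class means coincide at (2, 2).
hard = {"A": cluster(0, 0) + cluster(4, 4), "B": cluster(0, 4) + cluster(4, 0)}
hard_test = [((0.1, 0.2), "A"), ((4.1, 3.8), "A"),
             ((0.2, 3.9), "B"), ((3.8, 0.1), "B")]

for name, train, test in (("easy", easy, easy_test), ("hard", hard, hard_test)):
    print(name,
          accuracy(mahalanobis_classifier(train), test),
          accuracy(knn_classifier(train), test))
# easy 1.0 1.0
# hard 0.5 1.0
```

On the well-separated clusters both classifiers label every test point correctly. On the XOR-like layout the two class means coincide, so the Mahalanobis classifier drops to chance level while k-NN still follows the local structure of the data.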