Comparing ECG-Lead Subsets for Heart Arrhythmias/ECG Patterns Classification: Convolutional Neural Networks and Random Forest

Serhii Reznichenko, John Whitaker, Zixuan Ni, Shijie Zhou
DOI: https://doi.org/10.1016/j.cjco.2024.10.012
2024-01-01
CJC Open
Abstract:
Background: Despite the growing popularity of deep learning (DL), limited research has compared the performance of DL with that of conventional machine learning (CML) methods in heart arrhythmia/ECG pattern classification. Additionally, accurate classification of heart arrhythmias/ECG patterns often depends on specific ECG leads, and it remains unknown how DL and CML methods perform on reduced subsets of ECG leads. This study assessed the accuracy of convolutional neural network (CNN) and random forest (RF) models, representing DL and CML methods respectively, for classifying arrhythmias/ECG patterns using reduced ECG-lead subsets.
Methods: We used a public dataset from the PhysioNet Cardiology Challenge 2020. For the DL method, we trained a CNN classifier that extracted features from each ECG lead; these features were then passed to a feedforward neural network. For the CML method, we employed a random forest classifier with manually extracted features. Optimal ECG-lead subsets were identified using recursive feature elimination for both methods.
Results: The CML method required 19% more leads (equating to approximately 2 leads) than the DL method. Four common leads (I, II, V5, V6) were identified in each of the ECG-lead subsets using the CML method, whereas no leads were consistently present for the DL method. The average macro F1 score was 0.761 for the DL method and 0.759 for the CML method.
Conclusions: Optimal ECG-lead subsets provide classification accuracy comparable to that of all 12 leads across DL and CML methods. The DL method achieved slightly higher classification accuracy on larger datasets and required fewer ECG leads than the CML method.
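The two quantities the abstract relies on, the macro F1 score and a reduced lead subset, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-label predictions (the challenge data may be multi-label), and it replaces recursive feature elimination with a simpler greedy backward elimination at the lead level. The names `macro_f1`, `backward_lead_elimination`, `toy_score`, and the `tolerance` parameter are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either sequence."""
    classes = sorted(set(y_true) | set(y_pred))
    if not classes:
        return 0.0
    f1_scores = []
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def backward_lead_elimination(
    leads: Sequence[str],
    score_fn: Callable[[List[str]], float],
    tolerance: float = 0.01,
) -> List[str]:
    """Greedily drop one lead at a time while the score stays within
    `tolerance` of the all-lead baseline (assumed stopping rule)."""
    current = list(leads)
    baseline = score_fn(current)
    while len(current) > 1:
        # Candidate subsets: the current subset with exactly one lead removed.
        candidates = [[l for l in current if l != drop] for drop in current]
        best = max(candidates, key=score_fn)
        if score_fn(best) < baseline - tolerance:
            break  # every single removal costs too much; stop here
        current = best
    return current


ALL_LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
             "V1", "V2", "V3", "V4", "V5", "V6"]


def toy_score(subset: List[str]) -> float:
    """Hypothetical stand-in for 'train a model on these leads and return
    its macro F1'; the values are made up for the demonstration."""
    return 0.76 if {"I", "II", "V5", "V6"} <= set(subset) else 0.50


selected = backward_lead_elimination(ALL_LEADS, toy_score)
```

In practice `score_fn` would train and validate a classifier on the given leads, which makes each elimination round expensive; the toy scorer only shows the control flow.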