Application of federated learning techniques for arrhythmia classification using 12-lead ECG signals

Daniel Mauricio Jimenez Gutierrez, Hafiz Muuhammad Hassan, Lorella Landi, Andrea Vitaletti, Ioannis Chatzigiannakis
2024-01-06
Abstract: Artificial Intelligence-based (AI) analysis of large, curated medical datasets is promising for providing early detection, faster diagnosis, and more effective treatment using information from low-power Electrocardiography (ECG) monitoring devices. However, access to sensitive medical data from diverse sources is highly restricted, since improper use, unsafe storage, or data leakage could violate a person's privacy. This work uses a privacy-preserving Federated Learning (FL) methodology to train AI models over high-definition ECG recordings from 12-lead sensor arrays collected from six heterogeneous sources. We evaluated whether the resulting models achieve performance equivalent to state-of-the-art models trained in a Centralized Learning (CL) fashion. Moreover, we assessed the performance of our solution over Independent and Identically Distributed (IID) and non-IID federated data. Our methodology involves machine learning techniques based on Deep Neural Networks and Long Short-Term Memory models. It has a robust data preprocessing pipeline with feature engineering, selection, and data balancing techniques. Our AI models demonstrated comparable performance to models trained using CL, IID, and non-IID approaches. They showcased advantages in reduced complexity and faster training time, making them well-suited for cloud-edge architectures.
Machine Learning
What problem does this paper attempt to address?
The paper addresses arrhythmia classification from 12-lead electrocardiogram (ECG) signals using Federated Learning (FL) while protecting patient privacy. The researchers train artificial intelligence models with federated learning so that arrhythmias can be detected and classified without sharing sensitive medical data. They also evaluate the proposed method under two data distribution scenarios, Independent and Identically Distributed (IID) and Non-Independent and Identically Distributed (Non-IID) data, and compare it with Centralized Learning (CL).

The main contributions of the paper include:

1. **Privacy Protection**: With federated learning, model training is conducted without transmitting raw data to a central server, which keeps the medical data at its source.
2. **Model Performance**: The federated learning model is evaluated on the arrhythmia classification task to check whether it can match centralized learning methods.
3. **Feature Extraction and Selection**: A multi-step data preprocessing method is proposed, comprising feature extraction, normalization, selection, and data balancing, to improve the model's generalization ability and training efficiency.
4. **Dataset Application**: The PhysioNet 2020 Challenge dataset is used. It includes high-resolution 12-lead ECG recordings from six different geographic centers and covers various types of cardiac abnormalities, which helps train a model capable of recognizing a wide range of arrhythmias.

In summary, the study explores how to detect and classify arrhythmias effectively with federated learning while preserving data privacy, offering new ideas and technical means for the early diagnosis of cardiovascular diseases.
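
To make the idea of training without transmitting raw data concrete, here is a minimal sketch of one federated round in plain Python. It assumes FedAvg-style aggregation (a sample-size-weighted average of client parameters) and a toy linear model trained with squared error. The summary above does not state which aggregation rule or optimizer the authors use, and the paper's models are deep networks and LSTMs, so the function names `local_update`, `federated_average`, and `run_round` and the toy data are hypothetical illustrations only.

```python
from typing import List, Sequence, Tuple

# One training sample: (feature vector, target value).
Sample = Tuple[Sequence[float], float]


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.05) -> List[float]:
    """One gradient-descent step of a linear model on a client's own data.

    The raw samples never leave this function; only the updated weights do.
    """
    n = len(data)
    grad = [0.0] * len(weights)
    for x, y in data:
        error = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * error * xj / n  # gradient of mean squared error
    return [w - lr * g for w, g in zip(weights, grad)]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client weight vectors, weighting each by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same model shape")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total
        for i, w in enumerate(weights):
            aggregated[i] += share * w
    return aggregated


def run_round(global_weights: Sequence[float],
              clients: Sequence[Sequence[Sample]],
              lr: float = 0.05) -> List[float]:
    """One round: broadcast weights, train locally, aggregate on the server."""
    updates = [local_update(global_weights, data, lr) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Two hypothetical clients holding samples of the relation y = 2 * x.
    clients = [
        [([1.0], 2.0), ([2.0], 4.0)],
        [([3.0], 6.0)],
    ]
    weights = [0.0]
    for _ in range(50):
        weights = run_round(weights, clients)
    print(weights)
```

In this sketch the server only ever sees weight vectors and sample counts. A Non-IID scenario corresponds to the per-client datasets following different distributions, which here would mean passing clients whose samples differ systematically.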