Federated Learning and Differential Privacy Techniques on Multi-hospital Population-scale Electrocardiogram Data

Vikhyat Agrawal, Sunil Vasu Kalmady, Venkataseetharam Manoj Malipeddi, Manisimha Varma Manthena, Weijie Sun, Saiful Islam, Abram Hindle, Padma Kaul, Russell Greiner
2024-05-15
Abstract: This paper explores how to apply Federated Learning (FL) and Differential Privacy (DP) techniques to population-scale Electrocardiogram (ECG) data. The study learns a multi-label ECG classification model using FL and DP, based on 1,565,849 ECG tracings from 7 hospitals in Alberta, Canada. The FL approach allows collaborative model training without sharing raw data between hospitals, while still building robust ECG classification models for diagnosing various cardiac conditions. Such models can facilitate diagnosis while preserving patient confidentiality through FL and DP. Our results show that the performance achieved with our implementation of the FL approach is comparable to that of the pooled approach, where the model is trained on data aggregated from all hospitals. Furthermore, our findings suggest that hospitals with limited ECGs for training can benefit from adopting the FL model rather than training on their own site alone. In addition, this study showcases the trade-off between model performance and data privacy by employing DP during model training. Our code is available at
Signal Processing, Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
The paper addresses the problem of applying Federated Learning (FL) and Differential Privacy (DP) techniques to large-scale electrocardiogram (ECG) data in a multi-hospital setting. Specifically, the research aims to:

1. **Develop accurate ECG classification models**: Use 1,565,849 ECG records from 7 hospitals in Alberta, Canada, to train a multi-label classification model capable of diagnosing various common cardiovascular and metabolic diseases.
2. **Protect patient privacy**: Train collaboratively across hospitals through federated learning, without sharing raw data, and further strengthen data security with differential privacy to keep patient information confidential.
3. **Improve performance for hospitals with limited data**: Within the federated learning framework, hospitals with few ECGs benefit from what the model learns at other hospitals, which improves their diagnostic accuracy compared with single-site training.
4. **Explore the trade-off between model performance and data privacy**: Apply differential privacy during model training and show how the added privacy protection affects model performance.
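
To make the two ideas concrete, here is a minimal Python sketch of one federated aggregation step and one differentially private update step. It is illustrative only and is not the authors' implementation: the text above does not say which aggregation rule or DP mechanism the paper uses, so the sample-size-weighted averaging (FedAvg-style) and the clip-then-add-Gaussian-noise step are assumptions, as are the function names `federated_average` and `clip_and_noise`, the parameters `clip_norm` and `noise_multiplier`, and all numeric values.

```python
import math
import random


def federated_average(site_weights, site_sizes):
    """Average per-site parameter vectors, weighted by each site's sample count.

    Assumed aggregation rule (FedAvg-style); only parameters leave a site,
    never the raw ECG data.
    """
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def clip_and_noise(update, clip_norm, noise_multiplier, rng):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise.

    Assumed DP mechanism: the noise standard deviation is
    noise_multiplier * clip_norm. A formal privacy guarantee would also
    need privacy accounting, which this sketch omits.
    """
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, sigma) for x in update]


# Three hypothetical hospitals holding different amounts of data.
site_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
site_sizes = [100, 300, 600]
global_weights = federated_average(site_weights, site_sizes)

# A hypothetical local update, clipped and perturbed before it is shared.
rng = random.Random(0)
private_update = clip_and_noise(
    [3.0, 4.0], clip_norm=1.0, noise_multiplier=1.0, rng=rng
)
```

In this sketch a larger `noise_multiplier` gives stronger privacy protection but a noisier update, which is the performance versus privacy trade-off described in item 4. The weighting by `site_sizes` also shows how a hospital with few ECGs can receive a model shaped mostly by data held elsewhere.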