Comparison of Privacy-Preserving Distributed Deep Learning Methods in Healthcare

Manish Gawali, Arvind C S, Shriya Suryavanshi, Harshit Madaan, Ashrika Gaikwad, Bhanu Prakash KN, Viraj Kulkarni, Aniruddha Pant
DOI: https://doi.org/10.48550/arXiv.2012.12591
2020-12-23
Abstract: In this paper, we compare three privacy-preserving distributed learning techniques: federated learning, split learning, and SplitFed. We use these techniques to develop binary classification models for detecting tuberculosis from chest X-rays and compare them in terms of classification performance, communication and computational costs, and training time. We propose a novel distributed learning architecture called SplitFedv3, which performs better than split learning and SplitFedv2 in our experiments. We also propose alternate mini-batch training, a new training technique for split learning, that performs better than alternate client training, where clients take turns to train a model.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the challenges of applying privacy-preserving distributed deep learning methods in healthcare. Specifically, it compares three privacy-preserving distributed learning techniques, Federated Learning (FL), Split Learning (SL), and SplitFed Learning, and evaluates their classification performance, communication and computational costs, and training time when detecting tuberculosis (TB) in chest X-rays.

### Summary of Main Problems:

1. **Data Privacy and Compliance**:
   - Medical data are usually distributed across different medical institutions, are strictly protected by laws and regulations (such as GDPR and HIPAA), and cannot be freely shared. A key issue is therefore how to use these data for model training without violating privacy regulations.
2. **Effectiveness of Distributed Learning**:
   - By comparing FL, SL, and SplitFed, the paper explores their effectiveness and feasibility in practical applications. In particular, it proposes a new distributed learning architecture, SplitFedv3, and introduces a new training technique, alternate mini-batch training, to improve model performance.
3. **Performance Evaluation**:
   - The paper evaluates not only classification performance (AUROC, AUPRC, F1-score, and Kappa coefficient) but also practical factors such as training time, data communication volume, and computational cost, in order to weigh the advantages and disadvantages of each method.

### Specific Research Contents:

- **Dataset**: Chest X-ray image datasets from five different sources, including three private datasets and two public datasets (MIMIC and PadChest).
- **Model Architecture**: The experiments use two neural network architectures, DenseNet-121 and U-Net, tested at two different resolutions.
- **Experimental Setup**: The paper describes in detail the configuration and training process of each distributed learning method, including the federated averaging algorithm in federated learning, the different configurations of split learning (label-sharing and non-label-sharing), and the different versions of SplitFed (SFLv2 and SFLv3).

Through these studies, the paper provides a reference for distributed deep learning in the medical field and points out the key factors that need to be considered in practical applications.
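
The federated averaging step mentioned under the experimental setup can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each client's model is a plain dictionary of flat parameter lists and that the server weights each client by its number of training samples, which is the usual formulation of federated averaging. The names `federated_average`, `client_weights`, and `client_sizes` are hypothetical.

```python
from typing import Dict, List

# Assumption: a model is represented as {parameter name: flat list of floats}.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Return the sample-size-weighted average of the clients' parameters."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        # Average each parameter position across clients, weighting every
        # client by the share of the training data it holds.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical hospitals: the second holds three times as much data,
# so its parameters dominate the average.
hospital_a = {"layer.weight": [1.0, 2.0]}
hospital_b = {"layer.weight": [3.0, 6.0]}
global_model = federated_average([hospital_a, hospital_b], [1, 3])
```

In a real round the server would send `global_model` back to every client before the next round of local training; only model parameters, never the X-ray images themselves, leave each institution.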