A Survey of Incremental Transfer Learning: Combining Peer-to-Peer Federated Learning and Domain Incremental Learning for Multicenter Collaboration

Yixing Huang, Christoph Bert, Ahmed Gomaa, Rainer Fietkau, Andreas Maier, Florian Putz
DOI: https://doi.org/10.48550/arXiv.2309.17192
2023-09-29
Abstract: Due to data privacy constraints, data sharing among multiple clinical centers is restricted, which impedes the development of high performance deep learning models from multicenter collaboration. Naive weight transfer methods share intermediate model weights without raw data and hence can bypass data privacy restrictions. However, performance drops are typically observed when the model is transferred from one center to the next because of the forgetting problem. Incremental transfer learning, which combines peer-to-peer federated learning and domain incremental learning, can overcome the data privacy issue and meanwhile preserve model performance by using continual learning techniques. In this work, a conventional domain/task incremental learning framework is adapted for incremental transfer learning. A comprehensive survey on the efficacy of different regularization-based continual learning methods for multicenter collaboration is performed. The influences of data heterogeneity, classifier head setting, network optimizer, model initialization, center order, and weight transfer type have been investigated thoroughly. Our framework is publicly accessible to the research community for further development.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the data-sharing dilemma that data privacy limitations create in multicenter collaborations, which hinders the development of high-performance deep learning models. Specifically, it explores how multiple centers can cooperate efficiently and maintain model performance without violating data privacy by combining Peer-to-Peer Federated Learning (P2PFL) and Domain Incremental Learning (DIL).

### Problem Background

1. **Data Privacy Limitations**
   - Because of data privacy and governance regulations (such as the EU Medical Device Regulation and the EU General Data Protection Regulation), data sharing among multiple clinical centers is strictly restricted.
   - These restrictions impede multicenter cooperation in developing high-performance deep learning models.
2. **Traditional Weight Transfer Methods**
   - Naive weight transfer methods pass intermediate model weights between centers without sharing the original data, thus bypassing data privacy limitations.
   - However, when the model is transferred from one center to the next, performance usually degrades because of the "catastrophic forgetting" problem.
3. **Incremental Transfer Learning (ITL)**
   - Incremental Transfer Learning combines Peer-to-Peer Federated Learning and Domain Incremental Learning, aiming to overcome data privacy issues and maintain model performance through continual learning techniques.

### Main Contributions of the Paper

1. **Comprehensive Survey**
   - The paper surveys the efficacy of different regularization-based continual learning methods in multicenter cooperation, complementing existing surveys of task-incremental and class-incremental learning.
2. **Analysis of Influencing Factors**
   - The influences of data heterogeneity, classifier head setting, network optimizer, model initialization, center order, and weight transfer type are studied thoroughly.
3. **Framework Establishment**
   - A framework combining P2PFL and Domain Incremental Learning is established and made publicly available to the research community to promote further development.

### Solutions

1. **Peer-to-Peer Federated Learning (P2PFL)**
   - Compared with Center-to-Peer Federated Learning (C2PFL), P2PFL is more feasible for multicenter cooperation in the medical field because it does not require a central server and each center has sufficient computing resources.
2. **Domain Incremental Learning (DIL)**
   - Continual learning techniques, especially regularization methods such as Learning without Forgetting (LwF), Elastic Weight Consolidation (EWC), and Synaptic Intelligence (SI), mitigate the forgetting that occurs when the model is trained on new datasets or tasks.
3. **Framework Components**
   - **Single-Head Setup**: Avoids the mismatch between the feature extractor and the classifier and reduces catastrophic forgetting.
   - **Reload Optimizer**: Reloads the optimizer from the previous center to achieve a smooth learning rate transition.
   - **Adaptive Optimizer**: Uses an adaptive optimizer (such as Adam) to automatically adjust the learning rate of each model parameter.
   - **Overfitting Monitoring**: Monitors overfitting. If performance does not improve within a certain period, the learning rate is decayed and training is stopped early.
   - **Cyclic Weight Transfer (CWT)**: Allows the model to revisit the data of each center, which improves performance.

### Experimental Verification

The paper conducted experiments on the Tiny ImageNet dataset to verify the effects of different regularization-based continual learning methods. The results indicate that combining P2PFL and DIL performs well in multicenter cooperation: it overcomes data privacy limitations while maintaining model performance.

### Conclusion

By combining P2PFL and DIL, the paper proposes a solution to the data-sharing dilemma caused by data privacy limitations in multicenter cooperation, maintaining model performance through continual learning techniques. The framework provides a reference and a tool for future multicenter cooperation.
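
The forgetting effect and the regularization-based remedy summarized above can be illustrated with a deliberately tiny example. The sketch below is not the paper's framework: it assumes a one-parameter linear model, three hypothetical centers with hand-picked data, and an EWC-style quadratic penalty with an accumulated Fisher term. All names (`Center`, `train_at_center`, `incremental_transfer`, `lam`) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Center:
    """A hypothetical center whose (x, y) pairs never leave the site."""
    xs: list
    ys: list

    def loss(self, w):
        # Mean squared error of the scalar model y = w * x on local data.
        return sum((w * x - y) ** 2 for x, y in zip(self.xs, self.ys)) / len(self.xs)

    def grad(self, w):
        # Derivative of the local mean squared error with respect to w.
        return sum(2 * x * (w * x - y) for x, y in zip(self.xs, self.ys)) / len(self.xs)

    def fisher(self):
        # Fisher information of w under a unit-variance Gaussian likelihood: mean of x^2.
        return sum(x * x for x in self.xs) / len(self.xs)


def train_at_center(center, w, anchor, importance, lam, lr=0.05, steps=500):
    """Gradient descent on: local loss + (lam / 2) * importance * (w - anchor)^2."""
    for _ in range(steps):
        g = center.grad(w) + lam * importance * (w - anchor)
        w -= lr * g
    return w


def incremental_transfer(centers, lam):
    """Pass only the weight and its importance from center to center.

    lam = 0 disables the penalty, which corresponds to naive weight transfer.
    """
    w, importance = 0.0, 0.0
    for c in centers:
        w = train_at_center(c, w, anchor=w, importance=importance, lam=lam)
        importance += c.fisher()  # accumulate importance over visited centers
    return w


def mean_loss(centers, w):
    # Average of the local losses, used here only to evaluate the final weight.
    return sum(c.loss(w) for c in centers) / len(centers)


# Heterogeneous toy data: each center prefers a different weight.
centers = [
    Center(xs=[1.0, 2.0], ys=[1.0, 2.0]),  # local optimum w = 1
    Center(xs=[1.0, 2.0], ys=[2.0, 4.0]),  # local optimum w = 2
    Center(xs=[1.0, 2.0], ys=[3.0, 6.0]),  # local optimum w = 3
]

w_naive = incremental_transfer(centers, lam=0.0)
w_reg = incremental_transfer(centers, lam=2.0)

print(f"naive:       w = {w_naive:.3f}, mean loss = {mean_loss(centers, w_naive):.3f}")
print(f"regularized: w = {w_reg:.3f}, mean loss = {mean_loss(centers, w_reg):.3f}")
```

In this toy setting the naive pass ends at the last center's optimum (w ≈ 3), while the penalized pass ends near w ≈ 2, the value that minimizes the mean loss over all three centers. That exact match is a property of this quadratic toy problem with `lam=2.0` and should not be expected of deep networks.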