Parameterizing Federated Continual Learning for Reproducible Research

Bart Cox, Jeroen Galjaard, Aditya Shankar, Jérémie Decouchant, Lydia Y. Chen
2024-06-04
Abstract: Federated Learning (FL) systems evolve in heterogeneous and ever-evolving environments that challenge their performance. Under real deployments, the learning tasks of clients can also evolve with time, which calls for the integration of methodologies such as Continual Learning. To enable research reproducibility, we propose a set of experimental best practices that precisely capture and emulate complex learning scenarios. Our framework, Freddie, is the first entirely configurable framework for Federated Continual Learning (FCL), and it can be seamlessly deployed on a large number of machines thanks to the use of Kubernetes and containerization. We demonstrate the effectiveness of Freddie on two use cases, (i) large-scale FL on CIFAR100 and (ii) heterogeneous task sequence on FCL, which highlight unaddressed performance challenges in FCL scenarios.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
This paper addresses the reproducibility and performance challenges of Federated Continual Learning (FCL) in practical deployments. Specifically, the authors identify the following problems:

1. **Environmental Heterogeneity**: Federated learning systems usually operate in heterogeneous and constantly changing environments, which degrades system performance.
2. **Task Evolution**: In practical deployments, the learning tasks of clients change over time, which calls for continual learning methods.
3. **Catastrophic Forgetting**: A key challenge in continual learning is catastrophic forgetting, in which the model loses performance on previously learned tasks while learning new ones.
4. **Experimental Reproducibility**: Existing FCL research is difficult to reproduce because experiments are usually run in strictly controlled, static environments, whereas real-world environments are dynamic and heterogeneous.
5. **Limitations of Existing Frameworks**: Existing federated learning and continual learning frameworks are difficult to scale while supporting heterogeneity of data, tasks, and hardware platforms, and managing large-scale experiments with them is costly.

To address these problems, the authors propose a set of experimental best practices and develop a framework named Freddie. Freddie is presented as the first fully configurable FCL framework, and it can be deployed on a large number of machines through Kubernetes and containerization. It emulates complex learning scenarios and supports resource and data heterogeneity, thereby improving the reproducibility and efficiency of FCL research.

### Main Contributions

- **Identifying Key Requirements**: The paper defines the key requirements for FL and FCL simulation: ease of use, reproducibility, complex workload support, and resource heterogeneity.
- **Developing the Freddie Framework**: Freddie is an open-source framework that supports both small-scale and large-scale simulations and can be deployed with Kubernetes on self-managed and cloud systems.
- **Benchmarking Method**: The paper provides a workload generation method for data and task heterogeneity in FCL, which makes it possible to explore FCL systems under realistic conditions.

Through these contributions, Freddie aims to give FCL researchers a flexible tool for understanding and addressing the performance problems that FCL systems face.
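The idea of a reproducible, parameterized workload with task and resource heterogeneity can be illustrated with a minimal sketch. This is not Freddie's API: the names `ClientWorkload`, `generate_workloads`, and the `cpu_share` knob are hypothetical, introduced only to show how a single seed could pin down each client's task sequence and emulated resource level so that a run can be regenerated exactly.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientWorkload:
    """Hypothetical per-client experiment description (not Freddie's schema)."""
    client_id: int
    task_order: tuple   # order in which this client encounters the tasks
    cpu_share: float    # emulated relative compute capacity (assumed knob)


def generate_workloads(num_clients, tasks, seed, cpu_choices=(0.25, 0.5, 1.0)):
    """Derive every client's task sequence and resource share from one seed."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    workloads = []
    for client_id in range(num_clients):
        order = list(tasks)
        rng.shuffle(order)  # task heterogeneity: each client gets its own order
        cpu_share = rng.choice(cpu_choices)  # resource heterogeneity
        workloads.append(ClientWorkload(client_id, tuple(order), cpu_share))
    return workloads


if __name__ == "__main__":
    for workload in generate_workloads(3, ["task_a", "task_b", "task_c"], seed=42):
        print(workload)
```

Because all randomness flows through one seeded generator, calling `generate_workloads` twice with the same arguments yields identical workloads, which is the property a shared experiment configuration would need in order to be replayed by other researchers.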