Federated Learning for Clinical Structured Data: A Benchmark Comparison of Engineering and Statistical Approaches

Siqi Li, Di Miao, Qiming Wu, Chuan Hong, Danny D'Agostino, Xin Li, Yilin Ning, Yuqing Shang, Huazhu Fu, Marcus Eng Hock Ong, Hamed Haddadi, Nan Liu
2023-11-06
Abstract: Federated learning (FL) has shown promising potential in safeguarding data privacy in healthcare collaborations. While the term "FL" was originally coined by the engineering community, the statistical field has also explored similar privacy-preserving algorithms. Statistical FL algorithms, however, remain considerably less recognized than their engineering counterparts. Our goal was to bridge the gap by presenting the first comprehensive comparison of FL frameworks from both engineering and statistical domains. We evaluated five FL frameworks using both simulated and real-world data. The results indicate that statistical FL algorithms yield less biased point estimates for model coefficients and offer convenient confidence interval estimations. In contrast, engineering-based methods tend to generate more accurate predictions, sometimes surpassing central pooled and statistical FL models. This study underscores the relative strengths and weaknesses of both types of methods, emphasizing the need for increased awareness and their integration in future FL applications.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The main problem this paper addresses is to **compare and evaluate federated learning (FL) frameworks from the engineering and statistical communities on clinical structured data**. Specifically, the research aims to fill two gaps:

1. **Comparison between engineering and statistical methods**: Although the concept of federated learning was first proposed by the engineering community and has performed well in prediction tasks, the statistical field has developed similar privacy-preserving algorithms that have distinct advantages in point estimation and confidence interval estimation. These statistical methods have received little attention in medical applications. By comparing both types of FL framework, the paper aims to provide a selection guide for future research.
2. **Importance of non-prediction tasks**: Beyond prediction, the medical field also needs accurate estimates of the association between risk factors and clinical outcomes (i.e., point estimation), which is crucial for designing interventions and resource allocation strategies. Most existing research focuses on prediction performance and overlooks non-prediction tasks. The paper emphasizes the differences between these two types of task and how they affect the choice of FL framework.

### Research Objectives

- **Evaluate five FL frameworks**: GLORE, FedAvg, FedAvgM, q-FedAvg, and FedProx.
- **Use simulated and real-world data**: Run controlled experiments on simulated data to assess the bias of model parameter estimates and the accuracy of confidence intervals; use real-world electronic health record (EHR) data to evaluate prediction performance.
- **Provide practical suggestions**: Based on the experimental results, give future researchers guidance on selecting FL frameworks suited to different types of clinical task.

### Main Findings

- **Advantages of statistical methods**: For point estimation and confidence interval estimation, statistical methods (such as GLORE) show smaller bias in coefficient estimates and provide confidence intervals directly.
- **Advantages of engineering methods**: In prediction tasks, engineering methods (such as q-FedAvg) can sometimes achieve higher prediction accuracy than centralized (pooled) models.
- **Communication cost**: GLORE is more communication-efficient and usually needs fewer communication rounds to converge.

### Conclusion

Through systematic benchmarking, the paper identifies the respective strengths and weaknesses of engineering and statistical FL frameworks and provides a reference for future clinical research. For research involving non-prediction tasks, it recommends giving priority to statistical methods; for prediction tasks, engineering methods can be considered for better predictive performance. The study also raises the possibility of integrating the two approaches to further improve FL applications in the future.
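To make the contrast between the two families concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's code or benchmark. The "GLORE-style" routine assumes a logistic regression with an intercept and one covariate, fitted by Newton-Raphson on gradients and Hessians summed across sites. Because those sums are additive, it reproduces the pooled estimate and yields standard errors from the inverse information matrix. The "FedAvg-style" routine assumes a few local gradient steps per site followed by a sample-size-weighted average of coefficients. The toy data, function names, learning rate, and round counts are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_summaries(data, beta):
    """Log-likelihood gradient and information matrix at one site.
    Model (assumed): logit P(y=1) = beta[0] + beta[1] * x."""
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in data:
        row = (1.0, x)
        p = sigmoid(beta[0] + beta[1] * x)
        for i in range(2):
            g[i] += (y - p) * row[i]
            for j in range(2):
                h[i][j] += p * (1.0 - p) * row[i] * row[j]
    return g, h

def glore_style_fit(sites, rounds=25):
    """Newton-Raphson on summaries aggregated across sites (GLORE-style sketch).
    Only gradients and Hessians leave each site, never patient-level rows."""
    beta = [0.0, 0.0]
    for _ in range(rounds):
        g = [0.0, 0.0]
        h = [[0.0, 0.0], [0.0, 0.0]]
        for data in sites:
            gs, hs = local_summaries(data, beta)
            for i in range(2):
                g[i] += gs[i]
                for j in range(2):
                    h[i][j] += hs[i][j]
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # Closed-form 2x2 solve of H * step = g.
        step = [(h[1][1] * g[0] - h[0][1] * g[1]) / det,
                (h[0][0] * g[1] - h[1][0] * g[0]) / det]
        beta = [beta[0] + step[0], beta[1] + step[1]]
    # Standard errors from the inverse information matrix; h was evaluated
    # just before the final (negligible) update.
    se = [math.sqrt(h[1][1] / det), math.sqrt(h[0][0] / det)]
    return beta, se

def fedavg_style_fit(sites, rounds=100, local_steps=5, lr=0.5):
    """Local gradient ascent at each site, then weighted coefficient averaging."""
    beta = [0.0, 0.0]
    total = sum(len(d) for d in sites)
    for _round in range(rounds):
        averaged = [0.0, 0.0]
        for data in sites:
            local = list(beta)
            for _step in range(local_steps):
                g, _h = local_summaries(data, local)
                local = [local[i] + lr * g[i] / len(data) for i in range(2)]
            for i in range(2):
                averaged[i] += len(data) / total * local[i]
        beta = averaged
    return beta

# Hypothetical toy data: (covariate, binary outcome) pairs at two sites.
site_a = [(0.5, 1), (-1.0, 0), (1.5, 1), (0.2, 0), (-0.3, 1)]
site_b = [(2.0, 1), (-2.0, 0), (0.1, 1), (-0.5, 0), (0.8, 0), (1.1, 1)]

beta_fed, se_fed = glore_style_fit([site_a, site_b])
beta_pooled, se_pooled = glore_style_fit([site_a + site_b])
beta_avg = fedavg_style_fit([site_a, site_b])

# 95% Wald confidence interval for the slope from the federated fit.
ci_slope = (beta_fed[1] - 1.96 * se_fed[1], beta_fed[1] + 1.96 * se_fed[1])
print("federated:", beta_fed, "pooled:", beta_pooled, "fedavg:", beta_avg)
```

In this sketch the aggregated Newton fit matches the pooled fit up to floating-point error. The FedAvg-style coefficients generally differ from the pooled estimate when sites are heterogeneous, and they come with no standard errors. This mirrors, in a very simplified form, the trade-off the benchmark reports.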