Padé approximant meets federated learning: A nearly lossless, one-shot algorithm for evidence synthesis in distributed research networks with rare outcomes

Qiong Wu, Martijn J. Schuemie, Marc A. Suchard, Patrick Ryan, George M. Hripcsak, Charles A. Rohde, Yong Chen
DOI: https://doi.org/10.1016/j.jbi.2023.104476
IF: 8
2023-08-01
Journal of Biomedical Informatics
Abstract: OBJECTIVE: We developed and evaluated a novel one-shot distributed algorithm for evidence synthesis in distributed research networks with rare outcomes.
MATERIALS AND METHODS: Fed-Padé, motivated by a classic mathematical tool, the Padé approximant, reconstructs the multi-site data likelihood via a Padé approximant whose key parameters can be computed distributively. Thanks to the simplicity of the [2,2] Padé approximant, Fed-Padé asks data partners to perform only a very simple task at low communication cost. Specifically, each data partner only needs to compute and share the log-likelihood and its first 4 gradients evaluated at an initial estimator. We evaluated the performance of our algorithm with extensive simulation studies and four observational healthcare databases.
RESULTS: Our simulation studies revealed that a [2,2] Padé approximant reconstructs the multi-site likelihood well, so that Fed-Padé produces estimates nearly identical to those of the pooled analysis. Across all simulation scenarios considered, the median relative bias and the rate of instability of Fed-Padé are both <0.1%, whereas meta-analysis estimates have bias up to 50% and instability up to 75%. Furthermore, the confidence intervals derived from the Fed-Padé algorithm showed better coverage of the truth than confidence intervals based on the meta-analysis. In real data analysis, Fed-Padé has a relative bias of <1% for all three comparisons for risks of acute liver injury and decreased libido, whereas the meta-analysis estimates have a substantially higher bias (around 10%).
CONCLUSION: The Fed-Padé algorithm is nearly lossless, stable, communication-efficient, and easy to implement for models with rare outcomes. It provides a suitable and convenient approach for synthesizing evidence in distributed research networks with rare outcomes.
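The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar parameter, assumes that each site's contribution is the value of its log-likelihood and its derivatives of orders 1 to 4 at a shared initial estimate, and assumes the [2,2] approximant is fitted directly to the summed function. The names `aggregate_derivatives`, `taylor_coefficients`, `pade22`, and `evaluate_pade` are hypothetical.

```python
import math


def aggregate_derivatives(site_derivatives):
    """Sum per-site vectors [f, f', f'', f''', f''''] elementwise.

    The derivative of a sum of site log-likelihoods is the sum of the
    site derivatives, so one round of communication is enough.
    """
    return [sum(values) for values in zip(*site_derivatives)]


def taylor_coefficients(derivatives):
    """Convert derivatives of orders 0..4 into Taylor coefficients c_k = f^(k) / k!."""
    return [d / math.factorial(k) for k, d in enumerate(derivatives)]


def pade22(c):
    """Return (a, b) with (a0 + a1 t + a2 t^2) / (1 + b1 t + b2 t^2) matching
    the Taylor series c0 + c1 t + ... + c4 t^4 through order 4."""
    c0, c1, c2, c3, c4 = c
    # Denominator coefficients solve:
    #   c2*b1 + c1*b2 = -c3
    #   c3*b1 + c2*b2 = -c4
    det = c2 * c2 - c1 * c3
    if abs(det) < 1e-14:
        raise ValueError("degenerate system: [2,2] Pade approximant not defined")
    b1 = (c1 * c4 - c2 * c3) / det
    b2 = (c3 * c3 - c2 * c4) / det
    # Numerator coefficients follow by matching orders 0, 1, and 2.
    a0 = c0
    a1 = c1 + c0 * b1
    a2 = c2 + c1 * b1 + c0 * b2
    return [a0, a1, a2], [1.0, b1, b2]


def evaluate_pade(a, b, t):
    """Evaluate the approximant at offset t from the expansion point."""
    numerator = a[0] + a[1] * t + a[2] * t * t
    denominator = b[0] + b[1] * t + b[2] * t * t
    return numerator / denominator
```

In this sketch each site would send only five numbers, and the coordinating site would call `aggregate_derivatives`, then `taylor_coefficients`, then `pade22`, and work with the resulting rational function as a surrogate for the pooled log-likelihood near the initial estimate. A multi-parameter model would need a multivariate construction that this example does not cover.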
medical informatics; computer science, interdisciplinary applications