Learning competing risks across multiple hospitals: one-shot distributed algorithms

Dazheng Zhang, Jiayi Tong, Naimin Jing, Yuchen Yang, Chongliang Luo, Yiwen Lu, Dimitri A Christakis, Diana Güthe, Mady Hornig, Kelly J Kelleher, Keith E Morse, Colin M Rogerson, Jasmin Divers, Raymond J Carroll, Christopher B Forrest, Yong Chen
DOI: https://doi.org/10.1093/jamia/ocae027
2024-04-19
Abstract: Objectives: To characterize the complex interplay between multiple clinical conditions in a time-to-event analysis framework using data from multiple hospitals, we developed two novel one-shot distributed algorithms for competing risk models (ODACoR). By applying our algorithms to EHR data from eight national children's hospitals, we quantified the impacts of a wide range of risk factors on the risk of post-acute sequelae of SARS-CoV-2 (PASC) among children and adolescents. Materials and methods: Our ODACoR algorithms are designed to be simple and communication-efficient, which makes them practical to execute across sites. We evaluated our algorithms via extensive simulation studies as well as an application quantifying the impacts of risk factors for PASC among children and adolescents, using data from eight children's hospitals, including the Children's Hospital of Philadelphia, Cincinnati Children's Hospital Medical Center, and Children's Hospital of Colorado, covering over 6.5 million pediatric patients. The accuracy of the estimation was assessed by comparing the results from our ODACoR algorithms with the estimators derived from the meta-analysis and the pooled data. Results: The meta-analysis estimator showed a high relative bias (∼40%) when the clinical condition was relatively rare (∼0.5%), whereas the ODACoR algorithms exhibited a substantially lower relative bias (∼0.2%). The estimated effects from our ODACoR algorithms were on par with the estimates from the pooled data, suggesting the high reliability of our federated learning algorithms. In contrast, the meta-analysis estimate failed to identify risk factors such as age, gender, history of chronic conditions, and obesity that were identified using the pooled data. Discussion: Our proposed ODACoR algorithms are communication-efficient, highly accurate, and suitable for characterizing the complex interplay between multiple clinical conditions.
Conclusion: Our study demonstrates that our ODACoR algorithms are communication-efficient and widely applicable for analyzing multiple clinical conditions in a time-to-event analysis framework.