The impact of competing risks in kidney allograft failure prediction

Agathe Truchot, Marc Raynaud, Ilkka Helantera, Olivier Aubert, Nassim Kamar, Christophe Legendre, Alexandre Hertig, Matthias Buchler, Marta Crespo, Enver Akalin, Gervasio Soler Pujol, Maria Cristina Ribeiro de Castro, Arthur J. Matas, Camilo Ulloa, Stanley C. Jordan, Edmund Huang, Ivana Juric, Nikolina Basic-Jukic, Maarten Coemans, Maarten Naesens, John J. Friedewald, Helio Tedesco Silva Jr., Carmen Lefaucheur, Dorry L. Segev, Gary S. Collins, Alexandre Loupy
DOI: https://doi.org/10.1101/2024.05.13.24307280
2024-05-14
Abstract:
Background: Prognostic models are becoming increasingly relevant in clinical trials as potential surrogate endpoints, and for patient management as clinical decision support tools. However, the impact of competing risks on model performance remains poorly investigated. We aimed to carefully assess the performance of competing risks and non-competing risks models in the context of kidney transplantation, where allograft failure and death with a functioning graft are two competing outcomes.
Methods: We included 10 546 adult kidney transplant recipients enrolled in 10 countries (3941 patients in the derivation cohort, 6605 patients in international external validation cohorts). We developed prediction models for long-term kidney graft failure prediction, without accounting (i.e., censoring) and accounting for the competing risk of death with a functioning graft, using Cox and Fine-Gray regression models. To this aim, we followed a detailed and transparent analytical framework for competing and non-competing risks modelling, and carefully assessed the models' development, stability, discrimination, calibration, overall fit, and generalizability in external validation cohorts and subpopulations. In total, 15 metrics were used to provide an exhaustive assessment of model performance.
Results: Among the 3941 recipients included in the derivation cohort, 538 (13.65%) lost their graft and 414 (10.50%) died after a median follow-up post-risk evaluation of 5.77 years (IQR 3.52-7.00). In the external validation cohorts, 896 (13.56%) graft losses and 525 (7.95%) deaths occurred after a median follow-up post-risk evaluation of 4.25 years (IQR 2.35-6.59). At 7 years post-risk evaluation, overestimation of the cumulative incidence was moderate when using Kaplan-Meier, compared to the Aalen-Johansen estimate (16.71% versus 15.67% in the derivation cohort). Cox and Fine-Gray models for predicting the long-term graft failure exhibited similar and stable risk estimates (average MAPE of 0.0140 and 0.0138 for Cox and Fine-Gray models, respectively). At 7 years post-risk evaluation, discrimination and overall fit were good and comparable in the external validation cohorts (concordance index ranging from 0.76 to 0.86, Brier Scores ranging from 0.102 to 0.141). In a large series of subpopulations and clinical scenarios, both models performed well and similarly.
Conclusions: Competing and non-competing risks models performed similarly in predicting long-term kidney graft failure. These results should be interpreted in light of the low rate of the competing event in our cohort, and do not stand as a general conclusion for competing risks modelling. Depending on the clinical scenario and the population considered, competing risks may be crucial to consider for accurate risk predictions.
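The Kaplan-Meier versus Aalen-Johansen comparison reported in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function name `cumulative_incidence`, the event coding (0 = censored, 1 = graft failure, 2 = death with a functioning graft), and the ten-patient toy dataset are all assumptions made for illustration. It computes the graft-failure incidence once as 1 minus the Kaplan-Meier survival (deaths treated as censoring) and once with the Aalen-Johansen estimator.

```python
from collections import Counter


def cumulative_incidence(times, events, horizon):
    """Return (1 - Kaplan-Meier, Aalen-Johansen) graft-failure incidence at `horizon`.

    Assumed coding (hypothetical, for illustration only):
    0 = censored, 1 = graft failure, 2 = death with a functioning graft.
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    deaths = Counter(t for t, e in zip(times, events) if e == 2)

    km_survival = 1.0          # Kaplan-Meier survival, deaths treated as censoring
    event_free_survival = 1.0  # probability of neither event having occurred yet
    aj_incidence = 0.0         # Aalen-Johansen cumulative incidence of graft failure

    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        d_fail, d_death = failures[t], deaths[t]
        # Aalen-Johansen increment: event-free survival just before t
        # multiplied by the graft-failure hazard at t.
        aj_incidence += event_free_survival * d_fail / at_risk
        km_survival *= 1 - d_fail / at_risk
        event_free_survival *= 1 - (d_fail + d_death) / at_risk

    return 1 - km_survival, aj_incidence


# Toy data (invented): follow-up time in years and event type for ten patients.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 2, 0, 1, 2, 1, 0, 2, 1, 0]

km_risk, aj_risk = cumulative_incidence(times, events, horizon=10)
print(f"1 - Kaplan-Meier: {km_risk:.3f}, Aalen-Johansen: {aj_risk:.3f}")
```

On this toy dataset, where deaths are deliberately frequent, the Kaplan-Meier based estimate (about 0.69) is well above the Aalen-Johansen estimate (about 0.48). The gap shrinks as the competing event becomes rarer and disappears when no competing events occur, which is consistent with the moderate overestimation the abstract reports for a cohort with a low rate of death with a functioning graft.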
What problem does this paper attempt to address?
This paper examines how competing risks affect the performance of prognostic models for long-term kidney allograft failure. In kidney transplantation, death with a functioning graft competes with graft failure: a recipient who dies with a working graft can no longer experience graft failure. The study compares prediction models that ignore this competing risk (Cox regression with deaths treated as censoring, alongside the Kaplan-Meier estimator) with models that account for it (Fine-Gray regression, alongside the Aalen-Johansen estimator). Both approaches are assessed for stability, discrimination, calibration, and overall fit in a derivation cohort and in multiple international external validation cohorts, giving a comprehensive evaluation of model performance. The findings are relevant to the design of clinical decision support tools and surrogate endpoints for kidney transplant recipients.
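The difference between the two modelling approaches can be written out with the standard textbook definitions below. The notation is an assumption and is not taken from the paper: T is the time to the first event, D the event type (1 = graft failure, 2 = death with a functioning graft), x the covariate vector, and Λ₀ the cumulative baseline subdistribution hazard.

```latex
% Standard competing-risks definitions; notation assumed, not taken from the paper.
\begin{align*}
F_1(t) &= P(T \le t,\ D = 1)
  && \text{cumulative incidence of graft failure} \\
h_1(t) &= \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\ D = 1 \mid T \ge t)}{\Delta t}
  && \text{cause-specific hazard (Cox, deaths censored)} \\
\lambda_1(t) &= \lim_{\Delta t \to 0}
  \frac{P\bigl(t \le T < t + \Delta t,\ D = 1 \mid T \ge t \ \text{or}\ (T < t,\ D \ne 1)\bigr)}{\Delta t}
  && \text{subdistribution hazard (Fine-Gray)} \\
F_1(t \mid x) &= 1 - \exp\bigl\{-\Lambda_0(t)\, e^{\beta^\top x}\bigr\}
  && \text{predicted risk under the Fine-Gray model}
\end{align*}
```

Under the Fine-Gray definition, patients who have died with a functioning graft remain in the risk set, so the model targets the cumulative incidence F₁ directly. A Cox model that censors deaths instead yields 1 − exp{−H₁(t)}, with H₁ the cumulative cause-specific hazard, which is at least as large as F₁(t) and coincides with it only when the competing event does not occur.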