Calibration plots for multistate risk predictions models

Alexander Pate, Matthew Sperrin, Richard D. Riley, Niels Peek, Tjeerd Van Staa, Jamie C. Sergeant, Mamas A. Mamas, Gregory Y. H. Lip, Martin O'Flaherty, Michael Barrowman, Iain Buchan, Glen P. Martin
DOI: https://doi.org/10.1002/sim.10094
2024-05-09
Statistics in Medicine
Abstract: Introduction: There is currently no guidance on how to assess the calibration of multistate models used for risk prediction. We introduce several techniques that can be used to produce calibration plots for the transition probabilities of a multistate model, and then assess their performance in the presence of random and independent censoring through a simulation. Methods: We studied pseudo‐values based on the Aalen‐Johansen estimator, binary logistic regression with inverse probability of censoring weights (BLR‐IPCW), and multinomial logistic regression with inverse probability of censoring weights (MLR‐IPCW). The MLR‐IPCW approach results in a calibration scatter plot, providing extra insight about the calibration. We simulated data with varying levels of censoring and evaluated the ability of each method to estimate the calibration curve for a set of predicted transition probabilities. We also developed and evaluated the calibration of a model predicting the incidence of cardiovascular disease, type 2 diabetes and chronic kidney disease among a cohort of patients derived from linked primary and secondary healthcare records. Results: The pseudo‐value, BLR‐IPCW, and MLR‐IPCW approaches give unbiased estimates of the calibration curves under random censoring. These methods remained predominantly unbiased in the presence of independent censoring, even if the censoring mechanism was strongly associated with the outcome, with bias concentrated in low‐density regions of predicted transition probability. Conclusions: We recommend implementing either the pseudo‐value or BLR‐IPCW approaches to produce a calibration curve, combined with the MLR‐IPCW approach to produce a calibration scatter plot. The methods have been incorporated into the "calibmsm" R package available on CRAN.
public, environmental & occupational health; medicine, research & experimental; medical informatics; mathematical & computational biology; statistics & probability