Abstract:

Background: Survival analysis, also known as 'time to event' analysis, is commonly used in evidence-based medicine to estimate the time until events of interest, such as mortality and disease recurrence, occur. Survival analysis with competing risks addresses the challenging setting in which multiple outcomes are possible during the follow-up of survival data and the occurrence of one event can be precluded or affected by another. Existing studies have attempted to address this issue by modeling the relationship between covariates and the distribution of first hitting times for events of interest. However, popular parameter-based methods suffer from an overlooked flaw: competing risks are confounders that can mislead the model into learning spurious correlations between covariates and events of interest, resulting in performance degradation. Therefore, there is an urgent need to adjust survival analysis models to mitigate the bias of spurious correlations and obtain more accurate estimates.

Methods: To address the problem of spurious correlations introduced by competing risks, we propose a novel paradigm: Causal Interventional Survival Analysis with competing risks. Specifically, formalizing survival analysis under the framework of structural causal models (SCMs) reveals that competing risks may introduce backdoor paths connecting covariates and events of interest, resulting in spurious correlations. Such backdoor paths can be effectively identified and removed through causal intervention, namely backdoor adjustment. In this way, only the true causal relations between covariates and events of interest are preserved and used for the probabilistic calculation, yielding accurate estimates. This solution is general and can be conveniently implemented and integrated into existing models, e.g., cs-Cox, Fine-Gray, Deep Survival Machines (DSM), DeepHit, and Dynamic-DeepHit.
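The backdoor adjustment named above can be illustrated on a toy discrete problem. The sketch below is not the authors' implementation (the abstract does not give one); it only shows the standard adjustment formula P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) P(Z=z) next to the ordinary conditional P(Y | X=x). The variable roles (Z standing in for a competing-risk confounder), the probability tables, and the function names are all hypothetical and chosen for illustration.

```python
# Toy illustration of backdoor adjustment on hand-written probability tables.
# Assumption: Z is a binary confounder (a stand-in for a competing risk),
# X is a binary covariate, and Y is the event of interest.
# All numbers are hypothetical and are not taken from the paper.

p_z = {0: 0.5, 1: 0.5}                    # P(Z=z)
p_x1_given_z = {0: 0.2, 1: 0.8}           # P(X=1 | Z=z): Z influences X
p_y1_given_xz = {                         # P(Y=1 | X=x, Z=z)
    (0, 0): 0.1, (1, 0): 0.1,             # within each stratum of Z,
    (0, 1): 0.7, (1, 1): 0.7,             # X has no effect on Y
}


def p_x_given_z(x, z):
    """P(X=x | Z=z) for binary X."""
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]


def observational(x):
    """P(Y=1 | X=x): each stratum is weighted by P(Z=z | X=x)."""
    joint = {z: p_x_given_z(x, z) * p_z[z] for z in p_z}   # P(X=x, Z=z)
    p_x = sum(joint.values())                              # P(X=x)
    return sum(p_y1_given_xz[(x, z)] * joint[z] / p_x for z in p_z)


def interventional(x):
    """P(Y=1 | do(X=x)) by backdoor adjustment: weight by P(Z=z) instead."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


if __name__ == "__main__":
    for x in (0, 1):
        print(x, round(observational(x), 4), round(interventional(x), 4))
```

In this toy setting the observational probabilities differ between X=0 and X=1 even though X has no effect on Y within any stratum of Z, while the adjusted probabilities coincide. That gap is the kind of spurious correlation the adjustment is meant to remove.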
The performance of our solution was evaluated on two inpatient datasets, MIMIC-IV and eICU, and an outpatient dataset, SEER, by performing five-fold cross-validation on each dataset. Specifically, the extracted MIMIC-IV dataset included 9,357 patients with 64 covariates and 11 competing risks; the eICU dataset included 15,731 patients with 46 covariates and 11 competing risks; the SEER dataset included 122,815 patients with 19 covariates and 3 competing risks. Each individual in the datasets had at least two competing risks. The clinical significance of the proposed solution was assessed in terms of Concordance Index (C-index), Net Reclassification Index (NRI), calibration, and Decision Curve Analysis (DCA) of distinct competing events for all three datasets. In addition, model-agnostic Kernel SHAP (Shapley Additive Explanations) values were calculated for each covariate in the presence of each competing risk, to assess whether the causal intervention could reduce spurious correlations between covariates and events of interest.

Findings: Overall, the five survival analysis models equipped with causal interventions were well calibrated and achieved significant performance gains as measured by C-index (average performance gain of 4.66% to 11.85% for MIMIC-IV; 15.85% to 19.94% for eICU; 1.28% to 1.98% for SEER) and NRI (average improvement of 0.104 to 0.210 for MIMIC-IV; 0.068 to 0.431 for eICU; 0.014 to 0.026 for SEER) in all three datasets. The results of calibration and DCA also demonstrated the effectiveness of the proposed solution. Using Fine-Gray with and without causal intervention, the SHAP values obtained showed that causal intervention helps mitigate spurious correlations and reveal actual correlations between covariates and events of interest.

Interpretation: We developed a debiasing solution for survival analysis with competing risks.
Our solution learned a debiased model with causal intervention, conducting backdoor adjustment to remove spurious correlations introduced by risk confounders. Experimental results showed that the causal approach outperformed previous models, improving performance and reducing bias. The findings suggest that the debiasing solution has the potential to alleviate problems of existing models by removing or weakening the influence of covariates that are positively correlated with events of interest but have no causal relation to them, and by identifying covariates that are negatively correlated with events of interest but have true causal relations to them.

Funding: We acknowledge support from the National Natural Science Foundation of China (61672450). GPUs were provided by the Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare.

Declaration of Interest: We declare no competing interests.
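For readers unfamiliar with the C-index reported in the Findings, the following is a minimal sketch of Harrell's concordance index for a single event type with right censoring. It is an illustration only: the abstract does not state which C-index variant the authors computed for each competing event. The function name, the tie handling, and the suggestion of treating competing events as censored are assumptions made here.

```python
# Minimal sketch of Harrell's concordance index (C-index) for one event type.
# Assumption: events[i] is 1 if the event of interest was observed for subject i
# and 0 if the subject was censored. One common convention, assumed here and not
# confirmed by the source, is to mark competing events as censored as well.

def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject with the earlier
    observed event also has the higher predicted risk.

    A pair (i, j) is comparable when subject i had the event and
    times[i] < times[j]. Tied risk scores count as half concordant.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


if __name__ == "__main__":
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    print(concordance_index(times, events, [0.9, 0.7, 0.3, 0.1]))
    print(concordance_index(times, events, [0.1, 0.3, 0.7, 0.9]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.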

Correct deconfounding enables causal machine learning for precision medicine and beyond

Causal Inference and Counterfactual Prediction in Machine Learning for Actionable Healthcare

Confounder control in biomedicine necessitates conceptual considerations beyond statistical evaluations

Explainable biology for improved therapies in precision medicine: AI is not enough

Causal Discovery Analysis: A Promising Tool for Precision Medicine

Improving the accuracy of medical diagnosis with causal machine learning

Learning Personalized Treatment Decisions in Precision Medicine: Disentangling Treatment Assignment Bias in Counterfactual Outcome Prediction and Biomarker Identification

Causal inference for multiple risk factors and diseases from genomics data

Causal machine learning for predicting treatment outcomes

Novel multi-omics deconfounding variational autoencoders can obtain meaningful disease subtyping

Causal machine learning for healthcare and precision medicine

Big Data, Data Science, and Causal Inference: A Primer for Clinicians

Causal Debiasing for Unknown Bias in Histopathology - A Colon Cancer Use Case

Improving generalization of machine learning-identified biomarkers using causal modelling with examples from immune receptor diagnostics

When no answer is better than a wrong answer: a causal perspective on batch effects

Causal inference with multiple versions of treatment and application to personalized medicine

Exploration, inference and prediction in neuroscience and biomedicine

How Much Time to Survive under Competing Risks: A Causal Debiasing Paradigm

Machine Learning in Modeling Disease Trajectory and Treatment Outcomes: An Emerging Enabler for Model‐Informed Precision Medicine