Abstract: In recent years, the use of deep learning for the analysis of survival data has attracted many researchers. This has led to numerous network architectures for the prediction of possibly censored time-to-event variables. Unlike networks for cross-sectional data (used, e.g., in classification), deep survival networks require the specification of a suitably defined loss function that incorporates typical characteristics of survival data such as censoring and time-dependent features. Here, we provide an in-depth analysis of the cross-entropy loss function, which is a popular loss function for training deep survival networks. For each time point $t$, the cross-entropy loss is defined in terms of a binary outcome with levels "event at or before $t$" and "event after $t$". Using both theoretical and empirical approaches, we show that this definition may result in a high prediction error and a heavy bias in the predicted survival probabilities. To overcome this problem, we analyze an alternative loss function that is derived from the negative log-likelihood function of a discrete time-to-event model. We show that replacing the cross-entropy loss by the negative log-likelihood loss results in much better calibrated prediction rules and also in an improved discriminatory power, as measured by the concordance index.
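
To make the contrast between the two losses concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names, the 0-based interval indexing, the convention that a censored individual is treated as having survived its censoring interval, and the choice to drop time points after censoring from the cross-entropy are all assumptions made for illustration. Predicted hazards are assumed to lie strictly between 0 and 1.

```python
import math


def discrete_nll(hazards, time_idx, event):
    """Mean negative log-likelihood of a discrete time-to-event model.

    hazards[i][k]: predicted hazard P(T = k | T >= k), strictly in (0, 1)
    time_idx[i]:   0-based index of the observed (event or censoring) interval
    event[i]:      1 if the event was observed, 0 if censored
    """
    total = 0.0
    for h, t, d in zip(hazards, time_idx, event):
        # Every interval before the observed one was survived.
        ll = sum(math.log(1.0 - h[k]) for k in range(t))
        # Observed interval: event occurred, or (assumed convention)
        # the censored individual survived this interval as well.
        ll += math.log(h[t]) if d == 1 else math.log(1.0 - h[t])
        total -= ll
    return total / len(hazards)


def pointwise_cross_entropy(hazards, time_idx, event):
    """Mean binary cross-entropy over time points.

    At each time point k the binary label is 1 for "event at or before k"
    and 0 for "event after k". For censored individuals, time points after
    the censoring interval are skipped (an assumption of this sketch).
    """
    total, count = 0.0, 0
    for h, t, d in zip(hazards, time_idx, event):
        surv = 1.0
        for k in range(len(h)):
            surv *= 1.0 - h[k]  # S(k) = P(T > k)
            if d == 0 and k > t:
                break  # status unknown after censoring
            label = 1 if (d == 1 and k >= t) else 0
            # Predicted probability of "event at or before k" is 1 - S(k).
            total -= math.log(1.0 - surv) if label == 1 else math.log(surv)
            count += 1
    return total / count


# Hypothetical toy data: two individuals, three intervals, constant hazard.
toy_hazards = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
toy_time_idx = [1, 1]  # both observed in the second interval
toy_event = [1, 0]     # first has an event, second is censored

print(discrete_nll(toy_hazards, toy_time_idx, toy_event))
print(pointwise_cross_entropy(toy_hazards, toy_time_idx, toy_event))
```

In this sketch the negative log-likelihood scores each individual only on the intervals in which it was at risk, whereas the cross-entropy keeps scoring an individual at every time point after its event.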

Bias in Cross-Entropy-Based Training of Deep Survival Networks

Deep Semisupervised Multitask Learning Model and Its Interpretability for Survival Analysis

Learning to rank for censored survival data

Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression

A Ranking-Based Cross-Entropy Loss for Early Classification of Time Series

The Concordance Index decomposition: A measure for a deeper understanding of survival prediction models

Cross-Entropy Loss Functions: Theoretical Analysis and Applications

Deep Survival Analysis with Latent Clustering and Contrastive Learning

Deep Recurrent Survival Analysis

Copula-Based Deep Survival Models for Dependent Censoring

Deep Copula-Based Survival Analysis for Dependent Censoring with Identifiability Guarantees

Variational Deep Survival Machines: Survival Regression with Censored Outcomes

A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data

An Introduction to Deep Survival Analysis Models for Predicting Time-to-Event Outcomes

SurvReLU: Inherently Interpretable Survival Analysis via Deep ReLU Networks

Cross Entropy in Deep Learning of Classifiers Is Unnecessary—ISBE Error Is All You Need

Neural Fine-Gray: Monotonic neural networks for competing risks

FastSurvival: Hidden Computational Blessings in Training Cox Proportional Hazards Models