Abstract: In the analysis of survival data, it is usually assumed that any unit will experience the event of interest if it is observed for a sufficiently long time. However, one can explicitly assume that an unknown proportion of the population under study will never experience the monitored event. The promotion time model, which has a biological motivation, is one of the survival models taking this feature into account. The promotion time model assumes that the failure time of each subject is generated by the minimum of N latent event times which are independent with a common distribution independent of N. We propose an extension which allows the covariates to influence simultaneously the probability of being cured and the latent distribution. We estimate the latent distribution using a flexible Cox proportional hazard model where the logarithm of the baseline hazard function is specified using Bayesian P-splines. Introducing covariates in the latent distribution implies that the population hazard function might not have a proportional hazard structure. However, the use of the P-splines provides a smooth estimation of the population hazard ratio over time. We propose a restricted use of the model when the follow-up of the study is not sufficiently long. A simulation study evaluating the accuracy of our methodology is presented. The proposed model is illustrated on data from the phase III Melanoma e1684 clinical trial.
What problem does this paper attempt to address?
The paper addresses how to estimate the latent distribution more flexibly in survival models with a cure fraction, that is, models in which some individuals never experience the event of interest. Specifically, it uses Bayesian P-splines to estimate the latent distribution in the promotion time model, and it allows covariates to affect the cure probability and the latent distribution simultaneously. The approach yields a smooth estimate of the population hazard ratio over time. The paper also examines the identification problem that arises when the follow-up of the study is too short, and proposes a restricted use of the model for that case.
### Background and Problem Description of the Paper
In survival data analysis, it is usually assumed that every individual will eventually experience the event of interest if observed long enough. This assumption is not always realistic. In cancer clinical trials, for example, an unknown and unidentifiable part of the population may be cured and will never experience events such as recurrence or death. Models that account for this are called cure survival models. Among them, the promotion time model is biologically motivated: it assumes that each individual's failure time is the minimum of \(N\) latent event times, which are independent, share a common distribution, and do not depend on the number of latent events \(N\).
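The latent-event mechanism described above can be simulated directly. The sketch below is a minimal illustration, not the paper's code. It assumes that \(N\) follows a Poisson distribution with mean \(\theta\) (the usual choice for promotion time models, though not stated in this summary) and uses a hypothetical unit-rate exponential latent distribution. The helper names and all numeric values are made up for the example. Under these assumptions, a subject with \(N = 0\) is cured, so the cured fraction should be close to \(\exp(-\theta)\).

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_failure_time(rng, theta, rate=1.0):
    """Minimum of N latent exponential times; math.inf when N == 0 (cured subject)."""
    n_latent = sample_poisson(rng, theta)  # assumption: N ~ Poisson(theta)
    if n_latent == 0:
        return math.inf
    return min(rng.expovariate(rate) for _ in range(n_latent))


rng = random.Random(0)
theta = 1.2  # hypothetical value
times = [sample_failure_time(rng, theta) for _ in range(20000)]

# Empirical cured fraction versus the value implied by the Poisson assumption.
cured_fraction = sum(math.isinf(t) for t in times) / len(times)
print("empirical cured fraction:", round(cured_fraction, 3))
print("exp(-theta):             ", round(math.exp(-theta), 3))
```

Comparing the empirical survival at a finite time with \(\exp[-\theta F(t)]\), where \(F\) is the latent distribution function, gives a similar check on the whole curve rather than only its plateau.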
### Main Contributions
1. **Flexible Baseline Distribution Estimation**: The paper proposes using Bayesian P-splines to estimate the latent distribution, which allows covariates to affect both the cure probability and the latent distribution simultaneously. The use of P-splines provides a smooth estimate of the population hazard ratio over time, even when the population hazard does not have a proportional hazard structure.
2. **Exploration of Identification Problems**: The paper discusses the identification problem of the model when the follow-up time of the study is not sufficient to observe all potential failure events. A method for restricted use of the model in this situation is proposed.
3. **Simulation Studies and Practical Applications**: The accuracy of the proposed method is evaluated through a simulation study, and the method is illustrated on data from the phase III melanoma clinical trial e1684.
### Formulas and Model Details
- **Overall Survival Function**:
\[
S_p(t|x,z)=\exp \left[-\theta(x) F(t|z)\right]
\]
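where \(\theta(x)\) controls the cure probability and \(F(t|z)\) is the cumulative distribution function of the latent event times. Since \(F(t|z) \to 1\) as \(t \to \infty\), the cure probability is \(\exp[-\theta(x)]\).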
- **Hazard Function**:
\[
h_p(t|x,z)=\theta(x) f(t|z)
\]
where \(f(t|z)\) is the density function of the latent distribution.
- **Hazard Ratio**:
\[
\text{HR}_p(t) = \frac{h_p(t|x_1,z_1)}{h_p(t|x_2,z_2)}=\exp \left[(x_1^T - x_2^T)\beta\right]\exp \left[(z_1^T - z_2^T)\gamma\right]S_0(t)^{\exp(z_1^T\gamma)-\exp(z_2^T\gamma)}
\]
- **Estimation of Baseline Survival Function**:
\[
S_0(t)=\exp \left(-\int_0^t\exp \left(\sum_{k = 1}^K b_k(u)\phi_k\right)du\right)
\]
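where \(b_k(u)\) is the \(k\)-th cubic B-spline basis function and \(\phi_k\) is the corresponding spline coefficient, so that the sum inside the inner exponential is the logarithm of the baseline hazard.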
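To see how these formulas fit together, the sketch below evaluates them numerically in plain Python. It is an illustration under stated assumptions, not the paper's implementation: it takes \(\theta(x) = \exp(\beta_0 + x\beta)\) and a Cox-type latent distribution \(F(t|z) = 1 - S_0(t)^{\exp(z\gamma)}\) with scalar covariates, places the cubic B-splines on equidistant knots, and integrates the baseline hazard with a simple midpoint rule. All function names, knot settings, and parameter values are hypothetical, and no penalty or prior on the coefficients \(\phi_k\) is modelled.

```python
import math


def bspline_basis(k, degree, knots, u):
    """Value at u of the k-th B-spline of the given degree (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[k] <= u < knots[k + 1] else 0.0
    value = 0.0
    span_left = knots[k + degree] - knots[k]
    if span_left > 0:
        value += (u - knots[k]) / span_left * bspline_basis(k, degree - 1, knots, u)
    span_right = knots[k + degree + 1] - knots[k + 1]
    if span_right > 0:
        value += ((knots[k + degree + 1] - u) / span_right
                  * bspline_basis(k + 1, degree - 1, knots, u))
    return value


def make_knots(t_max, n_intervals, degree=3):
    """Equidistant knots on [0, t_max], extended by `degree` steps on each side."""
    step = t_max / n_intervals
    return [step * (i - degree) for i in range(n_intervals + 2 * degree + 1)]


def baseline_hazard(u, phi, knots, degree=3):
    """h_0(u) = exp(sum_k b_k(u) * phi_k)."""
    log_h0 = sum(bspline_basis(k, degree, knots, u) * phi[k] for k in range(len(phi)))
    return math.exp(log_h0)


def baseline_survival(t, phi, knots, n_steps=200):
    """S_0(t) = exp(-integral of h_0 over [0, t]), using the midpoint rule."""
    width = t / n_steps
    cumulative = width * sum(
        baseline_hazard((i + 0.5) * width, phi, knots) for i in range(n_steps)
    )
    return math.exp(-cumulative)


def population_hazard(t, x, z, beta0, beta, gamma, phi, knots):
    """h_p(t|x,z) = theta(x) * f(t|z) under the assumed Cox-type latent model."""
    theta = math.exp(beta0 + x * beta)      # assumption: log-linear theta(x)
    risk = math.exp(z * gamma)
    s0 = baseline_survival(t, phi, knots)
    latent_density = baseline_hazard(t, phi, knots) * risk * s0 ** risk
    return theta * latent_density


def hazard_ratio_closed_form(t, x1, z1, x2, z2, beta, gamma, phi, knots):
    """exp((x1-x2)beta) * exp((z1-z2)gamma) * S_0(t)^(exp(z1 gamma) - exp(z2 gamma))."""
    s0 = baseline_survival(t, phi, knots)
    return (math.exp((x1 - x2) * beta) * math.exp((z1 - z2) * gamma)
            * s0 ** (math.exp(z1 * gamma) - math.exp(z2 * gamma)))


# Hypothetical setup: 5 intervals on [0, 5] give K = 5 + 3 = 8 cubic B-splines.
knots = make_knots(t_max=5.0, n_intervals=5)
phi = [-1.0, -0.8, -0.5, -0.3, -0.2, -0.4, -0.7, -1.0]
beta0, beta, gamma = 0.2, 0.5, -0.4

for t in (0.5, 2.0, 4.0):
    ratio = (population_hazard(t, 1.0, 1.0, beta0, beta, gamma, phi, knots)
             / population_hazard(t, 0.0, 0.0, beta0, beta, gamma, phi, knots))
    closed = hazard_ratio_closed_form(t, 1.0, 1.0, 0.0, 0.0, beta, gamma, phi, knots)
    print(f"t={t}: ratio of hazards={ratio:.6f}, closed form={closed:.6f}")
```

Under these assumptions the two columns agree at every time point, while the ratio itself changes with \(t\) whenever \(z_1 \ne z_2\) and \(\gamma \ne 0\). This is the sense in which the population hazard loses the proportional hazard structure.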
### Conclusion
The promotion time model based on Bayesian P-splines proposed in the paper provides a flexible method for analyzing survival data with a cure fraction. Its accuracy is assessed in a simulation study, and its practical use is illustrated on real clinical trial data. The paper also examines the model identification problem that arises when the follow-up time is insufficient.