Functional proportional hazards mixture cure model and its application to modelling the association between cancer mortality and physical activity in NHANES 2003-2006

Rahul Ghosal, Marcos Matabuena, Jiajia Zhang
DOI: https://doi.org/10.48550/arXiv.2302.07340
2023-03-30
Abstract: We develop a functional proportional hazards mixture cure (FPHMC) model with scalar and functional covariates measured at the baseline. The mixture cure model, useful in studying populations with a cure fraction of a particular event of interest, is extended to functional data. We employ the EM algorithm and develop a semiparametric penalized spline-based approach to estimate the dynamic functional coefficients of the incidence and the latency part. The proposed method is computationally efficient and simultaneously incorporates smoothness in the estimated functional coefficients via roughness penalty. Simulation studies illustrate a satisfactory performance of the proposed method in accurately estimating the model parameters and the baseline survival function. Finally, the clinical potential of the model is demonstrated in two real data examples that incorporate rich high-dimensional biomedical signals as functional covariates measured at the baseline and constitute novel domains to apply cure survival models in contemporary medical situations. In particular, we analyze i) minute-by-minute physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2003-2006 to study the association between diurnal patterns of physical activity (PA) at baseline and all cancer mortality through 2019 while adjusting for other biological factors; ii) the impact of daily functional measures of disease severity collected in the intensive care unit on post-ICU recovery and mortality event. Our findings provide novel epidemiological insights into the association between daily patterns of PA and cancer mortality. Software implementation and illustration of the proposed estimation method are provided in R.
Methodology, Applications
What problem does this paper attempt to address?
The paper addresses the following problem: when a population contains a cure fraction (i.e., some individuals will never experience the event of interest, such as death from cancer), how can functional data (e.g., daily activity patterns) be used to model survival outcomes and to explore the relationship between these functional covariates and survival time? Specifically, the authors develop a new Functional Proportional Hazards Mixture Cure (FPHMC) model for right-censored survival data with multiple scalar covariates and one or more functional covariates.

### Background and Problem Description

With advances in wearable device technology, high-resolution individual physiological data such as minute-level step counts, heart rate, and energy expenditure can be collected continuously and in real time. These rich data provide new opportunities for prediction models and help to gain a deeper understanding of human behavior and its impact on health. In many biological applications, however, researchers are more concerned with survival or time-to-event outcomes and their associations with risk factors of interest. Traditional survival analysis methods (such as the proportional hazards model) are not suited to populations with a cure fraction, because they assume that every subject would eventually experience the event and so cannot account for cured individuals.

### Solution

To meet this challenge, the authors propose the FPHMC model, which handles scalar and functional covariates simultaneously and performs survival analysis in the presence of a cure fraction. Specifically:

1. **Model Framework**:
   - **Cure Sub-model**: Estimates the cure probability (i.e., the proportion of the population that does not experience the event) through a generalized scalar-on-function regression model.
   - **Latency Sub-model**: Assumes a proportional hazards structure and uses a linear functional Cox model for the survival time of subjects who are not cured.
2. **Estimation Method**:
   - Uses the EM algorithm and a semiparametric penalized spline method to estimate the model parameters.
   - Enforces smoothness of the estimated functional coefficients at the same time through a roughness penalty.
3. **Application Examples**:
   - **NHANES 2003-2006 Data**: Analyzes the relationship between daily activity patterns and cancer death, adjusting for other biological factors.
   - **ICU Data**: Studies the impact of functional disease severity measures collected daily in the ICU on recovery and death after discharge.

### Main Contributions

1. **Model Innovation**: A functional proportional hazards mixture cure model that accommodates multiple scalar covariates and one or more functional covariates.
2. **Smoothness**: Selecting the smoothing parameter automatically avoids the need to choose the number of principal components by hand, as methods based on functional principal component analysis (FPCA) require.
3. **Practical Application**: Novel epidemiological insights into the impact of daily activity patterns on cancer death in the NHANES data.

### Simulation Study

The authors evaluate the proposed method in a simulation study. The results show that the FPHMC method performs well in estimating the model parameters and the baseline survival function in the presence of a cure fraction, with higher accuracy and stability than traditional methods.

### Conclusion

By developing the FPHMC model, the paper provides a new method for survival data that contain both a cure fraction and functional covariates, and it demonstrates the model's value on real data. The result is a useful tool for future research, especially in epidemiology and biomedicine.
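
The two-part model framework described above can be written compactly. The display below is a sketch in commonly used mixture cure notation, which is assumed here and may differ from the paper's: $\pi$ is the probability of being uncured (so the cure probability is $1-\pi$), $Z$ are the scalar covariates, $X(s)$ is a functional covariate on a domain $\mathcal{S}$, $\beta_1(s)$ and $\beta_2(s)$ are the functional coefficients of the incidence and latency parts, and $\lambda_0(t)$ is an unspecified baseline hazard.

```latex
\begin{align*}
S_{\mathrm{pop}}(t \mid Z, X) &= 1 - \pi(Z, X) + \pi(Z, X)\, S_u(t \mid Z, X), \\
\operatorname{logit}\{\pi(Z, X)\} &= \alpha_0 + Z^\top \alpha + \int_{\mathcal{S}} X(s)\, \beta_1(s)\, ds, \\
\lambda_u(t \mid Z, X) &= \lambda_0(t)\, \exp\Big\{ Z^\top \gamma + \int_{\mathcal{S}} X(s)\, \beta_2(s)\, ds \Big\}.
\end{align*}
```

Because $S_u(t) \to 0$ as $t \to \infty$, the population survival function levels off at $1-\pi$, which is the plateau a cure fraction produces in a Kaplan-Meier curve.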
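
The estimation ingredients named above (an EM algorithm, a basis representation of the functional coefficients, and a roughness penalty) can be illustrated with a minimal Python sketch. It is not the authors' R implementation. All function names are hypothetical, `pi_uncured` denotes the probability of being uncured, the integral is approximated by the trapezoid rule, and the second-difference penalty is one common P-spline choice that may differ from the penalty used in the paper.

```python
import numpy as np


def trapezoid_weights(grid):
    """Quadrature weights for the trapezoid rule on a (possibly uneven) grid."""
    grid = np.asarray(grid, dtype=float)
    w = np.zeros_like(grid)
    d = np.diff(grid)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w


def functional_linear_term(X, basis, coef, grid):
    """Approximate the integral of X_i(s) * beta(s) ds for each subject i.

    X     : (n, m) curves observed on `grid`
    basis : (m, K) basis functions evaluated on `grid` (assumed given)
    coef  : (K,)   basis coefficients, so beta(s) = basis @ coef
    """
    beta = np.asarray(basis) @ np.asarray(coef)
    return (np.asarray(X) * beta) @ trapezoid_weights(grid)


def second_difference_penalty(K):
    """P-spline style roughness penalty matrix D'D (assumed penalty form)."""
    D = np.diff(np.eye(K), n=2, axis=0)  # (K-2, K) second-difference operator
    return D.T @ D


def e_step_weights(delta, pi_uncured, surv_uncured):
    """Posterior probability that each subject is uncured.

    delta        : 1 if the event was observed, 0 if right-censored
    pi_uncured   : current estimate of P(uncured) per subject
    surv_uncured : current estimate of S_u(t_i) per subject
    Subjects with an observed event are uncured with probability 1.
    """
    delta = np.asarray(delta)
    num = np.asarray(pi_uncured) * np.asarray(surv_uncured)
    w_censored = num / (1.0 - np.asarray(pi_uncured) + num)
    return np.where(delta == 1, 1.0, w_censored)
```

In an EM iteration of this kind, the weights from `e_step_weights` would then enter a weighted logistic fit for the incidence part and a weighted proportional hazards fit for the latency part, each with `lam * coef @ P @ coef` subtracted from the log-likelihood, where `P = second_difference_penalty(K)` and `lam` is a smoothing parameter.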