Integrating Latent Classes in the Bayesian Shared Parameter Joint Model of Longitudinal and Survival Outcomes

Eleni-Rosalina Andrinopoulou, Kazem Nasserinejad, Rhonda Szczesniak, Dimitris Rizopoulos
DOI: https://doi.org/10.48550/arXiv.1802.10015
2019-11-05
Abstract: Cystic fibrosis is a chronic lung disease which requires frequent patient monitoring to maintain lung function over time and minimize onset of acute respiratory events known as pulmonary exacerbations. From the clinical point of view it is important to characterize the association between key biomarkers such as $FEV_1$ and time to first exacerbation. Progression of the disease is heterogeneous, yielding different sub-groups in the population exhibiting distinct longitudinal profiles. It is desirable to categorize these unobserved sub-groups (latent classes) according to their distinctive trajectories. Accounting for these latent classes, in other words heterogeneity, will lead to improved estimates of association arising from the joint longitudinal-survival model. The joint model of longitudinal and survival data constitutes a popular framework to analyze such data arising from heterogeneous cohorts. In particular, two paradigms within this framework are the shared parameter joint models and the joint latent class models. The former paradigm allows one to quantify the strength of the association between the longitudinal and survival outcomes but does not allow for latent sub-populations. The latter paradigm explicitly postulates the existence of sub-populations but does not directly quantify the strength of the association. We propose to integrate latent classes in the shared parameter joint model in a fully Bayesian approach, which allows us to investigate the association between $FEV_1$ and time to first exacerbation within each latent class. We, furthermore, focus on the selection of the optimal number of latent classes.
Applications
What problem does this paper attempt to address?
The paper addresses three main problems:

1. **Modeling the relationship between FEV1 and the time to the first pulmonary exacerbation in patients with cystic fibrosis**:
   - Cystic fibrosis (CF) is a fatal genetic disease affecting the lungs. Its clinical course is characterized by a gradual decline in lung function, ultimately leading to respiratory failure. Forced expiratory volume in one second (FEV1) is the most important clinical indicator for monitoring the decline in lung function in CF patients. During follow-up, patients may experience acute respiratory events known as pulmonary exacerbations. From a clinical perspective, it is therefore important to understand the association between FEV1 and the time to the first exacerbation.
2. **Introducing latent classes to explain population heterogeneity**:
   - Disease progression is heterogeneous, so the population contains subgroups with distinct longitudinal profiles. To estimate the association between FEV1 and the time to the first exacerbation more accurately, these unobserved subgroups (latent classes) must be identified and accounted for in the model.
3. **Selecting the optimal number of latent classes**:
   - An important task in latent class modeling is to determine the optimal number of latent classes. Traditional criteria such as the Bayesian Information Criterion (BIC) and the Deviance Information Criterion (DIC) are computationally intensive and may require fitting multiple models with different numbers of classes. For this reason, the paper proposes a method based on Nasserinejad et al. (2017) to select the optimal number of latent classes. This method simplifies model selection by excluding latent classes with extremely small proportions.

### Main contributions of the paper

1. **Proposing a Bayesian shared-parameter joint model combined with latent classes**:
   - The model not only quantifies the strength of the association between FEV1 and the time to the first exacerbation but also identifies the patient groups that belong to different latent classes, thereby providing more accurate association estimates.
2. **Solving the problem of selecting the optimal number of classes in the latent class model**:
   - Using the method of Nasserinejad et al. (2017), the paper provides an effective and computationally efficient way to select the optimal number of latent classes, avoiding the high computational cost of traditional methods.

### Method overview

- **Longitudinal sub-model**:
  - A latent class mixed-effects model describes the longitudinal change of FEV1, where each latent class has its own fixed effects and random effects.
  - The model is:
    \[
    y_i(t \mid v_i = g) = \eta_{ig}(t) + \epsilon_i(t) = x_i(t)^{\top}\beta_g + z_i(t)^{\top}b_{ig} + \epsilon_i(t),
    \]
    where \(v_i = g\) indicates that the \(i\)-th patient belongs to latent class \(g\), \(x_i(t)\) and \(z_i(t)\) are the design vectors of the fixed effects and random effects respectively, and \(\epsilon_i(t) \sim N(0, \sigma_y^{2})\).
- **Survival sub-model**:
  - A Cox proportional hazards model describes the survival data, where, within each latent class, the hazard depends on the longitudinal process through \(\eta_{ig}(t)\).
  - The model is:
    \[
    h_i(t \mid v_i = g) = h_{0g}(t)\exp[\gamma_g^{\top}w_i + \alpha_g\eta_{ig}(t)],
    \]
    where \(h_{0g}(t)\) is the class-specific baseline hazard function, \(w_i\) is the vector of baseline covariates, \(\gamma_g\) is the vector of regression coefficients, and \(\alpha_g\) is the association parameter.
- **Bayesian estimation**:
  - Model parameters are estimated with Markov chain Monte Carlo (MCMC), assuming that the longitudinal and survival processes are independent given the random effects.
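To make the two sub-models above concrete, the following minimal Python sketch evaluates them for a single patient in a single latent class. It is an illustration only, not the paper's implementation: it assumes \(x_i(t) = z_i(t) = (1, t)\), a constant baseline hazard, and one baseline covariate, and all numeric values and function names (`linear_predictor`, `simulate_measurements`, `hazard`, `constant_baseline`) are hypothetical.

```python
import numpy as np


def linear_predictor(t, beta_g, b_ig):
    """eta_ig(t) = x_i(t)^T beta_g + z_i(t)^T b_ig.

    Assumption: x_i(t) = z_i(t) = (1, t), i.e. a class-specific linear
    trend with a patient-specific random intercept and slope.
    """
    design = np.array([1.0, t])
    return float(design @ beta_g + design @ b_ig)


def simulate_measurements(times, beta_g, b_ig, sigma_y, rng):
    """y_i(t | v_i = g) = eta_ig(t) + eps_i(t), eps_i(t) ~ N(0, sigma_y^2)."""
    eta = np.array([linear_predictor(t, beta_g, b_ig) for t in times])
    return eta + rng.normal(0.0, sigma_y, size=len(times))


def hazard(t, h0_g, gamma_g, w_i, alpha_g, beta_g, b_ig):
    """h_i(t | v_i = g) = h_0g(t) * exp(gamma_g^T w_i + alpha_g * eta_ig(t))."""
    eta = linear_predictor(t, beta_g, b_ig)
    return float(h0_g(t) * np.exp(gamma_g @ w_i + alpha_g * eta))


def constant_baseline(t):
    """Assumed constant baseline hazard h_0g(t); a simplification."""
    return 0.05


# Hypothetical values for one patient in one latent class g.
rng = np.random.default_rng(0)
beta_g = np.array([80.0, -2.0])   # class-specific intercept and slope
b_ig = np.array([1.5, -0.3])      # patient-specific random effects
gamma_g = np.array([0.2])         # coefficient of one baseline covariate
w_i = np.array([1.0])             # baseline covariate value
alpha_g = -0.03                   # association parameter in class g

y = simulate_measurements([0.0, 0.5, 1.0, 1.5], beta_g, b_ig, sigma_y=2.0, rng=rng)
print("simulated measurements:", np.round(y, 2))
print("hazard at t = 1:", hazard(1.0, constant_baseline, gamma_g, w_i, alpha_g, beta_g, b_ig))
```

In this sketch a negative `alpha_g` means that a higher underlying trajectory \(\eta_{ig}(t)\) lowers the hazard; because `alpha_g` carries the class index, the strength of that link can differ between latent classes.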
### Results

- **Simulation study**:
  - The effectiveness of the proposed latent class selection method was verified through simulation studies.
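The class-selection idea summarized above (discarding latent classes with extremely small proportions) can be sketched as follows. This is a hedged illustration rather than the exact rule of Nasserinejad et al. (2017): it assumes a model fitted with more classes than needed, MCMC draws of each patient's class allocation, a hypothetical threshold `min_proportion`, and the most frequent count of sufficiently populated classes as the selected number. The function names are also hypothetical.

```python
import numpy as np


def count_populated_classes(allocations, n_classes, min_proportion):
    """For each MCMC draw, count the classes that hold at least
    `min_proportion` of the subjects.

    allocations: integer array of shape (n_draws, n_subjects) with
    class labels in 0..n_classes-1.
    """
    allocations = np.asarray(allocations)
    n_subjects = allocations.shape[1]
    # counts[d, g] = number of subjects allocated to class g in draw d
    counts = np.stack(
        [(allocations == g).sum(axis=1) for g in range(n_classes)], axis=1
    )
    proportions = counts / n_subjects
    return (proportions >= min_proportion).sum(axis=1)


def select_number_of_classes(allocations, n_classes, min_proportion=0.05):
    """Return the most frequent number of populated classes across draws."""
    populated = count_populated_classes(allocations, n_classes, min_proportion)
    values, frequencies = np.unique(populated, return_counts=True)
    return int(values[np.argmax(frequencies)])


# Toy allocations (assumption): 5 classes are fitted, but nearly all
# subjects fall into the first two.
rng = np.random.default_rng(1)
toy_allocations = rng.choice(
    5, size=(200, 100), p=[0.60, 0.38, 0.02, 0.0, 0.0]
)
print("selected number of classes:", select_number_of_classes(toy_allocations, n_classes=5))
```

The appeal of this kind of rule is that a single over-specified model is fitted once, instead of refitting the joint model for every candidate number of classes and comparing information criteria.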