Letter: the outcome of ulcerative colitis patients undergoing pouch surgery may be determined by pre‐surgical factors. Authors’ reply
H. Yanai, S. Ben-Shachar, L. Mlynarsky, L. Godny, M. Leshno, H. Tulchinsky, I. Dotan
DOI: https://doi.org/10.1111/apt.14289
2017-11-01
Abstract: EDITORS, We would like to thank Drs Erfan Ayubi and Saeid Safiri for their interest in our work and for their comments, and we thank you for the opportunity to respond. We read the comments thoughtfully and address each issue that was raised. The strategy used to build the multivariable Cox proportional hazards regression model is described in the methods and results, sections 2.3 and 3.4, respectively. In the case of perfect multicollinearity in linear regression, the matrix $X^{t}X$ cannot be inverted (the closed form of the coefficient estimate is $\hat{\beta} = (X^{t}X)^{-1}X^{t}Y$), and therefore multicollinearity in linear regression may cause imperfect estimation of the coefficients. Indeed, multicollinearity is also a problem when estimating generalized linear models, including logistic regression and Cox regression. Nonetheless, in generalized linear models multicollinearity does not bias the coefficients; it only makes them unstable. In addition, multicollinearity does not reduce the predictive power or reliability of the model as a whole; it merely affects calculations regarding individual predictors. That is, a multiple logistic regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with others. In logistic regression or Cox regression, the regression coefficients are estimated by maximum likelihood, and the problem of multicollinearity is that the numerical algorithm also inverts a weighted version of $X^{t}X$; one should be aware of this risk when fitting a model. Of note, Hosmer and Lemeshow suggest that one would not normally undertake an in-depth investigation of the collinearity of the covariates unless there was evidence of degradation in the fitted model. An alternative is to use the ridge regression methods proposed by Schaefer to avoid the multicollinearity problem.
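The linear-regression argument above can be illustrated with a minimal sketch. It assumes synthetic data and uses the ordinary linear ridge estimator, with a penalty `lam` added to the diagonal of the cross-product matrix; it is not the logistic ridge estimator attributed to Schaefer, and the variable names and the helper `ridge_coefficients` are hypothetical.

```python
import numpy as np

# Synthetic design matrix: the third column is an exact multiple of the second,
# i.e. perfect multicollinearity (illustrative data only, not from the study).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 2.0 * x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 3.0 * x1 + rng.normal(scale=0.1, size=n)

# The rank of X^t X equals the rank of X. A rank below the number of columns
# means X^t X is singular, so the closed-form estimate cannot be computed.
rank = int(np.linalg.matrix_rank(X))


def ridge_coefficients(X, y, lam):
    """Linear ridge estimate: solve (X^t X + lam * I) beta = X^t y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# With lam > 0 the penalised matrix is positive definite and can be solved.
beta_ridge = ridge_coefficients(X, y, lam=1.0)
fitted = X @ beta_ridge
```

In this sketch the individual coefficients on `x1` and `x2` are not separately identifiable, yet the fitted values still track the outcome, which mirrors the point that collinearity affects individual predictors rather than the model as a whole.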
In our analysis we were aware of this risk when fitting the model and checked for multicollinearity but, for simplicity, did not present the check in the manuscript. We tested the variables in the model for multicollinearity using the “collinearity diagnostics” procedure in SPSS; the variance inflation factor (VIF) was close to 1, and no collinearity was found. Although our data did not demonstrate multicollinearity, there is a high correlation between two variables included in the model: (1) age at UC diagnosis and (2) age at ileal pouch-anal anastomosis surgery. Therefore, we conducted an additional Cox regression with only one of these variables, age at UC diagnosis (the variable “age at ileal pouch-anal anastomosis surgery”, which was included in the original model, was not statistically significant [P = .488]). The alternative model yielded results similar to the original, i.e. the refractory indication is still the most significant predictor of outcome, HR 2.90, P = .004 (Table 1). This is comparable to the originally reported HR of 3.43, P = .006; thus the results are not biased. We agree with the commentators that in extended Cox models, where covariates are usually time-dependent, it is essential to check the model assumptions before implementing the model. However, the covariates in the model used in our study are independent of time. Validation is important and is used routinely in machine learning and data mining, but less often in clinical studies, mainly because of their relatively small sample sizes. In this study, we conducted the analysis over the entire data set and did not randomly split the data into training data (in-sample) and testing data (out-of-sample); thus internal validation could not be conducted.
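For readers unfamiliar with the VIF, a minimal sketch of how it can be computed is given here. It uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. This is not the SPSS procedure used in the study; the function name and the synthetic variables are hypothetical and say nothing about the study data.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is obtained by regressing column j on all other columns
    plus an intercept, using ordinary least squares.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Synthetic example (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 500
age_at_diagnosis = rng.normal(30.0, 10.0, size=n)
age_at_surgery = age_at_diagnosis + rng.normal(8.0, 2.0, size=n)  # strongly correlated
unrelated = rng.normal(size=n)                                    # independent predictor

vif_independent = variance_inflation_factors(
    np.column_stack([age_at_diagnosis, unrelated])
)
vif_correlated = variance_inflation_factors(
    np.column_stack([age_at_diagnosis, age_at_surgery])
)
```

In this synthetic setting the independent pair gives VIF values close to 1, whereas the strongly correlated pair gives values well above common rule-of-thumb thresholds.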