759 Revisiting the ETV Success Score Using Machine Learning: Can We Do Better?
Syed M. Adil,Andreas Seas,Daniel Sexton,Pranav Warman,Lacey Carter,Brad Kolls,Anthony Fuller,Nandan P. Lad,Timothy Dunn,Herbert E. Fuchs,Matthew Vestal,Gerald Grant
DOI: https://doi.org/10.1227/neu.0000000000002809_759
IF: 5.315
2024-04-01
Neurosurgery
Abstract: INTRODUCTION: The endoscopic third ventriculostomy (ETV) success score (ETVSS) is a useful decision-making heuristic for estimating the probability of surgical success, defined as no repeat cerebrospinal fluid (CSF) diversion surgery needed within six months. Nonetheless, the original dataset published in 2009 was relatively small (n = 618), and the performance of its logistic regression (LR) model was modest. METHODS: We queried the MarketScan national database for the years 2005-2020 to identify patients <18 years of age who underwent a first-time ETV and subsequently had at least 6 months of continuous enrollment in the database. We collected data on predictors matching the original ETVSS: age, etiology of hydrocephalus, and history of any previous shunt placement. Next, we used six machine learning (ML) algorithms (LR, support vector classifier, random forest, k-nearest neighbors, XGBoost, and Naive Bayes) to develop predictive models. Finally, we used nested cross-validation to assess the models’ comparative performance on unseen data. RESULTS: We identified 1698 patients who met inclusion criteria, of whom 1139 (67%) had successful ETVs. All models outperformed the original ETVSS. The LR model still performed best, with area under the receiver operating characteristic curve (AUROC) 0.90 (95% confidence interval, 0.86-0.94), sensitivity 0.79 (0.70-0.88), and specificity 0.91 (0.87-0.95) at Youden’s index; the random forest followed closely as the second-best algorithm. This compares favorably to the original ETVSS, with AUROC 0.80 (0.76-0.84, p < 0.01), sensitivity 0.65 (0.47-0.85, p < 0.01), and specificity 0.71 (0.65-0.75, p < 0.01). CONCLUSIONS: Our predictions for ETV success outperform the current gold standard. Logistic regression may still be the most interpretable and predictive algorithm; further work is underway to externally validate these models.
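The evaluation strategy named in the Methods (nested cross-validation, then sensitivity and specificity read off at Youden's index) can be illustrated with a minimal Python sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic stand-in data with three features rather than MarketScan records, compares only two of the six algorithms, and the hyperparameter grids, fold counts, and the helper name `youden_operating_point` are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict

# Synthetic stand-in data (assumption): 3 features, roughly 67% positive class.
X, y = make_classification(
    n_samples=600, n_features=3, n_informative=3, n_redundant=0,
    weights=[0.33, 0.67], random_state=0,
)

# Inner folds tune hyperparameters; outer folds estimate performance on unseen data.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Hypothetical candidate models and grids (the abstract does not report these).
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "random_forest": (
        RandomForestClassifier(n_estimators=50, random_state=0),
        {"max_depth": [2, 4]},
    ),
}


def youden_operating_point(y_true, y_score):
    """Return (threshold, sensitivity, specificity) maximizing J = TPR - FPR."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    best = int(np.argmax(tpr - fpr))
    return thresholds[best], tpr[best], 1.0 - fpr[best]


results = {}
for name, (estimator, grid) in candidates.items():
    # GridSearchCV is re-fit inside every outer training fold, so tuning never
    # sees the outer test fold: this is the "nested" part.
    tuned = GridSearchCV(estimator, grid, scoring="roc_auc", cv=inner_cv)
    proba = cross_val_predict(tuned, X, y, cv=outer_cv, method="predict_proba")[:, 1]
    # AUROC is computed on pooled out-of-fold predictions.
    auroc = roc_auc_score(y, proba)
    threshold, sensitivity, specificity = youden_operating_point(y, proba)
    results[name] = {
        "auroc": auroc,
        "threshold": threshold,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }

for name, metrics in results.items():
    print(name, {key: round(float(value), 3) for key, value in metrics.items()})
```

Tuning inside each outer training fold keeps hyperparameter selection from leaking into the performance estimate, which is the usual reason to prefer nested over single-loop cross-validation when comparing algorithms. Confidence intervals and p-values like those in the Results would need an additional step (for example bootstrapping), which this sketch omits.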
surgery,clinical neurology