Advancing patient care: Machine learning models for predicting grade 3+ toxicities in gynecologic cancer patients treated with HDR brachytherapy
Andres Portocarrero Bonifaz,Salman Syed,Maxwell Kassel,Grant McKenzie,Vishwa Shah,Bryce Forry,Jeremy Gaskins,Keith Sowards,Thulasi Babitha Avula,Adrianna Masters,Jose Schneider,Scott Silva
DOI: https://doi.org/10.1101/2024.10.04.24314917
2024-10-06
Abstract:
Background:
Gynecological cancers are among the most prevalent cancers in women worldwide. Brachytherapy, often used as a boost to external beam radiotherapy, is integral to treatment. Advances in computation, algorithms, and data availability have popularized machine learning.
Objective:
To develop and compare machine learning models for predicting grade 3 or higher toxicities in gynecological cancer patients treated with high-dose-rate (HDR) brachytherapy, with the aim of contributing to personalized radiation treatments.
Methods:
A retrospective analysis was performed on gynecological cancer patients who underwent HDR brachytherapy with Syed-Neblett or Tandem and Ovoid applicators from 2009 to 2023. After exclusions, 233 patients were included. Dosimetric variables for the high-risk clinical target volume (HR-CTV) and organs at risk, along with tumor, patient, and toxicity data, were collected and compared between groups with and without grade 3 or higher toxicities using statistical tests. Six supervised machine learning classification models (Logistic Regression, Random Forest, K-Nearest Neighbors, Support Vector Machines, Gaussian Naive Bayes, and Multi-Layer Perceptron Neural Networks) were constructed and evaluated. Model construction involved sequential feature selection (SFS) when appropriate, followed by hyperparameter tuning. Final model performance was characterized on a withheld test dataset comprising 25% of the data.
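The construction process described above (withheld test split, sequential feature selection, then hyperparameter tuning) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, synthetic placeholder data sized to match the cohort (233 rows), a Support Vector Machine as the example classifier, forward selection of 10 features, and a small hypothetical parameter grid. The abstract does not specify the software, selection direction, scoring metric, or grid used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data (NOT study data): 233 rows mirror the cohort
# size only; the 20 features and the class imbalance are assumptions.
X, y = make_classification(n_samples=233, n_features=20, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)

# Withhold 25% of the samples as a test set, stratified by outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Step 1: forward sequential feature selection, fitted on training data only.
# Scaling lives inside the pipeline so each CV fold is scaled independently.
base_model = make_pipeline(StandardScaler(), SVC(class_weight="balanced"))
sfs = SequentialFeatureSelector(base_model, n_features_to_select=10,
                                direction="forward", scoring="f1", cv=3)
sfs.fit(X_train, y_train)
X_train_sel = sfs.transform(X_train)
X_test_sel = sfs.transform(X_test)

# Step 2: hyperparameter tuning on the selected features (hypothetical grid).
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(base_model, param_grid, scoring="f1", cv=3)
search.fit(X_train_sel, y_train)

# Step 3: a single final evaluation on the withheld test set.
y_pred = search.predict(X_test_sel)
test_f1 = f1_score(y_test, y_pred)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"best params: {search.best_params_}")
print(f"test F1 = {test_f1:.2f}, test accuracy = {test_accuracy:.2f}")
```

Keeping the test split untouched until the last step is the design choice that matters here: both feature selection and tuning see only training data, so the final scores are not biased by model selection.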
Results:
The three highest-ranking models were Support Vector Machines, Random Forest, and Logistic Regression, with test F1 scores of 0.63, 0.57, and 0.52; test normalized Matthews correlation coefficient (normMCC) scores of 0.75, 0.77, and 0.71; and test accuracy scores of 0.80, 0.85, and 0.81, respectively. The SFS algorithm selected 10 features for the highest-ranking model. In traditional statistical analysis, HR-CTV volume, Charlson Comorbidity Index, Length of Follow-Up, and D2cc - Rectum differed significantly between groups with and without grade 3 or higher toxicities.
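For readers unfamiliar with the three reported metrics, the sketch below computes them from a binary confusion matrix. It assumes the common definition of the normalized Matthews correlation coefficient, normMCC = (MCC + 1) / 2, which rescales MCC from [-1, 1] to [0, 1]; the abstract does not state the formula used. Under that assumption, a normMCC of 0.75 would correspond to an MCC of 0.50. The function name and the confusion counts are hypothetical and are not taken from the study.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return F1, MCC, normMCC, and accuracy for binary confusion counts."""
    # F1 is the harmonic mean of precision and recall; it ignores true negatives.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC uses all four cells, which makes it informative for imbalanced classes.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    # Assumed normalization: map MCC from [-1, 1] onto [0, 1].
    norm_mcc = (mcc + 1) / 2

    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"f1": f1, "mcc": mcc, "norm_mcc": norm_mcc, "accuracy": accuracy}


# Hypothetical counts for illustration only.
metrics = confusion_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```

Because accuracy can look high when most patients have no severe toxicity, F1 and normMCC give a complementary view of how well the minority (toxicity) class is identified.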
Conclusions:
Machine learning models were developed to predict grade 3 or higher toxicities; the highest-ranking model, Support Vector Machines, achieved an F1 score of 0.63 and an accuracy of 0.80 on the withheld test set. Machine learning offers a novel approach to building multivariable models for personalized radiation therapy care.
Health Informatics