Predicting Outcome in Clear Aligner Treatment: A Machine Learning Analysis

Daniel Wolf, Gasser Farrag, Tabea Flügge, Lan Huong Timm
DOI: https://doi.org/10.3390/jcm13133672
2024-06-24
Abstract: Background/Objectives: Machine learning (ML) models predicting the risk of refinement (i.e., a subsequent course of treatment being necessary) in clear aligner therapy (CAT) were developed and evaluated. Methods: An anonymized sample of 9942 CAT patients (70.6% female, 29.4% male, age range 18-64 years, median 30.5 years), provided by DrSmile, a large European CAT provider based in Berlin, Germany, was used. Three ML methods were employed: (1) logistic regression with L1 regularization, (2) extreme gradient boosting (XGBoost), and (3) support vector classification with a radial basis function kernel. In total, 74 factors consistent with clinical reasoning were selected as predictors for these methods. Results: On a held-out test set with a true-positive rate of 0.58, the logistic regression model has an area under the ROC curve (AUC) of 0.67, an average precision (AP) of 0.73, and a Brier loss of 0.22; the XGBoost model has an AUC of 0.67, an AP of 0.74, and a Brier loss of 0.22; and the support vector model has a recall of 0.61 and a precision of 0.64. The logistic regression and XGBoost models identify predictors influencing refinement risk, including patient compliance, interproximal enamel reduction (IPR), and certain planned tooth movements; for example, lingual translation of maxillary incisors is associated with the lowest risk of refinement and rotation of mandibular incisors with the highest. Conclusions: These findings suggest moderate, well-calibrated predictive accuracy with both regularized logistic regression and XGBoost. They also underscore the influence of the identified factors on the risk of refinement in CAT, emphasizing their importance in the careful planning of orthodontic treatment and the potential for shorter treatment times, less patient discomfort, and fewer clinic visits. Identification of at-risk individuals could support tailored clinical decision-making and enable targeted interventions.
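The three probabilistic metrics reported in the abstract (AUC, AP, and Brier loss) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names and the toy labels and probabilities below are hypothetical and serve only to show how each metric is computed from true outcomes (1 = refinement needed) and predicted risks.

```python
# Illustrative sketch only; all names and data here are hypothetical.

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random
    negative case; tied scores count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Mean of the precision values at the rank of each positive case.
    Assumes distinct scores; ties would need extra handling."""
    ranked = sorted(zip(y_prob, y_true), reverse=True)
    n_pos = sum(y_true)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 8 patients, 4 of whom needed refinement.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

print("AUC:  ", round(roc_auc(y_true, y_prob), 4))
print("AP:   ", round(average_precision(y_true, y_prob), 4))
print("Brier:", round(brier_score(y_true, y_prob), 4))
```

For orientation, a model that always predicts the share of positive cases p has a Brier loss of p(1 - p), and an uninformative ranking has an expected AP close to p. If the 0.58 figure in the abstract is read as that share (an assumption, since the abstract calls it a true-positive rate), the corresponding baselines would be about 0.24 for Brier loss and about 0.58 for AP.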