MACHINE LEARNING-BASED PREDICTION OF GEBOES SCORE AND HISTOLOGIC IMPROVEMENT AND REMISSION THRESHOLDS IN ULCERATIVE COLITIS

Zahil Shanis, Harshith Padigela, Kathleen Sucipto, John Shamshoian, Jin Li, Andrew Walker, Darren Fahy, Mary Lin, Mike Montalto, Andrew Beck, Jimish Mehta, Ilan Wapinski, Archit Khosla, Harsha Pokkalla, Fedaa Najdawi, Christina Jayson
DOI: https://doi.org/10.1093/ibd/izac247.038
2023-01-26
Inflammatory Bowel Diseases
Abstract:
BACKGROUND Histology is emerging as a key therapeutic endpoint for ulcerative colitis, driven by associations between histologic response and long-term outcomes. However, existing scoring systems are subjective and are consequently prone to inter- and intra-reader variability. Geboes scoring is a well-established system for ulcerative colitis histologic assessment that has previously been used to define thresholds for histo-endoscopic mucosal improvement (Geboes Score ≤3.1, together with a Mayo score of 0 or 1) and histologic remission (Geboes Score <2). Here we report the first machine learning (ML)-based prediction of the Geboes Score, and of Geboes Score-derived thresholds of histologic improvement and remission, directly from whole slide images (WSI) of hematoxylin and eosin (H&E)-stained mucosal biopsies.
METHODS A total of 3,148 WSI were scored by three expert gastrointestinal pathologists, and the median consensus score was used as the ground-truth Geboes Score for each slide. ML models were trained on the median consensus scores to predict the Geboes Score and subscores for each slide. Model performance against the pathologist median consensus score was measured using accuracy and the F1 score, which accounts for both false positive and false negative errors.
RESULTS Measured against the median consensus scores of the three pathologists, the ML-based model predicted the overall Geboes Score with a quadratic weighted kappa of 0.89. The model also predicted both histologic improvement and histologic remission with high accuracy. For histologic improvement, defined as a Geboes Score ≤3.1, the model achieved an accuracy of 0.92 and an F1 score of 0.92 (Figure 1). For histologic remission, defined as a Geboes Score <2, the model achieved an accuracy of 0.91 and an F1 score of 0.89 (Figure 2).
CONCLUSIONS We report an ML-based approach for predicting the Geboes Score and Geboes Score-based key thresholds of histologic improvement and histologic remission. Model predictions show high accuracy compared with median consensus pathologist scores. This approach may enable standardized, reproducible, and accurate prediction of these clinically relevant thresholds to better measure histologic disease activity and treatment response in clinical trials.
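The evaluation described in METHODS and RESULTS can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes Geboes Scores are stored as decimal numbers (grade.subgrade, e.g. 3.1) and that the ordinal levels passed to the kappa function are integers from 0 to n_levels - 1; the abstract specifies neither. All function names and the toy values are hypothetical and do not come from the study data.

```python
from statistics import median

# Thresholds taken from the abstract.
IMPROVEMENT_MAX = 3.1  # histologic improvement: Geboes Score <= 3.1
REMISSION_BELOW = 2.0  # histologic remission: Geboes Score < 2


def consensus(reader_scores):
    """Median of the readers' scores for one slide (three readers in the abstract)."""
    return median(reader_scores)


def is_improvement(score):
    return score <= IMPROVEMENT_MAX


def is_remission(score):
    return score < REMISSION_BELOW


def accuracy(y_true, y_pred):
    """Fraction of slides where prediction and ground truth agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); penalizes both error types."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def quadratic_weighted_kappa(y_true, y_pred, n_levels):
    """Cohen's kappa with weights (i - j)^2 / (n_levels - 1)^2.

    Inputs are integer ordinal levels in [0, n_levels - 1] (an assumption
    about how scores are encoded).
    """
    observed = [[0] * n_levels for _ in range(n_levels)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    total = len(y_true)
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    disagreement = expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            w = (i - j) ** 2 / (n_levels - 1) ** 2
            disagreement += w * observed[i][j]
            expected += w * row[i] * col[j] / total
    return 1.0 - disagreement / expected


# Toy example with made-up scores: three readers per slide, four slides.
reader_scores = [(1.0, 1.1, 2.0), (3.1, 3.1, 4.0), (0.0, 0.0, 1.0), (5.1, 4.0, 5.0)]
truth = [consensus(s) for s in reader_scores]
predicted = [1.0, 3.2, 0.0, 5.0]

improvement_true = [is_improvement(s) for s in truth]
improvement_pred = [is_improvement(s) for s in predicted]
remission_true = [is_remission(s) for s in truth]
remission_pred = [is_remission(s) for s in predicted]

print("improvement accuracy:", accuracy(improvement_true, improvement_pred))
print("improvement F1:", f1_score(improvement_true, improvement_pred))
print("remission accuracy:", accuracy(remission_true, remission_pred))
print("remission F1:", f1_score(remission_true, remission_pred))
```

In the toy data, the second slide has a consensus of 3.1 but a prediction of 3.2, so it falls on opposite sides of the improvement threshold. This shows why the binary threshold metrics and the ordinal kappa are reported separately: a small ordinal error near a cutoff flips the binary label, while the same error elsewhere does not.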
gastroenterology & hepatology