Artificial intelligence algorithm developed to predict immune checkpoint inhibitors efficacy in non–small-cell lung cancer.

Mehrdad Rakaee, Masoud Tafavvoghi, Elio Adib, Biagio Ricciuti, Joao Victor Machado Alessi, Alessio Cortellini, Claudia A.M. Fulgenzi, Kajsa Møllersen, Lars Ailo Bongo, Sayed MS Hashemi, Ilias Houda, Lill-Tove Rasmussen Busund, Tom Donnem, Idris Bahce, David J. James Pinato, Lynette M. Sholl, Mark M. Awad, David J. Kwiatkowski
DOI: https://doi.org/10.1200/jco.2023.41.16_suppl.9132
IF: 45.3
2023-06-01
Journal of Clinical Oncology
Abstract: 9132 Background: Many patients with non-small cell lung cancer (NSCLC) who have high PD-L1 immunohistochemistry (IHC) expression and/or high tumor mutational burden (TMB) do not respond to immune checkpoint inhibitors (ICIs). Discovery of novel predictive biomarkers for ICI response in NSCLC remains a critical unmet medical need, given the expense, potential toxicity, and therapeutic delay caused by ICI treatment without benefit. Methods: We developed a supervised deep learning algorithm (Deep-IO) to predict therapeutic response to ICIs in NSCLC patients from standard hematoxylin and eosin (H&E) whole-slide histology images, trained on ICI objective response. We used a convolutional neural network (CNN) implemented in PyTorch. The performance of Deep-IO was evaluated with classification metrics and the area under the receiver operating characteristic curve (AUROC). The algorithm was trained and tested on 85,218 tiles (512 × 512 pixels) from 446 patients with advanced-stage NSCLC at the Dana-Farber Cancer Institute (DFCI), all treated with ICI monotherapy. Results: The objective response rate, comprising complete and partial responses, was 26% (n = 114) in the overall cohort. The DFCI cohort was randomly split into a training set (N = 379, 85%) and a test set (N = 67, 15%). The classifier's predicted class probabilities (responder vs. non-responder) at the tile level were averaged for each patient. On the whole-section test set, the model achieved an accuracy of 0.72, a weighted average (WAVG) precision of 0.77, a WAVG recall of 0.72, and a WAVG F-score of 0.74 in distinguishing ICI non-responders from responders. Compared with PD-L1 expression (TPS, %) and TMB (mut/Mb), Deep-IO had superior predictive power for ICI response (AUROC = 0.75) in the test set. The combined Deep-IO + PD-L1 score yielded a significantly higher AUROC of 0.82 than any single test and a 32-percentage-point improvement in specificity (0.88 vs. 0.56) over PD-L1 evaluation alone (Table). Conclusions: This proof-of-concept study shows that an artificial intelligence–powered H&E image classifier can predict ICI effectiveness in NSCLC. Deep-IO outperformed both established predictive biomarkers. We will assess Deep-IO in additional external NSCLC data sets for validation and confirmation. [Table: see text]
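The patient-level scoring step described in the Results (averaging tile-level responder probabilities per patient, then evaluating with AUROC and specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy data, and the 0.5 decision cutoff are all assumptions made for illustration, and the abstract does not state which threshold or score-combination method was used.

```python
from collections import defaultdict
from statistics import mean


def patient_scores(tile_probs):
    """Average tile-level responder probabilities for each patient.

    tile_probs: iterable of (patient_id, probability) pairs.
    """
    by_patient = defaultdict(list)
    for patient_id, prob in tile_probs:
        by_patient[patient_id].append(prob)
    return {pid: mean(probs) for pid, probs in by_patient.items()}


def auroc(scores, labels):
    """AUROC as the fraction of (responder, non-responder) pairs ranked
    correctly; tied scores count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity(preds, labels):
    """Fraction of true non-responders (label 0) predicted as non-responders."""
    negatives = [p for p, y in zip(preds, labels) if y == 0]
    return sum(1 for p in negatives if p == 0) / len(negatives)


# Hypothetical toy data, not taken from the study.
tiles = [
    ("A", 0.9), ("A", 0.7),  # responder
    ("B", 0.2), ("B", 0.4),  # non-responder
    ("C", 0.3), ("C", 0.2),  # responder
    ("D", 0.1),              # non-responder
    ("E", 0.5), ("E", 0.7),  # non-responder
]
labels = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0}

scores = patient_scores(tiles)
ids = sorted(scores)
y_true = [labels[i] for i in ids]
y_score = [scores[i] for i in ids]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # 0.5 cutoff is an assumption

print(f"AUROC={auroc(y_score, y_true):.2f} "
      f"specificity={specificity(y_pred, y_true):.2f}")
```

The pairwise AUROC shown here is quadratic in the number of patients, which is acceptable for a 67-patient test set; a library routine would be the usual choice for larger cohorts.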
oncology