Prediction of Effectiveness and Toxicities of Immune Checkpoint Inhibitors Using Real-World Patient Data
Levente Lippenszky, Kathleen F Mittendorf, Zoltán Kiss, Michele L LeNoue-Newton, Pablo Napan-Molina, Protiva Rahman, Cheng Ye, Balázs Laczi, Eszter Csernai, Neha M Jain, Marilyn E Holt, Christina N Maxwell, Madeleine Ball, Yufang Ma, Margaret B Mitchell, Douglas B Johnson, David S Smith, Ben H Park, Christine M Micheel, Daniel Fabbri, Jan Wolber, Travis J Osterman
DOI: https://doi.org/10.1200/CCI.23.00207
Abstract:

Purpose: Although immune checkpoint inhibitors (ICIs) have improved outcomes in certain patients with cancer, they can also cause life-threatening immunotoxicities. Predicting immunotoxicity risks alongside response could provide a personalized risk-benefit profile, inform therapeutic decision making, and improve clinical trial cohort selection. We aimed to build a machine learning (ML) framework using routine electronic health record (EHR) data to predict hepatitis, colitis, pneumonitis, and 1-year overall survival.

Methods: Real-world EHR data of more than 2,200 patients treated with ICI through December 31, 2018, were used to develop predictive models. Using a prediction time point of ICI initiation, a 1-year prediction time window was applied to create binary labels for the four outcomes for each patient. Feature engineering involved aggregating laboratory measurements over appropriate time windows (60-365 days). Patients were randomly partitioned into training (80%) and test (20%) sets. Random forest classifiers were developed using a rigorous model development framework.

Results: The patient cohort had a median age of 63 years and was 61.8% male. Patients predominantly had melanoma (37.8%), lung cancer (27.3%), or genitourinary cancer (16.4%). They were treated with PD-1 (60.4%), PD-L1 (9.0%), and CTLA-4 (19.7%) ICIs. Our models demonstrate reasonably strong performance, with AUCs of 0.739, 0.729, 0.755, and 0.752 for the pneumonitis, hepatitis, colitis, and 1-year overall survival models, respectively. Each model relies on an outcome-specific feature set, though some features are shared among models.

Conclusion: To our knowledge, this is the first ML solution that assesses individual ICI risk-benefit profiles based predominantly on routine structured EHR data. As such, use of our ML solution will not require additional data collection or documentation in the clinic.
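
The labeling and feature-engineering steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `binary_label` and `aggregate_labs`, the choice of summary statistics (mean, minimum, maximum, count), and the handling of window boundaries are all assumptions made for illustration. The abstract states only that a 1-year prediction window from ICI initiation was used for binary labels and that laboratory measurements were aggregated over windows of 60 to 365 days.

```python
from datetime import date, timedelta
from statistics import mean


def binary_label(ici_start, event_dates, window_days=365):
    """Return 1 if any event occurs within window_days after ICI initiation.

    Assumption: events on the initiation day itself are excluded and
    events on the last day of the window are included.
    """
    window_end = ici_start + timedelta(days=window_days)
    return int(any(ici_start < d <= window_end for d in event_dates))


def aggregate_labs(ici_start, measurements, lookback_days):
    """Summarize lab values measured in the lookback window before ICI start.

    measurements: list of (date, value) pairs for one laboratory test.
    Assumption: both window boundaries are inclusive, and the summary
    statistics here are illustrative choices.
    """
    window_start = ici_start - timedelta(days=lookback_days)
    values = [v for d, v in measurements if window_start <= d <= ici_start]
    if not values:
        # No measurement in the window: leave the features missing.
        return {"mean": None, "min": None, "max": None, "count": 0}
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "count": len(values),
    }


# Hypothetical patient: ICI started on 2018-01-15, three prior lab values.
ici_start = date(2018, 1, 15)
labs = [
    (date(2017, 6, 1), 30.0),
    (date(2017, 12, 20), 42.0),
    (date(2018, 1, 10), 50.0),
]
features_60d = aggregate_labs(ici_start, labs, lookback_days=60)
features_365d = aggregate_labs(ici_start, labs, lookback_days=365)
label = binary_label(ici_start, [date(2018, 9, 1)])
```

In this sketch the 60-day window captures only the two most recent measurements, while the 365-day window captures all three. Feature rows built this way, paired with the binary labels, could then be split 80%/20% and passed to a random forest classifier as the Methods describe.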