Validation of an Artificial Intelligence-Based Prediction Model Using 5 External PET/CT Datasets of Diffuse Large B-Cell Lymphoma
Maria C. Ferrández, Sandeep S.V. Golla, Jakoba J. Eertink, Sanne E. Wiegers, Gerben J.C. Zwezerijnen, Martijn W. Heymans, Pieternella J. Lugtenburg, Lars Kurch, Andreas Hüttmann, Christine Hanoun, Ulrich Dührsen, Sally F. Barrington, N. George Mikhaeel, Luca Ceriani, Emanuele Zucca, Sándor Czibor, Tamás Györke, Martine E.D. Chamuleau, Josée M. Zijlstra, Ronald Boellaard
DOI: https://doi.org/10.2967/jnumed.124.268191
2024-11-02
Journal of Nuclear Medicine
Abstract: The aim of this study was to validate a previously developed deep learning model in 5 independent clinical trials. The predictive performance of this model was compared with the international prognostic index (IPI) and 2 models incorporating radiomic PET/CT features (clinical PET and PET models). Methods: In total, 1,132 diffuse large B-cell lymphoma patients were included: 296 for training and 836 for external validation. The primary outcome was 2-y time to progression. The deep learning model was trained on maximum-intensity projections from PET/CT scans. The clinical PET model included metabolic tumor volume, maximum distance from the bulkiest lesion to another lesion, SUVpeak, age, and performance status. The PET model included metabolic tumor volume, maximum distance from the bulkiest lesion to another lesion, and SUVpeak. Model performance was assessed using the area under the curve (AUC) and Kaplan–Meier curves. Results: The IPI yielded an AUC of 0.60 on all external data. The deep learning model yielded a significantly higher AUC of 0.66 (P < 0.01). For each individual clinical trial, the model was consistently better than IPI. Radiomic model AUCs remained higher for all clinical trials. The deep learning and clinical PET models showed equivalent performance (AUC, 0.69; P > 0.05). The PET model yielded the highest AUC of all models (AUC, 0.71; P < 0.05). Conclusion: The deep learning model predicted outcome in all trials with a higher performance than IPI and better survival curve separation. This model can predict treatment outcome in diffuse large B-cell lymphoma without tumor delineation but at the cost of a lower prognostic performance than with radiomics.
radiology, nuclear medicine & medical imaging
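The abstract compares models by AUC for the 2-y time-to-progression outcome. As a minimal sketch of what that metric measures, the snippet below computes a rank-based AUC for a binary outcome: the fraction of event/non-event patient pairs in which the event patient received the higher predicted risk. The function name `auc_from_scores`, the variable names, and all numeric values are hypothetical and made up for illustration; they are not data or code from the study, and the sketch assumes every patient has a known 2-y status (no censoring), which may differ from how the authors handled follow-up.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC for a binary outcome.

    Returns the probability that a randomly chosen event case (label 1)
    has a higher score than a randomly chosen non-event case (label 0).
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Illustrative, made-up values only (assumption: 1 = progression within 2 y).
progressed_within_2y = [1, 0, 1, 0, 0, 1]
predicted_risk = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]

auc = auc_from_scores(progressed_within_2y, predicted_risk)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 0.60 (IPI) through 0.71 (PET model).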