Reproducible radiomics through automated machine learning validated on twelve clinical applications
Martijn P. A. Starmans,Sebastian R. van der Voort,Thomas Phil,Milea J. M. Timbergen,Melissa Vos,Guillaume A. Padmos,Wouter Kessels,David Hanff,Dirk J. Grunhagen,Cornelis Verhoef,Stefan Sleijfer,Martin J. van den Bent,Marion Smits,Roy S. Dwarkasing,Christopher J. Els,Federico Fiduzi,Geert J. L. H. van Leenders,Anela Blazevic,Johannes Hofland,Tessa Brabander,Renza A. H. van Gils,Gaston J. H. Franssen,Richard A. Feelders,Wouter W. de Herder,Florian E. Buisman,Francois E. J. A. Willemssen,Bas Groot Koerkamp,Lindsay Angus,Astrid A. M. van der Veldt,Ana Rajicic,Arlette E. Odink,Mitchell Deen,Jose M. Castillo T.,Jifke Veenland,Ivo Schoots,Michel Renckens,Michail Doukas,Rob A. de Man,Jan N. M. IJzermans,Razvan L. Miclea,Peter B. Vermeulen,Esther E. Bron,Maarten G. Thomeer,Jacob J. Visser,Wiro J. Niessen,Stefan Klein
DOI: https://doi.org/10.48550/arXiv.2108.08618
2022-07-29
Abstract: Radiomics uses quantitative medical imaging features to predict clinical outcomes. Currently, in a new clinical application, finding the optimal radiomics method out of the wide range of available options has to be done manually through a heuristic trial-and-error process. In this study, we propose a framework for automatically optimizing the construction of radiomics workflows per application. To this end, we formulate radiomics as a modular workflow and include a large collection of common algorithms for each component. To optimize the workflow per application, we employ automated machine learning using a random search and ensembling. We evaluate our method in twelve different clinical applications, resulting in the following areas under the curve: 1) liposarcoma (0.83); 2) desmoid-type fibromatosis (0.82); 3) primary liver tumors (0.80); 4) gastrointestinal stromal tumors (0.77); 5) colorectal liver metastases (0.61); 6) melanoma metastases (0.45); 7) hepatocellular carcinoma (0.75); 8) mesenteric fibrosis (0.80); 9) prostate cancer (0.72); 10) glioma (0.71); 11) Alzheimer's disease (0.87); and 12) head and neck cancer (0.84). We show that our framework performs competitively with human experts, outperforms a radiomics baseline, and performs similarly or superior to Bayesian optimization and more advanced ensemble approaches. In conclusion, our method fully automatically optimizes the construction of radiomics workflows, thereby streamlining the search for radiomics biomarkers in new applications. To facilitate reproducibility and future research, we publicly release six datasets, the software implementation of our framework, and the code to reproduce this study.
Image and Video Processing,Computer Vision and Pattern Recognition
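
The abstract describes optimizing a modular workflow through random search followed by ensembling. The idea can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' released implementation: the search space, the synthetic data, the two toy classifiers, the single train/validation split, and the top-N averaging rule are all assumptions made for illustration, and every name (`SEARCH_SPACE`, `fit_workflow`, `run_search`) is hypothetical.

```python
import math
import random

# Hypothetical search space: one entry per workflow component.
SEARCH_SPACE = {
    "scaling": ["none", "minmax"],
    "n_features": [1, 2, 4],
    "classifier": ["centroid", "signed_sum"],
}


def make_data(rng, n):
    """Synthetic stand-in for radiomics features: 2 informative, 2 noise."""
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(1.5 * t, 1), rng.gauss(1.5 * t, 1),
          rng.gauss(0, 1), rng.gauss(0, 1)] for t in y]
    return X, y


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_workflow(config, X, y):
    """Fit one workflow: scaling, then feature selection, then a classifier."""
    dims = range(len(X[0]))
    if config["scaling"] == "minmax":
        lo = [min(r[j] for r in X) for j in dims]
        hi = [max(r[j] for r in X) for j in dims]
    else:
        lo, hi = [0.0 for _ in dims], [1.0 for _ in dims]

    def scale(row):
        return [(v - l) / ((h - l) or 1.0) for v, l, h in zip(row, lo, hi)]

    scaled = [scale(r) for r in X]
    pos = [r for r, t in zip(scaled, y) if t == 1]
    neg = [r for r, t in zip(scaled, y) if t == 0]
    mu_pos = [sum(r[j] for r in pos) / len(pos) for j in dims]
    mu_neg = [sum(r[j] for r in neg) / len(neg) for j in dims]
    # Keep the features with the largest difference between class means.
    keep = sorted(dims, key=lambda j: -abs(mu_pos[j] - mu_neg[j]))
    keep = keep[:config["n_features"]]

    def predict(row):
        r = scale(row)
        if config["classifier"] == "centroid":
            # Positive when the sample is closer to the positive centroid.
            s = sum((r[j] - mu_neg[j]) ** 2 - (r[j] - mu_pos[j]) ** 2
                    for j in keep)
        else:
            # Signed offset from the midpoint between the class means.
            s = sum((r[j] - (mu_pos[j] + mu_neg[j]) / 2)
                    * (1 if mu_pos[j] > mu_neg[j] else -1) for j in keep)
        return 1.0 / (1.0 + math.exp(-s))  # squash to a pseudo-probability

    return predict


def run_search(seed=0, n_iter=20, top_n=5):
    """Random search over workflows, then average the top_n on test data."""
    rng = random.Random(seed)
    X_tr, y_tr = make_data(rng, 100)
    X_val, y_val = make_data(rng, 100)
    X_te, y_te = make_data(rng, 100)
    candidates = []
    for _ in range(n_iter):
        config = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        model = fit_workflow(config, X_tr, y_tr)
        val_auc = auc(y_val, [model(r) for r in X_val])
        candidates.append((val_auc, config, model))
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    top = ranked[:top_n]
    # Ensemble: mean pseudo-probability of the best validation workflows.
    ensemble = [sum(m(r) for _, _, m in top) / len(top) for r in X_te]
    return ranked, auc(y_te, ensemble)


ranked, test_auc = run_search()
print("best validation AUC:", round(ranked[0][0], 3))
print("ensemble test AUC:", round(test_auc, 3))
```

Averaging the best few workflows, rather than keeping only the single top-ranked one, is meant to reduce sensitivity to one configuration that happens to score well on the validation split. The choice of `top_n` here is arbitrary.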