Influence of inter-observer delineation variability on radiomics stability in different tumor sites

Matea Pavic, Marta Bogowicz, Xaver Würms, Stefan Glatz, Tobias Finazzi, Oliver Riesterer, Johannes Roesch, Leonie Rudofsky, Martina Friess, Patrick Veit-Haibach, Martin Huellner, Isabelle Opitz, Walter Weder, Thomas Frauenfelder, Matthias Guckenberger, Stephanie Tanadini-Lang
DOI: https://doi.org/10.1080/0284186X.2018.1445283
Abstract:
Background: Radiomics is a promising methodology for quantitative analysis and description of radiological images using advanced mathematics and statistics. Tumor delineation, which is still often done manually, is an essential step in radiomics; however, inter-observer variability is a well-known source of uncertainty in radiation oncology. This study investigated the impact of inter-observer variability (IOV) in manual tumor delineation on the reliability of radiomic features (RF).
Methods: Three different tumor types were included: head and neck squamous cell carcinoma (HNSCC), malignant pleural mesothelioma (MPM) and non-small cell lung cancer (NSCLC). For each site, eleven individual tumors were contoured on CT scans by three experienced radiation oncologists. Dice coefficients (DC) were calculated to quantify delineation variability. RF were calculated with an in-house developed software implementation, which comprises 1404 features: shape (n = 18), histogram (n = 17), texture (n = 137) and wavelet (n = 1232). The IOV of RF was studied using the intraclass correlation coefficient (ICC). An ICC > 0.8 was taken to indicate good reproducibility. For the stable RF, average linkage hierarchical clustering was performed to identify classes of uncorrelated features.
Results: Median DC was high for NSCLC (0.86, range 0.57-0.90) and HNSCC (0.72, range 0.21-0.89), whereas it was low for MPM (0.26, range 0-0.9), indicating substantial IOV. The stability rate of RF correlated with DC and depended on tumor site, with high stability in NSCLC (90% of total parameters), acceptable stability in HNSCC (59% of total parameters) and low stability in MPM (36% of total parameters). Shape features showed the weakest stability across all tumor types. Hierarchical clustering revealed 14 groups of correlated and stable features for NSCLC and 6 groups for both HNSCC and MPM.
Conclusion: Inter-observer delineation variability has a relevant influence on radiomics analysis and is strongly influenced by tumor type. This leads to a reduced number of suitable imaging features.
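
The two agreement measures named in the Methods can be illustrated with a minimal sketch. It is not the authors' in-house software. The function names `dice_coefficient` and `icc_2_1` are hypothetical, and because the abstract does not state which ICC form was used, the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty contours count as full agreement.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def icc_2_1(values):
    """ICC(2,1) for an (n_tumors, n_observers) array of one feature.

    Assumption: two-way random effects, absolute agreement, single rater.
    The abstract does not specify the ICC form that was actually used.
    """
    x = np.asarray(values, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares of the two-way ANOVA decomposition.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between tumors
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between observers
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

In the setting described above, `values` would hold one feature computed on the eleven tumors of a site by the three observers (an 11 x 3 array), and the feature would be counted as stable when the result exceeds 0.8.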