International Benchmark for Total Metabolic Tumor Volume Measurement in Baseline 18F-FDG PET/CT of Lymphoma Patients: A Milestone Toward Clinical Implementation

Ronald Boellaard, Irène Buvat, Christophe Nioche, Luca Ceriani, Anne-Ségolène Cottereau, Luca Guerra, Rodney J. Hicks, Salim Kanoun, Carsten Kobe, Annika Loft, Heiko Schöder, Annibale Versari, Conrad-Amadeus Voltin, Gerben J.C. Zwezerijnen, Josée M. Zijlstra, N. George Mikhaeel, Andrea Gallamini, Tarec C. El-Galaly, Christine Hanoun, Stephane Chauvie, Romain Ricci, Emanuele Zucca, Michel Meignan, Sally F. Barrington
DOI: https://doi.org/10.2967/jnumed.124.267789
2024-09-04
Journal of Nuclear Medicine
Abstract: Total metabolic tumor volume (TMTV) is prognostic in lymphoma. However, cutoff values for risk stratification vary markedly, according to the tumor delineation method used. We aimed to create a standardized TMTV benchmark dataset allowing TMTV to be tested and applied as a reproducible biomarker. Methods: Sixty baseline 18F-FDG PET/CT scans were identified with a range of disease distributions (20 follicular, 20 Hodgkin, and 20 diffuse large B-cell lymphoma). TMTV was measured by 12 nuclear medicine experts, each analyzing 20 cases split across subtypes, with each case processed by 3–4 readers. LIFEx or ACCURATE software was chosen according to reader preference. Analysis was performed stepwise: TMTV1 with automated preselection of lesions using an SUV of at least 4 and a volume of at least 3 cm³ with single-click removal of physiologic uptake; TMTV2 with additional removal of reactive bone marrow and spleen with single clicks; TMTV3 with manual editing to remove other physiologic uptake, if required; and TMTV4 with optional addition of lesions using mouse clicks with an SUV of at least 4 (no volume threshold). Results: The final TMTV (TMTV4) ranged from 8 to 2,288 cm³, showing excellent agreement among all readers in 87% of cases (52/60) with a difference of less than 10% or less than 10 cm³. In 70% of the cases, TMTV4 equaled TMTV1, requiring no additional reader interaction. Differences in the TMTV4 were exclusively related to reader interpretation of lesion inclusion or physiologic high-uptake region removal, not to the choice of software. For 5 cases, large TMTV differences (>25%) were due to disagreement about inclusion of diffuse splenic uptake. Conclusion: The proposed segmentation method enabled highly reproducible TMTV measurements, with minimal reader interaction in 70% of the patients. The inclusion or exclusion of diffuse splenic uptake requires definition of specific criteria according to lymphoma subtype. The publicly available proposed benchmark allows comparison of study results and could serve as a reference to test improvements using other segmentation approaches.
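The automated preselection step (SUV of at least 4, volume of at least 3 cm³) and the agreement criterion (difference of less than 10% or less than 10 cm³) can be illustrated with a minimal sketch. This is not the LIFEx or ACCURATE implementation. The function names `preselect_tmtv` and `readers_agree`, the use of 6-connectivity to group voxels into regions, and the choice to measure the relative difference as the reader spread divided by the reader mean are all assumptions made for illustration; the abstract does not specify them. The sketch also omits the reader-driven removal of physiologic uptake.

```python
from collections import deque

# Face-adjacent neighbours (6-connectivity); the connectivity rule is an assumption.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def preselect_tmtv(suv, voxel_volume_cm3, suv_min=4.0, vol_min_cm3=3.0):
    """Sum the volume of connected regions with SUV >= suv_min and volume >= vol_min_cm3.

    `suv` is a 3D nested list indexed as suv[z][y][x] (hypothetical input layout).
    """
    nz, ny, nx = len(suv), len(suv[0]), len(suv[0][0])
    seen = set()
    total = 0.0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (z, y, x) in seen or suv[z][y][x] < suv_min:
                    continue
                # Breadth-first flood fill to collect one connected region.
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                count = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    count += 1
                    for dz, dy, dx in NEIGHBOURS:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        if (0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                                and (pz, py, px) not in seen
                                and suv[pz][py][px] >= suv_min):
                            seen.add((pz, py, px))
                            queue.append((pz, py, px))
                volume = count * voxel_volume_cm3
                # Keep only regions that reach the minimum volume.
                if volume >= vol_min_cm3:
                    total += volume
    return total


def readers_agree(tmtvs_cm3, rel_tol=0.10, abs_tol_cm3=10.0):
    """True if the reader spread is < abs_tol_cm3 or < rel_tol of the reader mean.

    Using (max - min) / mean as the relative difference is an assumption.
    """
    spread = max(tmtvs_cm3) - min(tmtvs_cm3)
    mean = sum(tmtvs_cm3) / len(tmtvs_cm3)
    return spread < abs_tol_cm3 or (mean > 0 and spread / mean < rel_tol)
```

For example, on a toy one-row volume with 1 cm³ voxels, `preselect_tmtv([[[5, 5, 5, 1, 6, 1]]], 1.0)` keeps the three-voxel region and discards the isolated single voxel, because the latter falls below the 3 cm³ threshold.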
radiology, nuclear medicine & medical imaging