Deep Learning for Automatic Bone Marrow Apparent Diffusion Coefficient Measurements From Whole-Body Magnetic Resonance Imaging in Patients With Multiple Myeloma: A Retrospective Multicenter Study
Markus Wennmann, Peter Neher, Nikolas Stanczyk, Kim-Celine Kahl, Jessica Kächele, Vivienn Weru, Thomas Hielscher, Martin Grözinger, Jiri Chmelik, Kevin Sun Zhang, Fabian Bauer, Tobias Nonnenmacher, Manuel Debic, Sandra Sauer, Lukas Thomas Rotkopf, Anna Jauch, Kai Schlamp, Elias Karl Mai, Niels Weinhold, Saif Afat, Marius Horger, Hartmut Goldschmidt, Heinz-Peter Schlemmer, Tim Frederik Weber, Stefan Delorme, Felix Tobias Kurz, Klaus Maier-Hein
DOI: https://doi.org/10.1097/RLI.0000000000000932
2023-04-01
Abstract: Objectives: Diffusion-weighted magnetic resonance imaging (MRI) is increasingly important in patients with multiple myeloma (MM). The objective of this study was to train and test an algorithm for automatic pelvic bone marrow analysis from whole-body apparent diffusion coefficient (ADC) maps in patients with MM, which automatically segments pelvic bones and subsequently extracts objective, representative ADC measurements from each bone. Materials and methods: In this retrospective multicentric study, 180 MRIs from 54 patients were annotated (semi)manually and used to train an nnU-Net for automatic, individual segmentation of the right hip bone, the left hip bone, and the sacral bone. The quality of the automatic segmentation was evaluated on 15 manually segmented whole-body MRIs from 3 centers using the Dice score. In 3 independent test sets from 3 centers, which comprised a total of 312 whole-body MRIs, agreement between automatically extracted mean ADC values from the nnU-Net segmentation and manual ADC measurements from 2 independent radiologists was evaluated. Bland-Altman plots were constructed, and absolute bias, relative bias to mean, limits of agreement, and coefficients of variation were calculated. In 56 patients with newly diagnosed MM who had undergone bone marrow biopsy, ADC measurements were correlated with biopsy results using Spearman correlation. Results: The ADC-nnU-Net achieved automatic segmentations with mean Dice scores of 0.92, 0.93, and 0.85 for the right pelvis, the left pelvis, and the sacral bone, whereas the interrater experiment gave mean Dice scores of 0.86, 0.86, and 0.77, respectively.
The agreement between radiologists' manual ADC measurements and automatic ADC measurements was as follows: the bias between the first reader and the automatic approach was 49 × 10⁻⁶ mm²/s, 7 × 10⁻⁶ mm²/s, and -58 × 10⁻⁶ mm²/s, and the bias between the second reader and the automatic approach was 12 × 10⁻⁶ mm²/s, 2 × 10⁻⁶ mm²/s, and -66 × 10⁻⁶ mm²/s for the right pelvis, the left pelvis, and the sacral bone, respectively. The bias between reader 1 and reader 2 was 40 × 10⁻⁶ mm²/s, 8 × 10⁻⁶ mm²/s, and 7 × 10⁻⁶ mm²/s, and the mean absolute difference between manual readers was 84 × 10⁻⁶ mm²/s, 65 × 10⁻⁶ mm²/s, and 75 × 10⁻⁶ mm²/s. Automatically extracted ADC values significantly correlated with bone marrow plasma cell infiltration (R = 0.36, P = 0.007). Conclusions: In this study, an nnU-Net was trained that can automatically segment pelvic bone marrow from whole-body ADC maps in multicentric data sets with a quality comparable to manual segmentations. This approach allows automatic, objective bone marrow ADC measurements, which agree well with manual ADC measurements and can help to overcome interrater variability or nonrepresentative measurements. Automatically extracted ADC values significantly correlate with bone marrow plasma cell infiltration and might be of value for automatic staging, risk stratification, or therapy response assessment.
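
For readers unfamiliar with the agreement metrics named in the abstract, the following is a minimal Python sketch of how a Dice score and basic Bland-Altman statistics are commonly computed. It is not the authors' implementation: the function names `dice_score` and `bland_altman`, the use of 1.96 standard deviations for the limits of agreement, and the definition of relative bias as bias divided by the grand mean of both measurements are assumptions based on conventional usage. The abstract does not specify these details, and the coefficient of variation is omitted because its definition varies between studies.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap between two binary segmentation masks (1.0 = identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def bland_altman(x, y):
    """Basic Bland-Altman statistics for paired measurements x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = x - y
    bias = diff.mean()                  # mean difference (absolute bias)
    sd = diff.std(ddof=1)               # sample SD of the differences
    grand_mean = ((x + y) / 2.0).mean()
    return {
        "bias": float(bias),
        # Assumption: relative bias is the bias divided by the grand mean.
        "relative_bias": float(bias / grand_mean),
        # Assumption: conventional 95% limits of agreement (bias +/- 1.96 SD).
        "loa_lower": float(bias - 1.96 * sd),
        "loa_upper": float(bias + 1.96 * sd),
        "mean_abs_diff": float(np.abs(diff).mean()),
    }
```

In this sketch, `x` and `y` would hold one mean ADC value per examination from two sources (for example, a reader and the automatic approach), in the same units, and the masks passed to `dice_score` would be voxel arrays of equal shape.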