The Impact of Fatty Infiltration on MRI Segmentation of Lower Limb Muscles in Neuromuscular Diseases: A Comparative Study of Deep Learning Approaches
Marc-Adrien Hostin, Augustin C Ogier, Constance P Michel, Yann Le Fur, Maxime Guye, Shahram Attarian, Etienne Fortanier, Marc-Emmanuel Bellemare, David Bendahan
DOI: https://doi.org/10.1002/jmri.28708
Abstract:
Background: Deep learning methods have been shown to be useful for segmenting lower limb muscle MRIs of healthy subjects but have not been sufficiently evaluated on patients with neuromuscular diseases (NMDs).
Purpose: To evaluate the influence of fat infiltration on convolutional neural network (CNN) segmentation of MRIs from NMD patients.
Study type: Retrospective study.
Subjects: Data were collected from a hospital database of 67 patients with NMDs and 14 controls (age: 53 ± 17 years, sex: 48 M, 33 F). Ten individual muscles were segmented from the thigh and six from the calf (20 slices, 200 cm section).
Field strength/sequence: 1.5 T. 2D T1-weighted fast spin echo; three-point Dixon 3D GRE for fat fraction (FF); 3D MT-prepared GRE for magnetization transfer ratio (MTR); 2D multispin-echo sequence for T2.
Assessment: U-Net 2D, U-Net 3D, TransUNet, and HRNet were trained to segment thigh and leg muscles (101/11 and 95/11 training/validation images, 10-fold cross-validation). Automatic and manual segmentations were compared based on geometric criteria (Dice coefficient [DSC], outlier rate, absence rate) and on the reliability of the measured MRI quantities (FF, MTR, T2, volume).
Statistical tests: Bland-Altman plots were used to describe the agreement between manually and automatically estimated FF, MTR, T2, and volume. Comparisons were made between muscle populations with an FF greater than 20% (G20+) and lower than 20% (G20-).
Results: The CNNs achieved equivalent results, yet only HRNet recognized every muscle in the database, with a DSC of 0.91 ± 0.08 and measurement biases reaching -0.32% ± 0.92% for FF, 0.19 ± 0.77 for MTR, -0.55 ± 1.95 msec for T2, and -0.38 ± 3.67 cm³ for volume. The performance of HRNet decreased significantly from G20- to G20+.
Data conclusion: HRNet was the most appropriate network, as it did not omit any muscle. The accuracy obtained shows that CNNs could provide fully automated methods for studying NMDs. However, the accuracy of the methods may be degraded on the most infiltrated muscles (FF > 20%).
Evidence level: 4.
Technical efficacy: Stage 1.
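The two agreement measures named in the abstract, the Dice coefficient and the Bland-Altman bias, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the flat 0/1 mask representation, the "automatic minus manual" sign convention, and the 1.96 × SD limits of agreement are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def dice_coefficient(manual, automatic):
    """Dice similarity coefficient between two binary masks.

    Masks are assumed to be flat sequences of 0/1 voxel labels
    of equal length (a hypothetical representation).
    """
    if len(manual) != len(automatic):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for m, a in zip(manual, automatic) if m and a)
    total = sum(1 for m in manual if m) + sum(1 for a in automatic if a)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


def bland_altman(manual_values, automatic_values):
    """Return (bias, sd, lower_limit, upper_limit) of the paired differences.

    Differences are computed as automatic minus manual (assumed convention);
    the limits of agreement are bias ± 1.96 × SD.
    """
    diffs = [a - m for m, a in zip(manual_values, automatic_values)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation, needs at least 2 pairs
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    print(bland_altman([10.0, 20.0, 30.0], [11.0, 19.0, 33.0]))
```

A result reported as "bias ± SD" (for example, the FF bias in the Results) corresponds to the first two values returned by `bland_altman`, applied here per muscle to the quantity measured inside the manual and the automatic masks.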