Biometry and volumetry in multi-centric fetal brain MRI: assessing the bias of super-resolution reconstruction
Thomas Sanchez, Angeline Mihailov, Mériam Koob, Nadine Girard, Aurélie Manchon, Ignacio Valenzuela, Marta Gómez-Chiari, Gerard Martí Juan, Alexandre Pron, Elisenda Eixarch, Gemma Piella, Miguel Angel González Ballester, Oscar Camara, Vincent Dunet, Guillaume Auzias, Meritxell Bach Cuadra
DOI: https://doi.org/10.1101/2024.09.23.24313965
2024-09-24
Abstract: Super-resolution reconstruction (SRR) of fetal brain magnetic resonance imaging has the potential to enable the development of new imaging biomarkers to better study in utero neurodevelopment. However, potential biases in 2D biometric and 3D volumetric measurements due to different SRR techniques remain understudied.
Purpose: To assess the consistency of biometric and volumetric measurements across three hospitals using three widely used SRR pipelines.
Materials and Methods: This retrospective study used T2-weighted (T2w) fetal brain MRI scans acquired in routine clinical practice at three hospitals. The scans of each subject were reconstructed with each of the three SRR methods. Four experts performed biometric measurements on each SRR volume, blinded to the method used. Automated 3D volumetry was performed using a state-of-the-art segmentation method. A univariate analysis was first carried out using Friedman tests with post-hoc Wilcoxon rank-sum tests, and the results were confirmed in a multivariate analysis accounting for the effects of gestational age and rater, using a t-distributed generalized additive model. An additional qualitative evaluation assessed how likely clinicians would be to use the current SRR volumes in their practice, and whether they would prefer them to low-resolution T2w acquisitions. Differences in the qualitative ratings were also assessed with Friedman tests and post-hoc Wilcoxon rank-sum tests.
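As an illustration of the univariate step described above, the following is a minimal sketch of a Friedman test for one measurement taken on the same subjects under three reconstruction methods. It is not the authors' analysis code: the function names and the example values are hypothetical, and the p-value uses the chi-square approximation, which for three methods (two degrees of freedom) reduces to exp(-Q/2).

```python
import math


def average_ranks(values):
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def friedman_three_methods(rows):
    """Friedman test for n subjects, each measured with 3 methods.

    rows: list of [method_a, method_b, method_c] values, one per subject.
    Returns (statistic, p_value) using the tie-corrected statistic and
    the chi-square approximation with 2 degrees of freedom.
    """
    k = 3
    n = len(rows)
    if n == 0 or any(len(row) != k for row in rows):
        raise ValueError("expected a non-empty list of 3-value rows")

    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for v in set(row):
            t = row.count(v)
            tie_term += t ** 3 - t

    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1.0 - tie_term / (n * k * (k * k - 1))
    if correction == 0.0:
        raise ValueError("all measurements are tied within every subject")
    q /= correction

    # Chi-square survival function with 2 degrees of freedom is exp(-q / 2).
    return q, math.exp(-q / 2.0)


# Hypothetical measurements in mm (made up for illustration only):
# one row per subject, one column per SRR method.
hypothetical_measurements_mm = [
    [61.2, 61.5, 61.9],
    [70.1, 70.4, 70.6],
    [55.0, 55.3, 55.8],
    [66.7, 66.9, 67.4],
]
statistic, p_value = friedman_three_methods(hypothetical_measurements_mm)
print(f"Friedman statistic = {statistic:.3f}, p = {p_value:.4f}")
```

A significant result would then be followed by pairwise post-hoc comparisons between methods, as described above. Note that statistical significance alone says nothing about effect size, which is why measured differences are also compared with the voxel width.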
Results: Eighty-four healthy subjects were included in three gestational age groups ([21-28): 25.4±1.9 weeks, [28-32): 29.3±1.3 weeks, [32-36): 33.5±1.2 weeks). Statistically significant differences in biometric measurements were found, but they consistently remained below the voxel width (0.8 mm). Automated 3D volumetry revealed systematic but very small effects (<2.8%). The qualitative evaluation showed systematic differences between SRR methods in the perception of white matter intensity (p=0.02) and image sharpness (p=0.01).
Conclusion: 2D and 3D quantitative measurements did not show any large systematic bias across SRR methods when used for radiological assessment in clinical routine across multiple centers, scanners, and raters.
Obstetrics and Gynecology