nnU-Net Revisited: A Call for Rigorous Validation in 3D Medical Image Segmentation

Fabian Isensee, Tassilo Wald, Constantin Ulrich, Michael Baumgartner, Saikat Roy, Klaus Maier-Hein, Paul F. Jaeger

2024-07-25

Abstract: The release of nnU-Net marked a paradigm shift in 3D medical image segmentation, demonstrating that a properly configured U-Net architecture could still achieve state-of-the-art results. Despite this, the pursuit of novel architectures, and the respective claims of superior performance over the U-Net baseline, continued. In this study, we demonstrate that many of these recent claims fail to hold up when scrutinized for common validation shortcomings, such as the use of inadequate baselines, insufficient datasets, and neglected computational resources. By meticulously avoiding these pitfalls, we conduct a thorough and comprehensive benchmarking of current segmentation methods including CNN-based, Transformer-based, and Mamba-based approaches. In contrast to current beliefs, we find that the recipe for state-of-the-art performance is 1) employing CNN-based U-Net models, including ResNet and ConvNeXt variants, 2) using the nnU-Net framework, and 3) scaling models to modern hardware resources. These results indicate an ongoing innovation bias towards novel architectures in the field and underscore the need for more stringent validation standards in the quest for scientific progress.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper attempts to address the issue in the field of 3D medical image segmentation where many newly proposed models and methods claim to surpass the classic nnU-Net baseline model, but these claims often lack rigorous validation. Specifically, the authors point out the following major issues:

1. **Insufficient Benchmarking**: Many new methods use datasets that are insufficient in quantity and quality during validation, making it impossible to comprehensively evaluate their performance.
2. **Unfair Comparisons**: Some studies combine innovations with additional performance enhancement techniques (such as residual connections, self-supervised pre-training, etc.), making it difficult to fairly compare the results with the baseline model.
3. **Hardware Resource Differences**: Some studies conduct experiments under different hardware conditions, leading to incomparable results.
4. **Lack of Standardized Benchmarks**: Many studies do not use strictly configured baseline models, casting doubt on the reliability of the results.

To address these issues, the authors propose a series of systematic validation standards and re-evaluate current popular 3D medical image segmentation methods through large-scale benchmarking. Their goal is to promote more rigorous method validation in the field, thereby fostering genuine scientific progress.

3D U$^2$-Net: A 3D Universal U-Net for Multi-domain Medical Image Segmentation

nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation

Medical Image Segmentation Review: The success of U-Net

Automated Design of Deep Learning Methods for Biomedical Image Segmentation

Revisiting MAE pre-training for 3D medical image segmentation

nnSegNeXt: A 3D Convolutional Network for Brain Tissue Segmentation Based on Quality Evaluation

3D Multiple-Contextual ROI-Attention Network for Efficient and Accurate Volumetric Medical Image Segmentation

One Network to Segment Them All: A General, Lightweight System for Accurate 3D Medical Image Segmentation

Investigation and benchmarking of U-Nets on prostate segmentation tasks

DC-UNet: Rethinking the U-Net Architecture with Dual Channel Efficient CNN for Medical Images Segmentation

U-Net-Based Models towards Optimal MR Brain Image Segmentation

Effect of Metabolites of γ-Aminobutyric Shunt on Activities of NAD- and NADP-Isocitrate Dehydrogenases and Aconitate Hydratase from Higher Plants

R2U++: a multiscale recurrent residual U-Net with dense skip connections for medical image segmentation

Benchmarking of Deep Architectures for Segmentation of Medical Images

UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation

UNet Architectures in Multiplanar Volumetric Segmentation -- Validated on Three Knee MRI Cohorts

U-Net-Based Medical Image Segmentation

LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation

Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation

Performance Analysis of UNet and Variants for Medical Image Segmentation