Abstract: In recent years, "U-shaped" neural networks with encoder-decoder structures have become popular in medical image segmentation, and many variants of this architecture have been developed. The evaluation of these models, however, has received less attention than their development. In response, we propose a comprehensive multi-indicator, multi-organ evaluation method for medical image segmentation models (named MIMO). MIMO allows models to generate independent thresholds, which are then combined with multi-indicator evaluation and confidence estimation to screen and measure each organ. As a result, MIMO offers detailed information on the segmentation of each organ in each sample, thereby aiding developers in analyzing and improving the model. Additionally, MIMO produces concise usability and comprehensiveness scores for different models. Models with higher scores are considered better, which is convenient for clinical evaluation. We test eight different medical image segmentation models on two abdominal multi-organ datasets and evaluate them from four perspectives: correctness, confidence estimation, Usable Region, and MIMO. Robustness experiments are also conducted. Experimental results demonstrate that MIMO offers novel insights into multi-indicator and multi-organ medical image evaluation and provides a specific and concise measure of the usability and comprehensiveness of a model. Code: https://github.com/SCUT-ML-GUO/MIMO

What problem does this paper attempt to address?

The paper attempts to address the issue of insufficient evaluation methods in multi-metric and multi-organ medical image segmentation models. Although "U-shaped" neural networks have achieved significant success in the field of medical image segmentation, and various variants (such as Attention U-Net, nnU-Net, etc.) have emerged, the evaluation methods for these models have received less attention compared to model development. Traditional evaluation methods mainly focus on accuracy metrics (such as Dice coefficient, Hausdorff distance), and while these metrics are important, they are not sufficient to comprehensively evaluate the model's practicality in clinical practice.

The paper proposes a new multi-metric and multi-organ medical image segmentation model evaluation method (MIMO), which aims to screen and measure each organ by generating independent thresholds and combining multi-metric evaluation and confidence estimation. Specifically, MIMO allows the model to generate independent thresholds and screen sample organs by jointly ranking prediction correctness indices and confidence estimates. Then, MIMO provides feedback on whether each organ in each sample meets the standard, thereby aiding subsequent analysis and model improvement. Additionally, MIMO outputs usability and comprehensiveness scores in the form of "regions" to facilitate intuitive evaluation of different models.

The main contributions of the paper include:

1. Proposing a new multi-metric and multi-organ medical image segmentation model evaluation method that allows the model to automatically generate thresholds and screen sample organs through these thresholds.
2. Evaluating the thresholds of each organ under each evaluation metric using the Bootstrap algorithm.
3. Providing detailed methods for calculating usability and comprehensiveness scores, helping developers better analyze and improve models.
4. Validating the effectiveness of the proposed method and demonstrating its advantages in robustness by testing and evaluating eight different medical image segmentation models on two public datasets.

Through this method, the researchers hope to promote more clinically-oriented model evaluation and development.

Evaluation of Multi-indicator And Multi-organ Medical Image Segmentation Models

Multi-Modal Evaluation Approach for Medical Image Segmentation

Mmy-net: a Multimodal Network Exploiting Image and Patient Metadata for Simultaneous Segmentation and Diagnosis

Mutual Information-Based Graph Co-Attention Networks for Multimodal Prior-Guided Magnetic Resonance Imaging Segmentation

MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training

SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation

Segment Anything Model for Medical Images?

M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts

Comprehensive Multimodal Segmentation in Medical Imaging: Combining YOLOv8 with SAM and HQ-SAM Models

Complementary Information Mutual Learning for Multimodality Medical Image Segmentation

A medical image segmentation method for rectal tumors based on multi‐scale feature retention and multiple attention mechanisms

IMIIN: An inter-modality information interaction network for 3D multi-modal breast tumor segmentation

UniMOS: A Universal Framework For Multi-Organ Segmentation Over Label-Constrained Datasets

Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation

Indescribable Multi-modal Spatial Evaluator

Unified semantic model for medical image segmentation

Multi-Organ Foundation Model for Universal Ultrasound Image Segmentation with Task Prompt and Anatomical Prior

Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

Inter-Rater Uncertainty Quantification in Medical Image Segmentation via Rater-Specific Bayesian Neural Networks

TG-LMM: Enhancing Medical Image Segmentation Accuracy through Text-Guided Large Multi-Modal Model

A new solution model for cardiac medical image segmentation