Multi-scale Cascaded Large-Model for Whole-body ROI Segmentation

Rui Hao, Dayu Tan, Yansen Su, Chunhou Zheng
2024-11-23
Abstract: Organs-at-risk segmentation is critical for ensuring the safety and precision of radiotherapy and surgical procedures. However, existing methods for organs-at-risk image segmentation often suffer from uncertainties and biases in target selection, as well as insufficient model validation experiments, limiting their generality and reliability in practical applications. To address these issues, we propose an innovative cascaded network architecture called the Multi-scale Cascaded Fusing Network (MCFNet), which effectively captures complex multi-scale and multi-resolution features. MCFNet includes a Sharp Extraction Backbone and a Flexible Connection Backbone, which respectively enhance feature extraction in the downsampling and skip-connection stages. This design not only improves segmentation accuracy but also ensures computational efficiency, enabling precise detail capture even in low-resolution images. We conduct experiments using the A6000 GPU on diverse datasets from 671 patients, including 36,131 image-mask pairs across 10 different datasets. MCFNet demonstrates strong robustness, performing consistently well across 10 datasets. Additionally, MCFNet exhibits excellent generalizability, maintaining high accuracy in different clinical scenarios. We also introduce an adaptive loss aggregation strategy to further optimize the model training process, improving both segmentation accuracy and efficiency. Through extensive validation, MCFNet demonstrates superior performance compared to existing methods, providing more reliable image-guided support. Our solution aims to significantly improve the precision and safety of radiotherapy and surgical procedures, advancing personalized treatment. The code has been made available on GitHub: https://github.com/Henry991115/MCFNet
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the challenge of accurately segmenting organs-at-risk (OAR) for radiotherapy and surgical procedures. According to the authors, existing OAR image segmentation methods have two main problems:

1. **Uncertainty and bias in target selection**: Existing methods are often uncertain and biased in how they select the target region, which reduces segmentation accuracy.
2. **Insufficient model validation experiments**: Existing methods are validated on too few experiments, which limits their generalization ability and reliability in practical applications.

To address these problems, the authors propose a cascaded network architecture named Multi-scale Cascaded Fusing Network (MCFNet). MCFNet improves segmentation accuracy and computational efficiency by capturing complex multi-scale and multi-resolution features, and it can capture fine detail even in low-resolution images.

### Main contributions

1. **Novel multi-scale cascaded large model**: A multi-scale cascaded U-shaped network is proposed that cascades two purpose-designed backbone networks, the FCB (Flexible Connecting Backbone) and the SEB (Sharp Extracting Backbone). The two backbones process images at different scales and resolutions, and their features are fused at the four skip connections and at the bottom layer of the downsampling path.
2. **Adaptive loss aggregation strategy**: A new strategy named Adaptive Multi-scale Feature-Mixing Loss Aggregation (Adaptive-MFA) is introduced. It combines the available prediction maps to optimize the model training process.
3. **Extensive verification of robustness and generalization ability**: To verify the robustness and generalization ability of the model, the authors run experiments on ten datasets covering different organs-at-risk and tumors throughout the body. These datasets span segmentation regions in the brain, thorax, abdomen, and lower limbs, and two of them are provided by partner hospitals.

### Method overview

MCFNet consists of three main parts: cascaded backbone networks, a decoder with cascaded skip connections, and an aggregation module.

1. **Cascaded backbone networks**: The SEB (Sharp Extracting Backbone) and the FCB (Flexible Connecting Backbone) each extract features from images of a different resolution.
2. **Decoder with cascaded skip connections**: Skip connections and bridging layers aggregate features, strengthening the feature representation and making full use of information from different levels and sub-networks.
3. **Aggregation module**: This module aggregates the decoder output layers and applies the adaptive loss aggregation strategy.

### Experimental results

The authors conducted experiments on ten datasets covering organs-at-risk and tumors throughout the body. The reported results show that MCFNet is robust and generalizes well across these datasets while improving segmentation accuracy and efficiency. On this basis, the authors present MCFNet as reliable support for improving the accuracy and safety of radiotherapy and surgical procedures and for advancing personalized treatment.
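
### Illustrative sketch: aggregating losses over mixed prediction maps

The summary says only that Adaptive-MFA combines the available prediction maps during training. It does not say how the maps are mixed or weighted. The sketch below is therefore one possible reading and not the authors' implementation. It assumes that every non-empty subset of the decoder prediction maps is summed element-wise into a mixed map, and that a soft Dice loss on each mixed map is added to the total with equal weight. The names `mix_prediction_maps`, `soft_dice_loss`, and `aggregated_loss` are hypothetical, and maps are flattened to plain Python lists to keep the example self-contained.

```python
import math
from itertools import combinations


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss for one flattened binary mask (assumed loss choice)."""
    probs = [sigmoid(z) for z in logits]
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def mix_prediction_maps(prediction_maps):
    """Element-wise sum of every non-empty subset of the maps.

    With n maps this yields 2**n - 1 mixed maps (assumed mixing rule).
    """
    mixed = []
    for size in range(1, len(prediction_maps) + 1):
        for subset in combinations(prediction_maps, size):
            mixed.append([sum(values) for values in zip(*subset)])
    return mixed


def aggregated_loss(prediction_maps, target):
    """Sum the loss over all mixed maps with equal weights.

    Any adaptive weighting is not described in the summary and is omitted.
    """
    return sum(soft_dice_loss(m, target) for m in mix_prediction_maps(prediction_maps))


if __name__ == "__main__":
    target = [1, 0, 1, 0]
    # Two hypothetical decoder outputs (logits) for a four-pixel mask.
    maps = [[5, -5, 5, -5], [4, -4, 4, -4]]
    print(len(mix_prediction_maps(maps)), aggregated_loss(maps, target))
```

Because the number of mixed maps grows as 2^n - 1, this kind of aggregation is only practical for a small number of decoder outputs. A real training loop would compute the same quantity on tensors in a deep learning framework so that gradients flow through every mixed map.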