ADNet++: A few-shot learning framework for multi-class medical image volume segmentation with uncertainty-guided feature refinement

Stine Hansen, Srishti Gautam, Suaiba Amina Salahuddin, Michael Kampffmeyer, Robert Jenssen
DOI: https://doi.org/10.1016/j.media.2023.102870
Abstract: A major barrier to applying deep segmentation models in the medical domain is that they are typically data-hungry, requiring experts to collect and label large amounts of training data. In response, prototypical few-shot segmentation (FSS) models have recently gained traction as data-efficient alternatives. Despite recent progress, these models still have essential shortcomings. In this work, we focus on three of them: (i) the lack of uncertainty estimation, (ii) the lack of a guiding mechanism to help locate edges and encourage spatial consistency in the segmentation maps, and (iii) the models' inability to perform one-step multi-class segmentation. Without modifying or requiring a specific backbone architecture, we propose a modified prototype extraction module that facilitates the computation of uncertainty maps in prototypical FSS models, and show that the resulting maps are useful indicators of model uncertainty. To improve the segmentation around boundaries and to encourage spatial consistency, we propose a novel feature refinement module that leverages structural information in the input space to help guide the segmentation in the feature space. Furthermore, we demonstrate how uncertainty maps can be used to automatically guide this feature refinement. Finally, to avoid the ambiguous voxel predictions that occur when images are segmented class-by-class, we propose a procedure to perform one-step multi-class FSS. The effectiveness of our proposed methodology is evaluated on two representative datasets for abdominal organ segmentation (the CHAOS dataset and the BTCV dataset) and one dataset for cardiac segmentation (the MS-CMRSeg dataset). The results show that our proposed methodology significantly (one-sided Wilcoxon signed-rank test, p<0.05) improves on the baseline, increasing the overall Dice score by 5.2, 5.1, and 2.8 percentage points on the CHAOS dataset, the BTCV dataset, and the MS-CMRSeg dataset, respectively.
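The reported improvements are measured in Dice score, so it may help to see how that metric is usually computed for a multi-class label map. The following is a minimal sketch, not the authors' evaluation code: the function names dice_score and mean_dice, the flattened label lists, and the convention that a class absent from both maps scores 1.0 are all assumptions made for illustration.

```python
def dice_score(pred, target, label):
    """Dice overlap for one class between two equally sized label sequences."""
    pred_fg = [p == label for p in pred]      # voxels predicted as `label`
    target_fg = [t == label for t in target]  # voxels annotated as `label`
    intersection = sum(p and t for p, t in zip(pred_fg, target_fg))
    total = sum(pred_fg) + sum(target_fg)
    # Assumed convention: a class missing from both maps counts as full agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Average the per-class Dice scores over the given foreground labels."""
    return sum(dice_score(pred, target, c) for c in labels) / len(labels)


# Toy flattened label maps: 0 = background, 1 and 2 = two hypothetical organs.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]

per_class = {c: dice_score(pred, target, c) for c in (1, 2)}
overall = mean_dice(pred, target, labels=(1, 2))
```

In this toy case class 1 overlaps on one voxel out of three counted (Dice 2/3) and class 2 on two out of five (Dice 0.8). A gain of 5.2 percentage points corresponds to the averaged score rising by 0.052 on this 0 to 1 scale.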
What problem does this paper attempt to address?