Exploring Weakly Supervised Semantic Segmentation Ensembles for Medical Imaging Systems

Erik Ostrowski,Bharath Srinivas Prabakaran,Muhammad Shafique
DOI: https://doi.org/10.48550/arXiv.2303.07896
2023-03-16
Abstract:Reliable classification and detection of certain medical conditions, in images, with state-of-the-art semantic segmentation networks, require vast amounts of pixel-wise annotation. However, the public availability of such datasets is minimal. Therefore, semantic segmentation with image-level labels presents a promising alternative to this problem. Nevertheless, very few works have focused on evaluating this technique and its applicability to the medical sector. Due to their complexity and the small number of training examples in medical datasets, classifier-based weakly supervised networks like class activation maps (CAMs) struggle to extract useful information from them. However, most state-of-the-art approaches rely on them to achieve their improvements. Therefore, we propose a framework that can still utilize the low-quality CAM predictions of complicated datasets to improve the accuracy of our results. Our framework achieves that by first utilizing lower threshold CAMs to cover the target object with high certainty; second, by combining multiple low-threshold CAMs that even out their errors while highlighting the target object. We performed exhaustive experiments on the popular multi-modal BRATS and prostate DECATHLON segmentation challenge datasets. Using the proposed framework, we have demonstrated an improved dice score of up to 8% on BRATS and 6% on DECATHLON datasets compared to the previous state-of-the-art.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve This paper aims to address the training challenges in medical image semantic segmentation due to the lack of pixel-level annotation data. Specifically, current state-of-the-art semantic segmentation networks require a large amount of pixel-level annotated data to achieve high-precision prediction results, but such data is very scarce in the medical field. Therefore, the authors propose a weakly supervised semantic segmentation framework based on image-level labels to reduce the reliance on a large amount of pixel-level annotated data. ### Main Contributions 1. **Proposed a new semantic segmentation framework**: This framework utilizes image-level labels of medical images for training and improves segmentation accuracy by evaluating the ensemble methods of different network configurations. 2. **Achieved significant performance improvement on BRATS and DECATHLON datasets**: Compared to current state-of-the-art methods, this framework improved the Dice score by 8% on the BRATS dataset and by 6% on the DECATHLON dataset. 3. **Open-source framework**: The framework is fully open-source and can be accessed on GitHub. ### Method Overview 1. **Classifier training**: First, train classifier models on the target dataset, using ResNet-34 and ResNet-50 models in the experiments. 2. **Generate CAM predictions**: Use Grad-CAM to generate initial segmentation masks. 3. **Ensemble methods**: Generate the final segmentation results by combining multiple Grad-CAM predictions using different ensemble strategies (such as "or", "and", "min", "max", etc.). 4. **Threshold optimization**: Determine the optimal threshold settings by testing different threshold combinations on the training set to improve detection accuracy. ### Experimental Results - **BRATS dataset**: In the multi-modal brain tumor segmentation task, this framework outperformed existing WSS-CMER and SEAM methods, especially in handling complex datasets. - **DECATHLON dataset**: In the multi-modal prostate segmentation task, this framework also performed excellently, particularly in reducing false positive and false negative rates. ### Conclusion This paper proposes an effective weakly supervised semantic segmentation framework that can achieve high-quality medical image segmentation using image-level labels in the absence of pixel-level annotated data. This method is not only applicable to the medical field but can also be extended to other scenarios that require handling complex datasets.