Semi-supervised Semantic Segmentation of Cataract Surgical Images based on DeepLab v3+

Hongyu Chen, Xiao Ma, Tong Xia, Fucang Jia
DOI: https://doi.org/10.1145/3456529.3456549
2021-02-02
Abstract: Microscopic surgical image analysis is important for surgical skill analysis, workflow recognition, and autonomous robotic surgery, and semantic segmentation of microscopic images is a prerequisite for all of them. Supervised deep convolutional neural networks (CNNs) are currently the state-of-the-art image segmentation method, but training one requires manual contouring of thousands of images. Semi-supervised or weakly supervised learning can greatly reduce the need for labeling while producing segmentation results close to or slightly better than those of supervised learning. However, most existing semi-supervised methods suffer from poor segmentation accuracy and poor robustness: they can only approach the accuracy of existing supervised methods, and they can only be used on large-scale data sets, which still limits their use in this field. In this article, we propose a novel semi-supervised learning method for microscopic image segmentation based on the principle of cross-consistency. To leverage unlabeled images, we enforce consistency between the outputs of a main decoder and an auxiliary decoder. The auxiliary decoder is fed a variety of interference (perturbed) examples generated by different interference functions, which improves the accuracy and robustness of the overall network output. The network also has good task scalability, achieving high segmentation accuracy in fields such as natural scenes and medical surgical scenes. Experimental results on the CATARACTS-Semantic-Segmentation 2020 data set demonstrate the effectiveness of the proposed network, which is more accurate than the baseline model: segmentation accuracy improves by 1.71% in experimental setting 1 and by 22.34% in experimental setting 2.
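The cross-consistency idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the interference functions, the decoder architecture, or the loss, so the two perturbations (`feature_noise`, `feature_dropout`), the linear stand-in decoders, and the mean-squared-error consistency term are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def feature_noise(features, rng, scale=0.3):
    # Hypothetical interference function: multiplicative uniform noise.
    noise = rng.uniform(-scale, scale, size=features.shape)
    return features + features * noise

def feature_dropout(features, rng, drop_prob=0.5):
    # Hypothetical interference function: zero out random channels.
    keep = rng.random(features.shape[-1]) >= drop_prob
    return features * keep

def consistency_loss(features, main_decoder, aux_decoder, perturbations, rng):
    # The main decoder sees clean encoder features; its prediction is the target.
    target = softmax(main_decoder(features))
    losses = []
    for perturb in perturbations:
        # The auxiliary decoder sees a perturbed copy of the same features.
        pred = softmax(aux_decoder(perturb(features, rng)))
        losses.append(np.mean((pred - target) ** 2))  # assumed MSE consistency
    return float(np.mean(losses))

# Toy usage: an (H, W, C) feature map for one unlabeled image, 3 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 8))
w_main = rng.normal(size=(8, 3))
w_aux = rng.normal(size=(8, 3))
main_decoder = lambda f: f @ w_main  # stand-in for the real main decoder
aux_decoder = lambda f: f @ w_aux    # stand-in for the real auxiliary decoder
loss = consistency_loss(features, main_decoder, aux_decoder,
                        [feature_noise, feature_dropout], rng)
print(loss)
```

In a real training loop this term would be computed on unlabeled images and added to the supervised loss on labeled images; how the two are weighted is not stated in the abstract.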