Pixel-Wise and Class-Wise Semantic Cues for Few-Shot Segmentation in Astronaut Working Scenes

Qingwei Sun, Jiangang Chao, Wanhong Lin, Dongyang Wang, Wei Chen, Zhenying Xu, Shaoli Xie
DOI: https://doi.org/10.3390/aerospace11060496
IF: 2.66
2024-06-21
Aerospace
Abstract: Few-shot segmentation (FSS) is a cutting-edge technology that can meet requirements using a small workload. With the development of China Aerospace Engineering, FSS plays a fundamental role in astronaut working scene (AWS) intelligent parsing. Although mainstream FSS methods have made considerable breakthroughs in natural data, they are not suitable for AWSs. AWSs are characterized by a similar foreground (FG) and background (BG), indistinguishable categories, and the strong influence of light, all of which place higher demands on FSS methods. We design a pixel-wise and class-wise network (PCNet) to match support and query features using pixel-wise and class-wise semantic cues. Specifically, PCNet extracts pixel-wise semantic information at each layer of the backbone using novel cross-attention. Dense prototypes are further utilized to extract class-wise semantic cues as a supplement. In addition, the deep prototype is distilled in reverse to the shallow layer to improve its quality. Furthermore, we customize a dataset for AWSs and conduct abundant experiments. The results indicate that PCNet outperforms the published best method by 4.34% and 5.15% in accuracy under one-shot and five-shot settings, respectively. Moreover, PCNet compares favorably with the traditional semantic segmentation model under the 13-shot setting.
engineering, aerospace
What problem does this paper attempt to address?
This paper addresses how to achieve effective semantic segmentation in astronaut working scenes (AWSs) when only a small number of labeled samples are available. Existing few-shot segmentation (FSS) methods run into difficulties on AWSs, so the paper proposes a new method designed for them.

### Problem Background

1. **Limitations of Existing FSS Methods**:
   - Existing FSS methods mainly target natural-image datasets (such as COCO and PASCAL-5i), where the foreground (FG) and background (BG) are clearly distinct and the differences between classes are significant.
   - AWS images usually have a similar foreground and background, small differences between classes, and strong lighting effects, so existing FSS methods are difficult to apply to them directly.
2. **Characteristics of AWS**:
   - The foreground and background are similar, and the classes are difficult to distinguish.
   - The lighting conditions are complex, which makes segmentation harder.
   - The model must handle unseen classes, which demands good generalization ability.

### Solution

To meet these challenges, the authors propose a new model named PCNet, which combines pixel-wise and class-wise semantic cues to improve segmentation performance on AWSs.

### Main Contributions

1. **Proposing the PCNet Model**:
   - PCNet improves segmentation accuracy by using pixel-wise and class-wise semantic cues to match support and query features.
   - It uses a novel cross-attention mechanism to extract pixel-wise semantic information at each layer of the backbone network.
   - It uses dense prototypes to extract class-wise semantic cues as a supplement.
   - It distills the deep-layer prototype in reverse to the shallow layer to improve the shallow layer's quality.
2. **Customizing the AWS Dataset**:
   - A semantic segmentation dataset specifically for AWSs has been created, covering important objects inside and outside the simulated capsule and filling a gap in this field.
3. **Experimental Verification**:
   - PCNet's accuracy exceeds that of the best published method by 4.34% in the one-shot setting and 5.15% in the five-shot setting, and in the 13-shot setting it compares favorably with the traditional semantic segmentation model.

### Summary

Existing FSS methods cannot be applied effectively to astronaut working scenes because the foreground and background are similar, the classes are hard to distinguish, and the lighting conditions are complex. By proposing PCNet and creating a dedicated dataset, the authors improve FSS performance on AWSs, which supports future research and practical applications.
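
To make the two kinds of cue in the contributions more concrete, here is a minimal sketch in plain Python. It is not PCNet's implementation: the summary does not specify the paper's cross-attention design, its dense prototypes, or the reverse distillation step, so the sketch substitutes two standard few-shot segmentation building blocks. The class-wise cue is approximated by a single masked-average-pooled prototype compared to each query pixel by cosine similarity, and the pixel-wise cue by scaled dot-product attention from query pixels to support pixels with the support mask as the attended value. All function names, the flat list-of-vectors layout, and the toy feature values are assumptions made for illustration.

```python
import math


def masked_average_prototype(support, mask):
    """Class-wise cue, step 1: average the support features at foreground pixels.

    support: list of feature vectors, one per support pixel (hypothetical layout).
    mask:    list of 0/1 labels, 1 = foreground, same length as support.
    """
    dim = len(support[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(support, mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    if count == 0:
        raise ValueError("support mask has no foreground pixels")
    return [t / count for t in total]


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def class_wise_cue(query, prototype):
    """Class-wise cue, step 2: similarity of every query pixel to the prototype."""
    return [cosine(q, prototype) for q in query]


def pixel_wise_cue(query, support, mask):
    """Pixel-wise cue: each query pixel attends over all support pixels.

    Attention weights are a softmax over scaled dot products; the attended
    value is the support mask, so the result is a soft foreground score in [0, 1].
    """
    scale = math.sqrt(len(query[0]))
    scores = []
    for q in query:
        logits = [sum(x * y for x, y in zip(q, s)) / scale for s in support]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - peak) for l in logits]
        scores.append(sum(w * m for w, m in zip(weights, mask)) / sum(weights))
    return scores


if __name__ == "__main__":
    # Toy 2-D features: two foreground and two background support pixels.
    support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    mask = [1, 1, 0, 0]
    query = [[1.0, 0.1], [0.0, 1.0]]  # first pixel resembles the foreground

    prototype = masked_average_prototype(support, mask)
    print("class-wise:", class_wise_cue(query, prototype))
    print("pixel-wise:", pixel_wise_cue(query, support, mask))
```

On the toy data, both cues score the first query pixel (which resembles the foreground support pixels) higher than the second. In a real network these operations would run on backbone feature maps with a tensor library, and the two cues would be fused by learned layers rather than inspected separately.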