Abstract: The fully convolutional network in the Mask R-CNN model produces segmentation results that are not fine enough, and its loss function has so many hyperparameters that tuning them consumes considerable time and resources. To address both problems, we propose a parameter link and efficient instance segmentation model. Because Mask R-CNN does not attend to sample features, we fuse a visual attention network into the ResNet50 backbone to obtain the adaptivity and long-range correlation of self-attention, so that the model can locate targets precisely and detect and segment them effectively. A U-Net network is introduced into the segmentation branch, and the image is processed by stepwise downsampling and upsampling, which makes the predicted pixel-level masks more accurate. To ease parameter tuning in the instance segmentation task, we propose a parameter link loss that simplifies the tuning of training parameters and further improves the detection and segmentation performance of the model. We conduct extensive experiments on three benchmarks, i.e., MiniCOCO, Cityscapes and PASCAL VOC2012, to assess the validity of our model. The experimental results show that (1) on the MiniCOCO dataset, the model obtains a box AP of 35.1 and a mask AP of 32.0, which are 1.7 and 2.2 higher, respectively, than those of the state-of-the-art Mask2Former algorithm. (2) The AP on Cityscapes is 38.1, and the per-category AP improves substantially over that of other instance segmentation models. (3) In the generalization experiment on the PASCAL VOC2012 dataset, the box mAP and mask mAP are 75.5 and 63.6, improvements of 3.9 and 1.9, respectively, over the Mask R-CNN model. Our model has significant advantages in both detection and segmentation. The code will be available at https://gitee.com/zhiweilu111/simple-mask/tree/master.
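
The abstract reports each comparison as a score together with a margin over the reference model, but not the reference scores themselves. As a small worked example, the sketch below recovers the implied reference scores by subtraction. It assumes the margins are absolute differences in AP points (not relative percentages) and that both models were evaluated under the same protocol; the dictionary keys and the helper name `implied_baseline` are illustrative only.

```python
# Scores and margins copied from the abstract: (our score, margin over reference).
reported = {
    "MiniCOCO box AP vs. Mask2Former": (35.1, 1.7),
    "MiniCOCO mask AP vs. Mask2Former": (32.0, 2.2),
    "VOC2012 box mAP vs. Mask R-CNN": (75.5, 3.9),
    "VOC2012 mask mAP vs. Mask R-CNN": (63.6, 1.9),
}


def implied_baseline(score, margin):
    """Reference score implied by a reported score and margin.

    Assumption: the margin is an absolute difference in AP points.
    The result is rounded to one decimal to match the reported precision.
    """
    return round(score - margin, 1)


baselines = {
    name: implied_baseline(score, margin)
    for name, (score, margin) in reported.items()
}

for name, value in baselines.items():
    print(f"{name}: implied reference score {value}")
```

Under these assumptions the reference scores come out to 33.4 and 29.8 for Mask2Former on MiniCOCO, and 71.6 and 61.7 for Mask R-CNN on PASCAL VOC2012. The Cityscapes result is omitted because the abstract gives no margin for it.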

Meta R-CNN: Towards General Solver for Instance-Level Low-Shot Learning

Few-Shot Image Segmentation Using Generating Mask with Meta-Learning Classifier Weight Transformer Network

RS-MetaNet: Deep meta metric learning for few-shot remote sensing scene classification

Pixel Matching Network for Cross-Domain Few-Shot Segmentation

Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images

Meta-ResNet: A Novel Few-shot SAR Target Recognition Method Based on Meta-learning

Few-Shot Cross-Domain Object Detection With Instance-Level Prototype-Based Meta-Learning

Meta Parsing Networks: Towards Generalized Few-shot Scene Parsing with Adaptive Metric Learning

Meta-ZSDETR: Zero-shot DETR with Meta-learning

Few-Shot Classification of Aerial Scene Images via Meta-Learning

3D Meta-Segmentation Neural Network

Meta-HRNet: A High Resolution Network for Coarse-to-Fine Few-Shot Classification

Learning to focus: cascaded feature matching network for few-shot image recognition

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

MetaMask: Improving Few-Shot Semantic Segmentation Via Multi-Mask Calibration

SimpleMask: parameter link and efficient instance segmentation

CRNet: Cross-Reference Networks for Few-Shot Segmentation

Differentiable Meta-learning Model for Few-shot Semantic Segmentation

Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model