Remote sensing image instance segmentation network with transformer and multi-scale feature representation
Wenhui Ye, Wei Zhang, Weimin Lei, Wenchao Zhang, Xinyi Chen, Yanwen Wang
DOI: https://doi.org/10.1016/j.eswa.2023.121007
IF: 8.5
2023-07-29
Expert Systems with Applications
Abstract: The goal of remote sensing image (RSI) instance segmentation is to perform instance-level semantic parsing of image contents. Aside from classifying and locating regions of interest (RoIs), it also requires assigning finer pixel-wise annotations to objects. However, RSIs often suffer from cluttered backgrounds, variable object scales, and complex object edge contours, which make the instance segmentation task more challenging. In this work, we customize an instance segmentation model that is better suited to RSIs. Specifically, we propose three novel modules for a region-based instance segmentation framework, namely the Channel-Spatial Attention Module (CSA), the Multi-Scale Aware Module (MSA), and the Semantic Relation Learning Module (SRL). Among them, the feature calibration performed by CSA alleviates the semantic gap between low-level features and high-level semantics in both the channel and spatial dimensions. Inheriting the capabilities of both the convolutional neural network (CNN) and the Transformer, SRL helps the network integrate both neighborhood features and long-range dependencies for instance semantic prediction. The MSA module uses a cascaded residual structure with different receptive fields to model the scale variation of objects in RSIs. Experimental results on the challenging iSAID, NWPU VHR-10, SSDD, BITCC and HRSID datasets demonstrate the superiority of our method, which achieves mask APs of 40.2%, 68.2%, 68.4%, 50.4% and 55.8%, respectively. Code and pretrained models are available at https://github.com/Sherlock1018/RSIISN.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science
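
The abstract describes the MSA module as a cascaded residual structure with different receptive fields. As a rough illustration of why cascading produces a range of receptive fields, the sketch below computes the receptive field after each stage of a stride-1 convolution cascade. The function name, the 3x3 kernel size, and the dilation rates (1, 2, 3) are assumptions chosen for illustration; the abstract does not give the actual MSA configuration.

```python
def cascaded_receptive_fields(kernel_size, dilations):
    """Receptive field after each stage of a stride-1 convolution cascade.

    Each stage with kernel size k and dilation d widens the receptive
    field by (k - 1) * d pixels. The values passed in are illustrative
    assumptions, not settings taken from the paper.
    """
    receptive_field = 1  # a single pixel before any convolution
    fields = []
    for dilation in dilations:
        receptive_field += (kernel_size - 1) * dilation
        fields.append(receptive_field)
    return fields


if __name__ == "__main__":
    # Hypothetical cascade: three 3x3 convolutions with dilations 1, 2, 3.
    print(cascaded_receptive_fields(3, [1, 2, 3]))  # [3, 7, 13]
```

Under these assumed settings, the intermediate outputs see 3, 7, and 13 pixels per side, so tapping every stage (with an identity shortcut contributing a 1-pixel view) gives a single block access to several object scales at once.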