Abstract:Remote sensing techniques for shoreline extraction are crucial for monitoring changes in erosion rates, surface hydrology, and ecosystem structure. In recent years, Convolutional neural networks (CNNs) have developed as a cutting-edge deep learning technique that has been extensively used in shoreline extraction from remote sensing images, owing to their exceptional feature extraction capabilities. They are progressively replacing traditional methods in this field. However, most CNN models only focus on the features in local receptive fields, and overlook the consideration of global contextual information, which will hamper the model's ability to perform a precise segmentation of boundaries and small objects, consequently leading to unsatisfactory segmentation results. To solve this problem, we propose a parallel semantic segmentation network (TCU-Net) combining CNN and Transformer, to extract shorelines from multispectral remote sensing images, and improve the extraction accuracy. Firstly, TCU-Net imports the Pyramid Vision Transformer V2 (PVT V2) network and ResNet, which serve as backbones for the Transformer branch and CNN branch, respectively, forming a parallel dual-encoder structure for the extraction of both global and local features. Furthermore, a feature interaction module is designed to achieve information exchange, and complementary advantages of features, between the two branches. Secondly, for the decoder part, we propose a cross-scale multi-source feature fusion module to replace the original UNet decoder block, to aggregate multi-scale semantic features more effectively. In addition, a sea-land segmentation dataset covering the Yellow Sea region (GF Dataset) is constructed through the processing of three scenes from Gaofen-6 remote sensing images. We perform a comprehensive experiment with the GF dataset to compare the proposed method with mainstream semantic segmentation models, and the results demonstrate that TCU-Net outperforms the competing models in all three evaluation indices: the PA (pixel accuracy), F1-score, and MIoU (mean intersection over union), while requiring significantly fewer parameters and computational resources compared to other models. These results indicate that the TCU-Net model proposed in this article can extract the shoreline from remote sensing images more effectively, with a shorter time, and lower computational overhead.

CCTNet: CNN and Cross-Shaped Transformer Hybrid Network for Remote Sensing Image Semantic Segmentation

CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images.

CTCFNet: CNN-Transformer Complementary and Fusion Network for High-Resolution Remote Sensing Image Semantic Segmentation

CNN and Transformer Fusion for Remote Sensing Image Semantic Segmentation

Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

TCNet: Multiscale Fusion of Transformer and CNN for Semantic Segmentation of Remote Sensing Images

SSNet: A Novel Transformer and CNN Hybrid Network for Remote Sensing Semantic Segmentation

Cascaded CNN and global–local attention transformer network-based semantic segmentation for high-resolution remote sensing image

CTMFNet: CNN and Transformer Multiscale Fusion Network of Remote Sensing Urban Scene Imagery

Enhancing Multiscale Representations with Transformer for Remote Sensing Image Semantic Segmentation

Category attention guided network for semantic segmentation of Fine-Resolution remote sensing images

RCCT-ASPPNet: Dual-Encoder Remote Image Segmentation Based on Transformer and ASPP

Rethinking Transformers for Semantic Segmentation of Remote Sensing Images.

Remote Sensing Image Semantic Segmentation Based on Cascaded Transformer

SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation

ACTNet: A Dual-Attention Adapter with a CNN-Transformer Network for the Semantic Segmentation of Remote Sensing Imagery

Hybrid transformer-CNN networks using superpixel segmentation for remote sensing building change detection

Hybrid Attention Fusion Embedded in Transformer for Remote Sensing Image Semantic Segmentation

TCUNet: A Lightweight Dual-Branch Parallel Network for Sea-Land Segmentation in Remote Sensing Images

Unsupervised Multi-Scale Hybrid Feature Extraction Network for Semantic Segmentation of High-Resolution Remote Sensing Images

MCAT-UNet: Convolutional and Cross-Shaped Window Attention Enhanced UNet for Efficient High-Resolution Remote Sensing Image Segmentation