Abstract: Remote sensing techniques for shoreline extraction are crucial for monitoring changes in erosion rates, surface hydrology, and ecosystem structure. In recent years, convolutional neural networks (CNNs) have emerged as a leading deep learning technique and have been used extensively for shoreline extraction from remote sensing images, owing to their exceptional feature extraction capabilities. They are progressively replacing traditional methods in this field. However, most CNN models focus only on features within local receptive fields and overlook global contextual information, which hampers precise segmentation of boundaries and small objects and leads to unsatisfactory segmentation results. To solve this problem, we propose TCU-Net, a parallel semantic segmentation network that combines a CNN and a Transformer, to extract shorelines from multispectral remote sensing images with improved accuracy. First, TCU-Net adopts the Pyramid Vision Transformer V2 (PVT V2) network and ResNet as the backbones of the Transformer branch and the CNN branch, respectively, forming a parallel dual-encoder structure that extracts both global and local features. A feature interaction module is designed to exchange information between the two branches so that their features complement each other. Second, in the decoder, we propose a cross-scale multi-source feature fusion module that replaces the original UNet decoder block and aggregates multi-scale semantic features more effectively. In addition, a sea-land segmentation dataset covering the Yellow Sea region (GF dataset) is constructed by processing three scenes of Gaofen-6 remote sensing imagery.
We conduct comprehensive experiments on the GF dataset to compare the proposed method with mainstream semantic segmentation models. The results demonstrate that TCU-Net outperforms the competing models on all three evaluation metrics, namely pixel accuracy (PA), F1-score, and mean intersection over union (MIoU), while requiring significantly fewer parameters and computational resources. These results indicate that TCU-Net can extract shorelines from remote sensing images more effectively, in less time, and with lower computational overhead.
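For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how PA, F1-score, and MIoU are commonly computed from a confusion matrix for a two-class sea-land mask. It is not the authors' evaluation code. The label convention (0 = sea, 1 = land), the choice of land as the positive class for F1, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only: standard definitions of PA, F1-score, and MIoU.
# Assumed (hypothetical) label convention: 0 = sea, 1 = land.

def confusion_matrix(y_true, y_pred, num_classes=2):
    """Count pixels: cm[t][p] = number of pixels with true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """PA: correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def f1_score(cm, positive=1):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    n = len(cm)
    tp = cm[positive][positive]
    fp = sum(cm[r][positive] for r in range(n)) - tp  # predicted positive, truly other
    fn = sum(cm[positive]) - tp                       # truly positive, predicted other
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mean_iou(cm):
    """MIoU: per-class TP / (TP + FP + FN), averaged over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp
        fn = sum(cm[c]) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Toy example: flattened ground-truth and predicted masks of eight pixels.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 0]
cm = confusion_matrix(truth, pred)
print(pixel_accuracy(cm), f1_score(cm), mean_iou(cm))
```

In this toy case one sea pixel and one land pixel are misclassified, so PA and F1 are both 0.75 while MIoU is 0.6. MIoU penalizes each error in both the class it was taken from and the class it was assigned to, which is why it is usually the strictest of the three.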

CCTNet: Coupled CNN and Transformer Network for Crop Segmentation of Remote Sensing Images

Cropnet: Deep Spatial-Temporal-Spectral Feature Learning Network for Crop Classification from Time-Series Multi-Spectral Images

CTFuseNet: A Multi-Scale CNN-Transformer Feature Fused Network for Crop Type Segmentation on UAV Remote Sensing Imagery

TCNet: Multiscale Fusion of Transformer and CNN for Semantic Segmentation of Remote Sensing Images

Crop field extraction from high resolution remote sensing images based on semantic edges and spatial structure map

CNN and Transformer Fusion for Remote Sensing Image Semantic Segmentation

CTMFNet: CNN and Transformer Multiscale Fusion Network of Remote Sensing Urban Scene Imagery

Category attention guided network for semantic segmentation of Fine-Resolution remote sensing images

CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification

Crop classification in high-resolution remote sensing images based on multi-scale feature fusion semantic segmentation model

TCUNet: A Lightweight Dual-Branch Parallel Network for Sea-Land Segmentation in Remote Sensing Images

A CNN-Transformer Hybrid Approach for Crop Classification Using Multitemporal Multisensor Images

Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

SSNet: A Novel Transformer and CNN Hybrid Network for Remote Sensing Semantic Segmentation

ACTNet: A Dual-Attention Adapter with a CNN-Transformer Network for the Semantic Segmentation of Remote Sensing Imagery

Incorporating convolutional and transformer architectures to enhance semantic segmentation of fine-resolution urban images

Hybrid transformer-CNN networks using superpixel segmentation for remote sensing building change detection

BCTNet: Bi-Branch Cross-Fusion Transformer for Building Footprint Extraction

RCCT-ASPPNet: Dual-Encoder Remote Image Segmentation Based on Transformer and ASPP

MFTransNet: A Multi-Modal Fusion with CNN-Transformer Network for Semantic Segmentation of HSR Remote Sensing Images