Abstract: Cracks are a typical form of road damage, and detecting them accurately is important for road maintenance and traffic safety. Computer vision has recently been applied to crack segmentation, but several challenging problems remain: complex backgrounds, information loss caused by pooling and convolution operations, and insufficient fusion of global and local semantic information. To address these problems, this paper proposes FCT-Net, a dual-encoding-path network with a U-Net architecture that fuses channel atrous spatial pyramid pooling (CASPP) and a transformer. Specifically, CASPP obtains multi-scale receptive fields by incorporating spatial and channel attention, while refining and extracting local features. Meanwhile, we introduce long-short distance attention to construct a novel transformer characterized by interaction between local and global attention features. In addition, a residual convolution module is designed to enhance the local features of the transformer. Furthermore, we devise a multi-scale attention weight cross fusion module that aggregates the features of the two encoding branches, reducing information loss during downsampling and suppressing background information. Finally, we evaluate FCT-Net on three public datasets. Experimental results show that FCT-Net achieves a higher F1-score and mean intersection over union (mIoU) than state-of-the-art segmentation networks on the DeepCrack537 and CrackLS315 datasets. It also segments cracks in complex scenes well, achieving the highest recall, F1-score, and mIoU on the CrackTree260 dataset, at 85.64%, 81.67%, and 84.05%, respectively.
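For readers unfamiliar with the reported metrics, the following is a minimal sketch of how recall, F1-score, and mIoU are commonly computed for a binary crack mask. It is not taken from the paper: the function name `crack_metrics` is hypothetical, and averaging the IoU over the crack and background classes is an assumption about how mIoU is defined here.

```python
import numpy as np

def crack_metrics(pred, target):
    """Recall, F1 and mIoU for binary masks (1 = crack, 0 = background).

    Hypothetical helper for illustration; mIoU is assumed to be the mean
    of the crack-class IoU and the background-class IoU.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Pixel-level confusion counts.
    tp = np.sum(pred & target)      # crack predicted, crack in ground truth
    fp = np.sum(pred & ~target)     # crack predicted, background in ground truth
    fn = np.sum(~pred & target)     # background predicted, crack in ground truth
    tn = np.sum(~pred & ~target)    # background predicted, background in ground truth

    eps = 1e-12  # guards against division by zero on empty masks
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)

    iou_crack = tp / (tp + fp + fn + eps)
    iou_background = tn / (tn + fp + fn + eps)
    miou = (iou_crack + iou_background) / 2

    return {"recall": float(recall), "f1": float(f1), "miou": float(miou)}
```

Calling `crack_metrics(pred_mask, gt_mask)` on two arrays of the same shape returns the three scores as fractions in the range 0 to 1; multiply by 100 to compare with percentages such as those quoted in the abstract.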

Dual attention transformer network for pixel-level concrete crack segmentation considering camera placement

Two-Stream Boundary-Aware Neural Network for Concrete Crack Segmentation and Quantification

Image-based Concrete Crack Detection in Tunnels Using Deep Fully Convolutional Networks

FCT-Net: A dual-encoding-path network fusing atrous spatial pyramid pooling and transformer for pavement crack detection

Cracklab: A high-precision and efficient concrete crack segmentation and quantification network

PCTC-Net: A Crack Segmentation Network with Parallel Dual Encoder Network Fusing Pre-Conv-Based Transformers and Convolutional Neural Networks

Dual-path network combining CNN and transformer for pavement crack segmentation

CrackT-net: a Method of Convolutional Neural Network and Transformer for Crack Segmentation

A Convolutional-Transformer Network for Crack Segmentation with Boundary Awareness

A novel transformer-based network with attention mechanism for automatic pavement crack detection

Tiny‐Crack‐Net: A multiscale feature fusion network with attention mechanisms for segmentation of tiny cracks

CrackNet: A Hybrid Model for Crack Segmentation with Dynamic Loss Function

CrackViT: a unified CNN-transformer model for pixel-level crack extraction

UTE-CrackNet: transformer-guided and edge feature extraction U-shaped road crack image segmentation

Multi-scale feature fusion for pavement crack detection based on Transformer

A Road Crack Segmentation Method Based on Transformer and Multi-Scale Feature Fusion

Bridging Convolutional Neural Networks and Transformers for Efficient Crack Detection in Concrete Building Structures

Unifying transformer and convolution for dam crack detection

EfficientCrackNet: A Lightweight Model for Crack Segmentation

Segmentation Network of Concrete Cracks with Multi-Frequency OctaveRes Dual Encoder and Cross-Attention Mechanism Optimized by Average Weight

An average pooling designed Transformer for robust crack segmentation