Abstract: Existing tunnel detection methods include crack and water‐leakage segmentation networks. However, if an automated detection algorithm cannot process all defect cases, manual inspection is still required to eliminate potential risks. Existing intelligent detection methods lack a universal approach that can accurately segment all types of defects, particularly when multiple defects are superimposed. To address this issue, a defect segmentation model is proposed based on the Vision Transformer (ViT), whose network structure differs fundamentally from that of a convolutional neural network (CNN). The model introduces an adapter and a decoding head to improve the training of the transformer encoder, allowing it to be fitted to small‐scale datasets. In post‐processing, a method is proposed to quantify the threat level of the defects, with the aim of outputting qualitative results that simulate human observation. The model was evaluated on a real‐world dataset of 11,781 defect images collected from a subway tunnel. The visualization results showed that the method is effective and applies uniform criteria to single, multiple, and comprehensive defects. Moreover, the tests showed that the proposed model has a significant advantage in the case of multiple‐defect superposition, achieving 93.77%, 88.36%, and 92.93% for mean accuracy (Acc), mean intersection over union, and mean F1‐score, respectively. With similar training parameters, the Acc of the proposed method is more than 10% higher than that of the DeepLabv3+, Mask R‐CNN, and UPerNet‐R50 models and more than 5% higher than that of the Swin Transformer and ViT‐Adapter. This study implemented a general method that can process all defect cases and output threat evaluation results, thereby making tunnel detection more intelligent.
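For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how mean accuracy, mean intersection over union, and mean F1‐score are commonly computed from a per‐pixel confusion matrix. It assumes the standard per‐class definitions averaged over classes; the abstract does not state the exact formulas used, and the function names `confusion_matrix` and `mean_scores` are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (ground-truth class, predicted class) pair.

    y_true and y_pred are flat sequences of integer class labels
    (an assumption of this sketch, one label per pixel).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_scores(cm):
    """Return class-averaged accuracy, IoU, and F1 from a confusion matrix.

    Assumed standard definitions, per class k:
      accuracy = TP / ground-truth pixels
      IoU      = TP / (ground-truth + predicted - TP)
      F1       = 2 * TP / (ground-truth + predicted)
    Classes with no ground-truth pixels are skipped.
    """
    n = len(cm)
    accs, ious, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        gt = sum(cm[k])                         # pixels labeled k
        pred = sum(cm[r][k] for r in range(n))  # pixels predicted as k
        if gt == 0:
            continue
        accs.append(tp / gt)
        ious.append(tp / (gt + pred - tp))
        f1s.append(2 * tp / (gt + pred))
    if not accs:
        raise ValueError("no class has ground-truth pixels")
    return {
        "mAcc": sum(accs) / len(accs),
        "mIoU": sum(ious) / len(ious),
        "mF1": sum(f1s) / len(f1s),
    }


# Toy usage with three hypothetical classes (0, 1, 2).
truth = [0, 0, 1, 1, 2, 2, 1, 0]
preds = [0, 1, 1, 1, 2, 2, 0, 0]
scores = mean_scores(confusion_matrix(truth, preds, 3))
```

Averaging per class rather than over all pixels keeps thin defects such as cracks from being swamped by large classes, which is one plausible reason to report these three means together.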

Vision transformer-based autonomous crack detection on asphalt and concrete surfaces

Two-Stream Boundary-Aware Neural Network for Concrete Crack Segmentation and Quantification

Image-based Concrete Crack Detection in Tunnels Using Deep Fully Convolutional Networks

A Detection and Classification Method of Asphalt Pavement Crack based on Vision Transformer

Enhancing Road Crack Localization for Sustainable Road Safety Using HCTNet

Improving the Concrete Crack Detection Process via a Hybrid Visual Transformer Algorithm

Deep learning approaches for autonomous crack detection in concrete wall, brick deck and pavement

CrackViT: a unified CNN-transformer model for pixel-level crack extraction

Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces

Detection of Pavement Cracks by Deep Learning Models of Transformer and UNet

Bridging Convolutional Neural Networks and Transformers for Efficient Crack Detection in Concrete Building Structures

A novel transformer-based network with attention mechanism for automatic pavement crack detection

Image segmentation using Vision Transformer for tunnel defect assessment

Structural Crack Detection Using Deep Learning–based Fully Convolutional Networks

Robust crack detection in masonry structures with Transformers

Automatic crack detection on concrete and asphalt surfaces using semantic segmentation network with hierarchical Transformer

Visual Detection of Road Cracks for Autonomous Vehicles Based on Deep Learning

CrackUNet: A Novel Network with Joint Network-in-network Structure and Deformable Convolution for Pavement Crack Detection

CrackT-net: a Method of Convolutional Neural Network and Transformer for Crack Segmentation

CNN-Transformer hybrid network for concrete dam crack patrol inspection

An average pooling designed Transformer for robust crack segmentation