Abstract: Existing tunnel detection methods include crack and water‐leakage segmentation networks. However, if an automated detection algorithm cannot handle every defect case, manual inspection is still required to eliminate potential risks. Existing intelligent detection methods lack a universal approach that can accurately segment all types of defects, particularly when multiple defects are superimposed. To address this issue, a defect segmentation model is proposed based on the Vision Transformer (ViT), whose architecture differs fundamentally from that of a convolutional neural network (CNN). The model introduces an adapter and a decoding head to improve the training of the transformer encoder, allowing it to be fitted to small‐scale datasets. In post‐processing, a method is proposed to quantify the threat level of the defects, with the aim of outputting qualitative results that simulate human observation. The model was evaluated on a real‐world dataset containing 11,781 defect images collected from a subway tunnel. The visualization results showed that the method is effective and applies uniform criteria to single, multiple, and comprehensive defects. Moreover, the tests showed that the proposed model has a significant advantage in the case of multiple‐defect superposition, achieving 93.77%, 88.36%, and 92.93% for mean accuracy (Acc), mean intersection over union, and mean F1‐score, respectively. With similar training parameters, the Acc of the proposed method is more than 10% higher than that of the DeepLabv3+, Mask R‐CNN, and UPerNet‐R50 models and more than 5% higher than that of the Swin Transformer and ViT‐Adapter. This study implemented a general method that can process all defect cases and output threat evaluation results, thereby making tunnel detection more intelligent.
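The three metrics reported in the abstract (mean Acc, mean intersection over union, and mean F1‐score) can be illustrated with a minimal sketch computed from a pixel‐level confusion matrix. The abstract does not give the exact formulas, so the definitions below are assumptions: they follow the conventions common in semantic segmentation, with mean Acc taken as per‐class pixel accuracy averaged over classes. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical and not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (mean Acc, mIoU, mean F1), averaged over classes that
    appear in the ground truth. Definitions are assumed, not the paper's."""
    n = len(cm)
    accs, ious, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]                                # correctly labelled pixels
        fn = sum(cm[c]) - tp                         # class-c pixels that were missed
        fp = sum(cm[r][c] for r in range(n)) - tp    # pixels wrongly labelled c
        if tp + fn == 0:
            continue  # class absent from the ground truth, skip it
        accs.append(tp / (tp + fn))                  # per-class pixel accuracy
        ious.append(tp / (tp + fp + fn))             # intersection over union
        f1s.append(2 * tp / (2 * tp + fp + fn))      # F1 (Dice) score
    k = len(accs)
    return sum(accs) / k, sum(ious) / k, sum(f1s) / k


if __name__ == "__main__":
    # Toy example: four pixels, two classes (0 = background, 1 = defect).
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(segmentation_metrics(cm))
```

Averaging per class rather than per pixel keeps a rare defect class from being swamped by the background, which matters when defect regions cover only a small fraction of each image.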

A lightweight transformer with linear self‐attention for defect recognition

Defect-aware transformer network for intelligent visual surface defect detection

A Multi-level spatial feature fusion-based transformer for intelligent defect recognition with small samples toward smart manufacturing system

ViT-LSLA: Vision Transformer with Light Self-Limited-Attention

Investigating Lightweight Transformer Models for Defect Detection

ETDNet: Efficient Transformer-Based Detection Network for Surface Defect Detection

Defect transformer: An efficient hybrid transformer architecture for surface defect detection

Fine-grained insulator defect detection method based on vision-transformer

DHT: Dynamic Vision Transformer Using Hybrid Window Attention for Industrial Defect Images Classification

Surface defect detection and classification of steel using an efficient Swin Transformer

Lite Vision Transformer with Enhanced Self-Attention

SatFormer: Saliency-Guided Abnormality-Aware Transformer for Retinal Disease Classification in Fundus Image

Image segmentation using Vision Transformer for tunnel defect assessment

Small-scale defect detection in industrial environment based on lightweight deep learning network

Improved swin transformer-based defect detection method for transmission line patrol inspection images

CLFormer: A Lightweight Transformer Based on Convolutional Embedding and Linear Self-Attention With Strong Robustness for Bearing Fault Diagnosis Under Limited Sample Conditions

A Dynamic Transformer Network With Early Exit Mechanism for Fast Detection of Multiscale Surface Defects

A PV cell defect detector combined with transformer and attention mechanism

Cas-VSwin transformer: A variant swin transformer for surface-defect detection

LiConvFormer: A lightweight fault diagnosis framework using separable multiscale convolution and broadcast self-attention

DPiT: Detecting Defects of Photovoltaic Solar Cells With Image Transformers