SwinCrack: Pavement crack detection using convolutional swin-transformer network

Cheng Wang, Haibing Liu, Xiaoya An, Zhiqun Gong, Fei Deng
DOI: https://doi.org/10.1016/j.dsp.2023.104297
IF: 2.92
2024-02-01
Digital Signal Processing
Abstract: By leveraging deep learning methods, pavement crack detection can be more automatic, efficient, and accurate than manual inspection. To address the limited receptive field of pure CNN-based crack detection networks, we propose an end-to-end detection network based on Swin-Transformer, called SwinCrack. By modeling long-range interactions and adaptive spatial aggregation, SwinCrack produces more accurate and continuous descriptions of pavement cracks than CNN-based detection models. To delineate crisp and accurate crack boundaries, we introduce convolution operations into Swin-Transformer to capture more local and detailed crack information. A Convolutional Patch Embedding Layer (CPEL), a Convolutional Swin-Transformer Block (CSTB), and a Depth-convolution Forward Network (DFN) are proposed and embedded into SwinCrack to capture more spatial context. In addition, a Convolutional Attention Gated Skip Connection (CAGSC) is designed to suppress background interference in low-level features. Five evaluation experiments on SwinCrack and an ablation study on the four proposed modules are performed. The attention maps of SwinCrack are visualized to give better insight into the contribution of each embedded convolutional module. Evaluation results show that SwinCrack achieves OIS values of 0.781 to 0.849 and an OIS improvement of up to 4.4% across the six public crack datasets.
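The abstract reports results as OIS (Optimal Image Scale), which is commonly defined as the mean over images of the best F-measure obtained when the threshold is chosen separately for each image. The following is a minimal sketch of that definition, not the authors' evaluation code. It assumes pixel-exact matching between prediction and ground truth (crack benchmarks often allow a small distance tolerance instead), and the function names `best_f_measure` and `ois` and the threshold grid are illustrative choices.

```python
import numpy as np


def best_f_measure(prob, gt, thresholds):
    """Return the highest F-measure for one image over all thresholds.

    prob: 2-D array of predicted crack probabilities in [0, 1].
    gt:   2-D array of ground-truth labels (nonzero means crack).
    Assumption: pixels are matched exactly, with no distance tolerance.
    """
    gt = gt.astype(bool)
    best = 0.0
    for t in thresholds:
        pred = prob >= t
        tp = np.logical_and(pred, gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        denom = 2 * tp + fp + fn
        # F = 2PR / (P + R) simplifies to 2TP / (2TP + FP + FN).
        # When there are no predicted and no true crack pixels, score 0.
        f = 2 * tp / denom if denom > 0 else 0.0
        best = max(best, float(f))
    return best


def ois(probs, gts, thresholds=None):
    """Average the per-image best F-measure over a dataset."""
    if thresholds is None:
        # Assumed threshold grid; the paper's grid is not given in the abstract.
        thresholds = np.linspace(0.01, 0.99, 99)
    scores = [best_f_measure(p, g, thresholds) for p, g in zip(probs, gts)]
    return float(np.mean(scores))
```

Because the threshold is selected per image, OIS is never lower than the dataset-scale score (ODS) that fixes one threshold for all images, provided both are computed over the same threshold grid.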
engineering, electrical & electronic
What problem does this paper attempt to address?