CrackFormer Network for Pavement Crack Segmentation

Huajun Liu, Jing Yang, Xiangyu Miao, Christoph Mertz, Hui Kong
DOI: https://doi.org/10.1109/tits.2023.3266776
IF: 8.5
2023-01-01
IEEE Transactions on Intelligent Transportation Systems
Abstract: In this paper, we rethink our earlier work on self-attention-based crack segmentation and propose an upgraded CrackFormer network (CrackFormer-II) for pavement crack segmentation in general, rather than only for fine-grained crack-detection tasks. This work embeds novel Transformer encoder modules into a SegNet-like encoder-decoder structure, where the basic module is composed of novel Transformer encoder blocks with effective relative positional embedding and long-range interactions to extract contextual information efficiently from feature channels. Further, scaling-attention fusion modules are proposed to integrate the outputs of each corresponding encoder and decoder block, highlighting semantic features and suppressing non-semantic ones. Moreover, we enhance the Transformer encoder blocks with a local feed-forward layer and skip connections, and optimize the channel configurations to reduce the number of model parameters. Compared with the original CrackFormer, CrackFormer-II is trained and evaluated on more general crack datasets. It achieves higher accuracy than the original CrackFormer and the state-of-the-art (SOTA) method, with $6.7\times$ fewer FLOPs and $6.2\times$ fewer parameters, and its practical inference speed is comparable to that of most classical CNN models. The experimental results show that it achieves F-measures at Optimal Dataset Scale (ODS) of 0.912, 0.908, 0.914 and 0.869, respectively, on the four benchmarks. Code is available at https://github.com/LouisNUST/CrackFormer-II .
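The ODS F-measure reported above is commonly computed by choosing one binarization threshold for the whole dataset and taking the best resulting F-measure. The sketch below illustrates that idea only. It is not the authors' evaluation code: the function name `ods_f_measure`, the flat-list image representation, the threshold grid, and strict pixel-wise matching (no spatial tolerance, counts pooled over all images) are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def ods_f_measure(
    predictions: Sequence[Sequence[float]],
    ground_truths: Sequence[Sequence[int]],
    num_thresholds: int = 99,
) -> Tuple[float, float]:
    """Return (best F-measure, threshold) for a single threshold shared by all images.

    Each prediction is a flattened probability map in [0, 1]; each ground
    truth is a flattened binary mask (1 = crack pixel). Hypothetical helper.
    """
    best_f, best_t = 0.0, 0.0
    for k in range(1, num_thresholds + 1):
        t = k / (num_thresholds + 1)  # thresholds 0.01, 0.02, ..., 0.99 by default
        tp = fp = fn = 0
        # Pool true/false positives and false negatives over the whole dataset.
        for pred, gt in zip(predictions, ground_truths):
            for p, g in zip(pred, gt):
                is_crack = p >= t
                if is_crack and g:
                    tp += 1
                elif is_crack:
                    fp += 1
                elif g:
                    fn += 1
        if tp == 0:
            continue  # precision or recall is zero or undefined; F cannot improve
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f = 2 * precision * recall / (precision + recall)
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t
```

Published crack benchmarks may differ in detail, for example by allowing a small pixel tolerance when matching predictions to annotations, or by averaging per-image scores, so numbers from this sketch are not directly comparable to the figures in the abstract.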