eSwin-UNet: A Collaborative Model for Industrial Surface Defect Detection

Helei Cui, Tao Xing, Jiaju Ren, Yaxing Chen, Zhiwen Yu, Bin Guo, Xiaobing Guo
DOI: https://doi.org/10.1109/icpads56603.2022.00056
2022-01-01
Abstract: Surface defect inspection of industrial equipment plays a vital role in real-world production. Traditional inspection routines require a large number of inspection workers, which not only reduces production efficiency but also leads to unreliable results. Computer vision-based detection approaches, e.g., those using deep learning, have shown great potential for this task. Specifically, semantic segmentation algorithms based on Convolutional Neural Networks (CNNs) can extract relatively complete feature information, and the Transformer, which emerged from the field of Natural Language Processing (NLP), performs well in maintaining and transmitting semantic information. In light of these, we propose a segmentation model called eSwin-UNet, i.e., enhanced Swin-UNet, that leverages the advantages of both the CNN and the Transformer. It uses multi-scale information fusion to better integrate the feature information in the CNN and Transformer branches. Moreover, it utilizes deep supervision and trains the two branches collaboratively to further improve accuracy. On the MVTec ITODD dataset, the model achieves an F1-Score of 0.7891 and a Jaccard index of 0.6516, outperforming most current models.
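For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1-Score and a Jaccard index are commonly computed for a pair of binary segmentation masks. It is not the authors' evaluation code: the function names, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # hypothetical representation: rows of 0/1 pixels


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def f1_score(pred: Mask, target: Mask) -> float:
    """F1 (Dice) score: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def jaccard_index(pred: Mask, target: Mask) -> float:
    """Jaccard index (IoU): TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + fp + fn
    # Same assumed convention for empty masks as in f1_score.
    return 1.0 if denom == 0 else tp / denom


if __name__ == "__main__":
    # Toy masks, not data from the paper.
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(f1_score(pred, target), jaccard_index(pred, target))
```

For a single pair of masks the two quantities are tied by F1 = 2J / (1 + J), so F1 is never lower than the Jaccard index. Scores averaged over a whole dataset need not satisfy this identity exactly, since the result depends on how the averaging is done.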