SDD-DETR: Surface Defect Detection for No-Service Aero-Engine Blades with Detection Transformer
Xiangkun Sun, Kechen Song, Xin Wen, Yanyan Wang, Yunhui Yan
DOI: https://doi.org/10.1109/tase.2024.3457829
IF: 6.636
2024-01-01
IEEE Transactions on Automation Science and Engineering
Abstract: Vision-based surface defect detection (SDD) for no-service aero-engine blades provides a fast and effective way to monitor product quality. Most existing detection algorithms for aero-engine blades 1) are based on CNNs and rely on hand-designed non-maximum suppression (NMS) operations, and 2) focus on improving detection accuracy while neglecting, or even ignoring, inference speed. To solve these problems, we adopt the DEtection TRansformer (DETR) object detection paradigm to design a novel high-accuracy network, SDD-DETR, for the SDD of aero-engine blades. To our knowledge, this paper is the first to introduce the DETR detector to the SDD of aero-engine blades. Although DETR provides high accuracy, its inference speed remains slow because of its self-attention operations and feed-forward network (FFN). Therefore, two lightweight modules are designed for the SDD of aero-engine blades: a progressive feature input multi-scale deformable attention module (PFI-MSDA) and a lightweight FFN (LW-FFN). PFI-MSDA hierarchically reduces the number of tokens input to the self-attention module, thereby reducing the time complexity of the self-attention layer. LW-FFN reduces the complexity of the multilayer perceptron. In addition, the detection heads do not share parameters, which compensates for the accuracy drop caused by the lightweight design. Experiments verify that our method achieves the same AP and F1-score as DINO (a DETR-based detector) while being lighter. Compared with DINO, the FLOPs are reduced by 113.4 G, the inference speed is increased by 42.4%, and the runtime memory usage is reduced by 5.9 G, which allows our method to be trained on low-end GPUs with larger batch sizes, further improving training efficiency. The code is available at https://github.com/VDT-2048/SDD-DETR.
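To make the two lightweight ideas in the abstract more concrete, the following is a rough back-of-the-envelope sketch, not the authors' implementation. It assumes that encoder cost grows roughly in proportion to the number of tokens each layer processes, and that an FFN consists of two linear layers. The image size, feature strides, per-layer scale schedule, and hidden widths are all hypothetical values chosen for illustration; none of them come from the paper.

```python
def multiscale_tokens(height, width, strides):
    """Token count of each feature map, ordered from finest to coarsest stride."""
    return [(height // s) * (width // s) for s in strides]


def token_layers(tokens_per_scale, scales_per_layer):
    """Total tokens processed over all encoder layers.

    scales_per_layer[i] is how many of the coarsest scales are fed to
    layer i (a hypothetical stand-in for progressive feature input).
    """
    total = 0
    for k in scales_per_layer:
        total += sum(tokens_per_scale[-k:])
    return total


def ffn_flops(num_tokens, d_model, d_hidden):
    """Approximate FLOPs of a two-layer FFN (2 FLOPs per multiply-add)."""
    return 2 * num_tokens * (d_model * d_hidden + d_hidden * d_model)


# Hypothetical setup: 640x640 input, four feature scales, six encoder layers.
tokens = multiscale_tokens(640, 640, strides=(8, 16, 32, 64))

# Baseline: every layer sees all four scales.
baseline = token_layers(tokens, [4, 4, 4, 4, 4, 4])

# Assumed progressive schedule: early layers see only coarse scales.
progressive = token_layers(tokens, [1, 1, 2, 2, 3, 4])

# Assumed FFN widths: a standard 2048-wide hidden layer vs. a narrower one.
standard_ffn = ffn_flops(sum(tokens), d_model=256, d_hidden=2048)
narrow_ffn = ffn_flops(sum(tokens), d_model=256, d_hidden=512)

print("tokens per scale:", tokens)
print("token-layers, baseline vs. progressive:", baseline, progressive)
print("FFN FLOPs, standard vs. narrow:", standard_ffn, narrow_ffn)
```

Under these assumptions, feeding fine-scale features only to the later layers removes most of the tokens from the early layers, and narrowing the FFN hidden width scales its cost down linearly. The actual PFI-MSDA and LW-FFN designs may differ; the released repository is the authoritative source.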
Note to Practitioners: The motivation for this paper is to design a visual detection method with both high precision and high inference speed for the SDD of aero-engine blades. Most high-precision vision methods are based on the transformer framework. However, their high complexity and poor compatibility with deployment environments lead to slower detection speeds. Although this paper targets aero-engine blades, the method is also applicable to other industrial fields, such as rail detection and plate and strip steel detection. However, because of the deformable attention operator, the proposed method is not supported by deployment environments such as RKNN, so deploying it and putting it into practical use will take a certain amount of time. Current large vision and language models are built on transformers, which is consistent with the framework of our method and makes it easier to extend our approach to large vision and multi-modal models.