AENet: attention enhancement network for industrial defect detection in complex and sensitive scenarios

Yi Wan, Lingjie Yi, Bo Jiang, Junfan Chen, Yi Jiang, Xianzhong Xie
DOI: https://doi.org/10.1007/s11227-024-05898-0
IF: 3.3
2024-02-04
The Journal of Supercomputing
Abstract: Conventional image processing and machine learning methods based on handcrafted features struggle to meet the real-time and high-accuracy requirements of industrial defect detection in complex, sensitive, and dynamic environments. To address this issue, this paper proposes AENet, a novel real-time defect detection network based on an encoder-decoder model, which achieves high detection accuracy and efficiency while demonstrating excellent convergence and generalization. First, a spatial channel attention module in the encoding network is designed to exploit both spatial attention and channel attention using a multi-head 3D self-attention mechanism, which improves parallelism and detection efficiency. Second, the decoding network of AENet incorporates a cross-level attention fusion module, which fuses input features from different layers. Combined with a multi-level upsampling design, the decoder enhances the representation of defect details. Furthermore, we insert a simplified aggregator into the encoder-decoder network to extract feature information at different scales at low computational cost. By incorporating contextual information, this aggregation process aids training and inference on industrial defect datasets. Extensive experimental results demonstrate that AENet outperforms other segmentation models at defect recognition and segmentation in challenging optical environments. It converges faster than other networks and balances accuracy and speed. On an NVIDIA Tesla V100 GPU, it achieves a recognition accuracy of over 96% for almost all types of defects in an actual industrial environment.
computer science, theory & methods, engineering, electrical & electronic, hardware & architecture