MINet: Multi-scale Interactive Network for Real-time Salient Object Detection of Strip Steel Surface Defects

Kunye Shen, Xiaofei Zhou, Zhi Liu
DOI: https://doi.org/10.1109/TII.2024.3366221
2024-05-25
Abstract: Automated surface defect detection is a fundamental task in industrial production, and existing saliency-based works overcome challenging scenes and give promising detection results. However, cutting-edge methods often suffer from large parameter size, heavy computational cost, and slow inference speed, which heavily limit their practical application. To this end, we devise a multi-scale interactive (MI) module, which employs depthwise convolution (DWConv) and pointwise convolution (PWConv) to independently extract and interactively fuse features of different scales, respectively. In particular, the MI module can provide satisfactory characterization of defect regions with fewer parameters. Building on this module, we propose a lightweight Multi-scale Interactive Network (MINet) to conduct real-time salient object detection of strip steel surface defects. Comprehensive experimental results on the SD-Saliency-900 dataset, which contains three kinds of strip steel surface defect images (i.e., inclusion, patches, and scratches), demonstrate that the proposed MINet achieves detection accuracy comparable to state-of-the-art methods while running at a GPU speed of 721 FPS and a CPU speed of 6.3 FPS for 368×368 images with only 0.28M parameters. The code is available at https://github.com/Kunye-Shen/MINet.
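The throughput figures in the abstract can be restated as per-image latency, which is often the more useful number on a production line. The sketch below does this conversion; it assumes images are processed one at a time (the abstract does not state the batch size), and the helper name `latency_ms` is purely illustrative.

```python
def latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame.

    Assumes frames are processed sequentially, one at a time.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Speeds reported in the abstract for 368x368 inputs.
gpu_ms = latency_ms(721)   # roughly 1.39 ms per image
cpu_ms = latency_ms(6.3)   # roughly 158.7 ms per image

print(f"GPU: {gpu_ms:.2f} ms/image, CPU: {cpu_ms:.1f} ms/image")
```

Under that assumption, the GPU figure corresponds to well under 2 ms per image, while the CPU figure corresponds to roughly a sixth of a second per image.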
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several key issues in strip steel surface defect detection:

1. **Real-time Performance**: Existing saliency-based defect detection methods perform well in some scenarios but often have large parameter sizes, high computational costs, and slow inference speeds. These factors severely limit their use in actual industrial production. The paper therefore proposes a lightweight Multi-scale Interactive Network (MINet) to achieve real-time detection of strip steel surface defects.
2. **Detection Accuracy**: In strip steel surface defect detection, low contrast between defects and background, diverse defect shapes and scales, and poor lighting conditions make it difficult to improve detection accuracy. The paper designs a multi-scale interactive module (MI module) that extracts and fuses features of different scales to improve detection accuracy.
3. **Computational Efficiency**: Industrial equipment has limited storage and computational capability, so the model must have low computational overhead. The paper uses depthwise separable convolution (DSConv) and multi-scale strategies to significantly reduce computational cost while maintaining high detection performance.

### Main Contributions

1. **Multi-scale Interactive Module (MI Module)**: The paper proposes a new multi-scale interactive module that embeds multi-scale strategies into DSConv. Depthwise convolution (DWConv) extracts features at each scale independently, and pointwise convolution (PWConv) aggregates them across scales, so the module can describe defect regions effectively.
2. **Real-time Backbone Network**: Based on the MI module, the paper constructs a real-time backbone network consisting of five encoder blocks, with MI modules arranged linearly within each block to achieve low latency. This backbone extracts multi-scale contextual features.
3. **Lightweight Saliency Model (MINet)**: The paper proposes a lightweight Multi-scale Interactive Network (MINet) for real-time strip steel surface defect detection. MINet adopts an encoder-decoder architecture, with the encoder built from MI modules and the decoder containing multiple DSConv layers. Experimental results show that MINet achieves a speed of 721 FPS on an NVIDIA RTX 2080 Ti GPU and 6.3 FPS on an i9-9900X CPU, with only 0.28M parameters and 0.30G FLOPs.

### Experimental Validation

The paper conducts comprehensive experiments on the SD-Saliency-900 dataset, which contains three types of strip steel surface defects (inclusions, patches, and scratches). The results show that MINet achieves detection accuracy comparable to existing state-of-the-art methods while running faster and using fewer parameters.

### Conclusion

By designing a multi-scale interactive module and a real-time backbone network, the paper addresses the issues of real-time performance and computational efficiency in strip steel surface defect detection, supporting practical application in industrial production.
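To illustrate why depthwise separable convolution reduces model size, the following sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a pointwise convolution. It covers generic DSConv only, not the paper's MI module, whose exact layout is not given here. The channel width (64) and kernel size (3) are illustrative assumptions rather than values from the paper, and the function names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dsconv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias omitted).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer size, NOT taken from the paper.
c_in, c_out, k = 64, 64, 3

standard = conv_params(c_in, c_out, k)     # 64 * 64 * 9 = 36864
separable = dsconv_params(c_in, c_out, k)  # 64 * 9 + 64 * 64 = 4672
ratio = separable / standard               # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights ({ratio:.1%} of standard)")
```

For this example layer the separable form needs about 13% of the weights of the standard convolution, and the ratio `1/c_out + 1/k**2` shows that the saving grows with both the output width and the kernel size.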