Steel surface defect detection based on self-supervised contrastive representation learning with matching metric

Xuejin Hu, Jing Yang, Fengling Jiang, Amir Hussain, Kia Dashtipour, Mandar Gogate
DOI: https://doi.org/10.1016/j.asoc.2023.110578
2023-07-02
Applied Soft Computing Journal
Abstract: Defect detection is crucial to quality control in industrial applications. Existing supervised methods rely heavily on large amounts of labeled data. However, labeled data remain scarce in some specialized fields, and producing them requires expensive manual annotation by professionals. In this paper, we construct a novel self-supervised steel surface defect detection model that learns better embedding feature representations of defects from large amounts of unlabeled data and achieves excellent results in downstream detection tasks. The image embedding strategies commonly used in self-supervised contrastive learning destroy the spatial structure of the image and are therefore not suitable for pre-training object detectors. To address this issue, we preserve the convolutional feature maps to mine robust data structures and local features, which enhances the representation capability of the upstream model and makes it suitable for transfer to object detection tasks. In addition, the random augmentations used in contrastive learning can introduce noise on datasets where multiple targets coexist; to eliminate this effect, the Earth Mover's Distance (EMD) metric is employed to evaluate the contrastive matching similarity. Finally, a Self-supervised Contrastive Representation Learning framework with EMD (SCRL-EMD) is trained on large-scale unlabeled data and then transferred to Faster R-CNN and RetinaNet, and its detection performance is validated on two public steel defect datasets. Comparative experimental results show that our method achieves better results than state-of-the-art approaches. Compared to the baseline model, it achieves mAP improvements of 4.1% and 6.8% on the two datasets, respectively. More importantly, a further improvement can be achieved on a smaller downstream dataset, revealing the potential of our method for exploiting more readily available unlabeled data.
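The idea of scoring two augmented views by matching their local features with EMD, rather than comparing pooled global embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each view is a small set of local feature vectors taken from a convolutional feature map, that every location carries equal weight, and that the ground cost is one minus cosine similarity. Under those assumptions the optimal transport plan is a one-to-one matching, so a brute-force search over permutations is enough for tiny inputs. The names `cosine_cost` and `emd_similarity` are hypothetical.

```python
import math
from itertools import permutations


def cosine_cost(u, v):
    """Return 1 - cosine similarity between two local feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def emd_similarity(feats_a, feats_b):
    """EMD-based similarity between two equally sized sets of local features.

    Assumption: every location has the same weight 1/n. In that special case
    an optimal transport plan is a one-to-one matching, so for tiny n it can
    be found by brute force over all permutations.
    """
    n = len(feats_a)
    assert n == len(feats_b), "this sketch needs equally sized sets"
    cost = [[cosine_cost(u, v) for v in feats_b] for u in feats_a]
    best = min(
        sum(cost[i][p[i]] for i in range(n)) / n
        for p in permutations(range(n))
    )
    return 1.0 - best  # higher means the two views match better


# Two views of the same 2x2 feature map (4 locations, 2 channels each);
# the second view holds the same local features in a shuffled spatial order.
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
view_b = [view_a[2], view_a[0], view_a[3], view_a[1]]

score = emd_similarity(view_a, view_b)
```

Because the matching is computed over local features, the shuffled view still scores as a near-perfect match, which is the kind of robustness to spatial changes that motivates a matching metric. A practical version would need non-uniform weights and a proper transport solver instead of the permutation search.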