YOLO-SD: simulated feature fusion for few-shot industrial defect detection based on YOLOv8 and Stable Diffusion

Yihao Wen, Li Wang
DOI: https://doi.org/10.1007/s13042-024-02175-7
2024-05-02
International Journal of Machine Learning and Cybernetics
Abstract: Defect detection from images, an important application of the industrial internet, has attracted growing attention because of its close link to product quality in industrial production. However, two major challenges persist: (1) Limited availability of datasets. Deep learning-based models typically require large-scale training sets to achieve satisfactory detection results. (2) Insufficient image quality and low detection accuracy. Many object detection methods, when applied to industrial defect detection, perform poorly on unclear boundaries, complex backgrounds, noise, and textures. In this study, we propose a defect detection method based on YOLO and Stable Diffusion (YOLO-SD). To address the few-shot dataset problem, we design a controllable generation module that integrates CLIP, LoRA, and ControlNet on top of Stable Diffusion. CLIP text inversion generates the most suitable prompt words from the defect dataset, providing the prompt input for Stable Diffusion. LoRA adjusts the image style of Stable Diffusion through fine-tuning on the defect dataset. ControlNet obtains boundary and depth maps through HED and MiDaS. To address insufficient image quality and low detection accuracy, we build an improved YOLO model with an attention-based Fusion Simulated Feature (FSF) module that extracts defect features from both the original and the generated images, providing richer semantic information to improve detection accuracy. In addition, to keep the model lightweight, we introduce a test optimization strategy to improve the model training process. Extensive experiments on the NEU-DET steel defect dataset show that the images generated by our method can be used to expand the training set and that doing so yields an improvement in defect detection.
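The abstract does not specify how the FSF module weights features from original and generated images. As a rough illustration only, the sketch below shows one common form of attention-based fusion: each channel of the two feature maps is summarized by global average pooling, a softmax over the two summaries gives per-channel weights, and the fused channel is the weighted sum. The function names `softmax` and `fuse_features`, the pooling choice, and the list-based feature layout are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(original, generated):
    """Hypothetical channel-wise attention fusion (not the paper's FSF module).

    original, generated: lists of channels; each channel is a non-empty list
    of floats (spatial positions flattened). Both inputs must share one shape.
    """
    if len(original) != len(generated):
        raise ValueError("feature maps must have the same number of channels")
    fused = []
    for ch_o, ch_g in zip(original, generated):
        if len(ch_o) != len(ch_g):
            raise ValueError("channels must have the same spatial size")
        # Global average pooling: one scalar descriptor per branch.
        d_o = sum(ch_o) / len(ch_o)
        d_g = sum(ch_g) / len(ch_g)
        # Softmax over the two descriptors gives weights that sum to 1.
        w_o, w_g = softmax([d_o, d_g])
        # Weighted sum of the two branches at every spatial position.
        fused.append([w_o * a + w_g * b for a, b in zip(ch_o, ch_g)])
    return fused
```

Calling `fuse_features(real_feats, synthetic_feats)` on two equally shaped feature maps returns a map of the same shape. In a real detector the weights would typically come from learned layers rather than from raw pooled activations, so this sketch only conveys the data flow.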
computer science, artificial intelligence