A Survey on RGB, 3D, and Multimodal Approaches for Unsupervised Industrial Anomaly Detection

Yuxuan Lin, Yang Chang, Xuan Tong, Jiawen Yu, Antonio Liotta, Guofan Huang, Wei Song, Deyu Zeng, Zongze Wu, Yan Wang, Wenqiang Zhang
2024-10-29
Abstract: In the advancement of industrial informatization, Unsupervised Industrial Anomaly Detection (UIAD) technology effectively overcomes the scarcity of abnormal samples and significantly enhances the automation and reliability of smart manufacturing. While RGB, 3D, and multimodal anomaly detection have demonstrated comprehensive and robust capabilities within the industrial informatization sector, existing reviews on industrial anomaly detection have not sufficiently classified and discussed methods in 3D and multimodal settings. We focus on 3D UIAD and multimodal UIAD, providing a comprehensive summary of unsupervised industrial anomaly detection in three modal settings. Firstly, we compare our survey with recent works and introduce commonly used datasets, evaluation metrics, and the definitions of anomaly detection problems. Secondly, we summarize five research paradigms in RGB, 3D, and multimodal UIAD and three emerging industrial manufacturing optimization directions in RGB UIAD, and review three multimodal feature fusion strategies in multimodal settings. Finally, we outline the primary challenges currently faced by UIAD in three modal settings and offer insights into future development directions, aiming to provide researchers with a thorough reference and offer new perspectives for the advancement of industrial informatization. Corresponding resources are available at https://github.com/Sunny5250/Awesome-Multi-Setting-UIAD.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the challenges that Unsupervised Industrial Anomaly Detection (UIAD) faces in industrial scenarios. Specifically, the authors focus on how to use RGB images, 3D point clouds, and multimodal information to achieve effective unsupervised industrial anomaly detection. The key problems the paper addresses are:

1. **Scarcity of abnormal samples**: In real industrial environments, anomalies are relatively rare, so it is difficult to collect enough labeled abnormal samples for supervised learning. A method is therefore needed that can identify anomalies with no labeled data or only a small amount of it.
2. **Diversity of abnormal types**: Industrial anomalies vary widely in type and form, and it is difficult to predefine or label every possible type. By learning the distribution of normal data, unsupervised methods can identify new types of anomalies that do not appear in the training data, which gives them better generalization ability.
3. **Deficiencies in existing reviews**: Existing reviews on industrial anomaly detection have not fully classified and discussed methods in 3D and multimodal settings. This paper aims to fill that gap with a comprehensive summary of research progress in RGB, 3D, and multimodal UIAD.
4. **Multimodal information fusion**: Single-modal information (such as RGB images) may not fully reflect the operating conditions of complex industrial systems, especially in highly complex and changing industrial environments. Fusing multimodal information can capture the system state more comprehensively and improve the accuracy and robustness of anomaly detection.
5. **Challenges in practical applications**: Under each modal setting (RGB, 3D, and multimodal), UIAD methods face challenges in data processing, algorithm design, and practical deployment. The paper summarizes the deficiencies of existing research and also proposes future research directions, providing a new perspective for the development of industrial informatization.

### Formula Representation

The key formulas in the paper and their explanations are as follows:

- **Optimization problem**:
  \[ \theta^*=\arg\min_{\theta}\mathcal{L}(f(x\in\mathcal{X};\theta)) \]
  where $\theta$ represents the model parameters, $\theta^*$ represents the optimal model parameters, $x$ ranges over the training set $\mathcal{X}$ of normal samples, and $\mathcal{L}$ is the loss function for learning the normal data representation.
- **Anomaly score calculation**:
  \[ s_{test}=f(x_{test};\theta^*) \]
  where $s_{test}$ is the anomaly score of the test sample $x_{test}$, reflecting the degree to which the sample deviates from the learned normal data distribution.
- **Evaluation metrics**, where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives:
  - **Precision**: \[ P = \frac{TP}{TP + FP} \]
  - **Recall**: \[ R=\frac{TP}{TP + FN} \]
  - **F1-score**: \[ F1=\frac{2(P\times R)}{P + R} \]

These formulas summarize the theoretical basis and evaluation methods of unsupervised industrial anomaly detection as presented in the paper.
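To make the formulas concrete, the following minimal Python sketch walks through the same three steps on toy data: fitting on normal samples only, scoring test samples, and computing precision, recall, and F1. It is an illustration under stated assumptions and not a method from the survey: the model $f$ is assumed to be the Euclidean distance to the mean of the normal features (the mean is the closed-form minimizer of the mean squared distance), and the function names, the 2D toy features, and the threshold of 2.0 are all hypothetical.

```python
import numpy as np


def fit_normal_model(x_train: np.ndarray) -> np.ndarray:
    """Return theta* = argmin_theta mean ||x - theta||^2 over normal samples.

    Assumption: a toy stand-in for the loss L; its minimizer is the mean.
    """
    return x_train.mean(axis=0)


def anomaly_score(x: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """s = f(x; theta*): distance of each sample from the normal centre."""
    return np.linalg.norm(x - theta, axis=1)


def precision_recall_f1(y_true, y_pred):
    """Compute P, R, and F1 from binary labels (1 = anomalous)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # anomalies correctly flagged
    fp = int(np.sum(~y_true & y_pred))   # normal samples wrongly flagged
    fn = int(np.sum(y_true & ~y_pred))   # anomalies that were missed
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Training uses normal samples only (hypothetical 2D feature vectors).
x_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta_star = fit_normal_model(x_train)

# Test set mixes normal (0) and anomalous (1) samples.
x_test = np.array([[0.5, 0.5], [0.6, 0.4], [3.5, 4.5], [0.5, 2.0], [4.5, 0.5]])
y_test = np.array([0, 0, 1, 1, 1])

s_test = anomaly_score(x_test, theta_star)

# Assumed threshold; in practice it would be chosen on validation data.
threshold = 2.0
y_pred = s_test > threshold

p, r, f1 = precision_recall_f1(y_test, y_pred)
print(f"scores={np.round(s_test, 3)}, P={p:.2f}, R={r:.2f}, F1={f1:.2f}")
```

In this toy setup the fourth test sample is anomalous but scores below the threshold, so it counts as a false negative. Recall drops while precision stays perfect, which is the trade-off the F1-score is meant to summarize.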