Towards Zero-shot Point Cloud Anomaly Detection: A Multi-View Projection Framework

Yuqi Cheng, Yunkang Cao, Guoyang Xie, Zhichao Lu, Weiming Shen

2024-09-20

Abstract: Detecting anomalies within point clouds is crucial for various industrial applications, but traditional unsupervised methods face challenges due to data acquisition costs, early-stage production constraints, and limited generalization across product categories. To overcome these challenges, we introduce the Multi-View Projection (MVP) framework, which leverages pre-trained Vision-Language Models (VLMs) to detect anomalies. Specifically, MVP projects point cloud data into multi-view depth images, thereby translating point cloud anomaly detection into image anomaly detection. Following zero-shot image anomaly detection methods, pre-trained VLMs are then used to detect anomalies on these depth images. Because pre-trained VLMs are not inherently tailored for zero-shot point cloud anomaly detection and may lack specificity, we propose integrating learnable visual and adaptive text prompting techniques to fine-tune these VLMs, thereby enhancing their detection performance. Extensive experiments on the MVTec 3D-AD and Real3D-AD datasets demonstrate the superior zero-shot anomaly detection performance of the proposed MVP framework and the effectiveness of the prompting techniques. Real-world evaluations on automotive plastic part inspection further show that the proposed method generalizes to practical unseen scenarios. The code is available at https://github.com/hustCYQ/MVP-PCLIP.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper aims to address several key issues in point cloud anomaly detection:

1. **High data acquisition cost**: Traditional unsupervised methods require a large amount of normal point cloud data for training, but obtaining high-precision point cloud data is very expensive.
2. **Cold start problem**: In the early stages of production, there may not be enough normal point cloud data, making it difficult to apply unsupervised learning methods.
3. **Poor category generalization ability**: Unsupervised point cloud anomaly detection methods usually target specific product categories and generalize poorly to unseen categories.

To solve these problems, the authors propose a new framework, Multi-View Projection (MVP). The framework converts point cloud data into multi-view depth images and uses pre-trained Vision-Language Models (VLMs) for zero-shot anomaly detection. Specifically:

- MVP projects point cloud data into multi-view depth images, thus transforming the point cloud anomaly detection problem into an image anomaly detection problem.
- Pre-trained VLMs are used to detect anomalies in these depth images.
- Since VLMs are not specifically designed for zero-shot point cloud anomaly detection, the authors further introduce learnable Visual Prompts and Adaptive Text Prompts to fine-tune these VLMs and improve their detection performance.

Experimental results show that the MVP framework achieves strong zero-shot anomaly detection performance on the MVTec 3D-AD and Real3D-AD datasets, and real-world inspection of automotive plastic parts indicates that it also generalizes to unseen practical scenarios. Additionally, the proposed MVP-PCLIP method, which adds the learnable prompts, further improves detection performance.
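The projection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the orthographic camera, the rotation about the y-axis, the unit-sphere normalization, and the names `rotation_about_y`, `project_to_depth`, and `render_views` are all assumptions chosen for illustration. It only shows how an (N, 3) point cloud could be turned into a stack of depth images that an image-based detector might then consume.

```python
import numpy as np


def rotation_about_y(angle_rad):
    """3x3 rotation matrix about the y-axis (hypothetical view parametrization)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project_to_depth(points, rotation, size=64):
    """Orthographically project an (N, 3) point cloud to a size x size depth image.

    Assumption: the camera looks along -z, so a larger z means a nearer point.
    Pixel values lie in [0, 1]; 0 marks background (or the farthest possible point).
    """
    points = np.asarray(points, dtype=float)
    # Centre the cloud and scale it into the unit sphere so any rotation stays in view.
    centred = points - points.mean(axis=0)
    radius = np.linalg.norm(centred, axis=1).max()
    if radius > 0:
        centred = centred / radius
    rotated = centred @ rotation.T  # every coordinate now lies in [-1, 1]

    # Map x to columns and y to rows (image rows grow downwards).
    cols = np.rint((rotated[:, 0] + 1.0) / 2.0 * (size - 1)).astype(int)
    rows = np.rint((1.0 - rotated[:, 1]) / 2.0 * (size - 1)).astype(int)
    cols = np.clip(cols, 0, size - 1)
    rows = np.clip(rows, 0, size - 1)

    # Nearness in [0, 1]; keep the nearest point whenever several share a pixel.
    nearness = (rotated[:, 2] + 1.0) / 2.0
    depth = np.zeros((size, size))
    np.maximum.at(depth, (rows, cols), nearness)
    return depth


def render_views(points, n_views=6, size=64):
    """Render n_views depth images from evenly spaced rotations about the y-axis."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False)
    return np.stack(
        [project_to_depth(points, rotation_about_y(a), size) for a in angles]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(2000, 3))  # synthetic stand-in for a scanned part
    views = render_views(cloud, n_views=6, size=64)
    print(views.shape)
```

Each rendered view could then be passed to an image anomaly detector; how per-view scores are fused back onto the points is not covered by this sketch.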

PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection

Towards Zero-shot 3D Anomaly Localization

Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection

A Diffusion-Based Framework for Multi-Class Anomaly Detection

PTMNet: Pixel-Text Matching Network for Zero-Shot Anomaly Detection

Complementary Pseudo Multimodal Feature for Point Cloud Anomaly Detection

See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data

CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation

Point Cloud Video Anomaly Detection Based on Point Spatio-Temporal Auto-Encoder

Uni-3DAD: GAN-Inversion Aided Universal 3D Anomaly Detection on Model-free Products

AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection

Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding

AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

APRIL-GAN: A Zero-/Few-Shot Anomaly Classification and Segmentation Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on Zero-shot AD and 4th Place on Few-shot AD

VMAD: Visual-enhanced Multimodal Large Language Model for Zero-Shot Anomaly Detection

Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning

Toward Unsupervised 3D Point Cloud Anomaly Detection using Variational Autoencoder

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model