Abstract: This paper investigates the 3D domain generalization (3DDG) ability of large 3D models based on prevalent prompt learning. Recent works demonstrate that the performance of 3D point cloud recognition can be boosted remarkably by parameter-efficient prompt tuning. However, we observe that the improvement on downstream tasks comes at the expense of a severe drop in 3D domain generalization. To resolve this challenge, we present a comprehensive regulation framework that allows the learnable prompts to actively interact with the well-learned general knowledge in large 3D models to maintain good generalization. Specifically, the proposed framework imposes multiple explicit constraints on the prompt learning trajectory by maximizing the mutual agreement between task-specific predictions and task-agnostic knowledge. We design the regulation framework as a plug-and-play module to embed into existing representative large 3D models. Surprisingly, our method not only consistently improves generalization ability but also enhances task-specific 3D recognition performance across various 3DDG benchmarks by a clear margin. Considering the lack of study and evaluation on 3DDG, we also create three new benchmarks, namely base-to-new, cross-dataset and few-shot generalization benchmarks, to enrich the field and inspire future research. Code and benchmarks are available at https://github.com/auniquesun/Point-PRC.

What problem does this paper attempt to address?

The main problem this paper attempts to address is the insufficient domain generalization (DG) capability of 3D point cloud models. Specifically, although lightweight prompt tuning can significantly enhance the performance of 3D point cloud recognition tasks, this improvement often comes at the cost of the model's generalization ability in unseen domains (such as new classes, cross-dataset, and few-shot scenarios). The paper proposes a comprehensive regulation framework (Point-PRC) aimed at actively interacting with the general knowledge in large 3D pre-trained models, while simultaneously improving task-specific performance and task-agnostic generalization ability.

### Main Problems:

1. **Overfitting Issue Brought by Lightweight Prompt Tuning**: While lightweight prompt tuning can significantly enhance performance on specific tasks, this improvement usually leads to a decline in the model's generalization ability in unseen domains.
2. **Lack of Systematic Research and Evaluation Benchmarks for 3D Domain Generalization**: Existing 3D domain generalization research and evaluation benchmarks are not comprehensive enough to fully assess the model's generalization ability in real-world scenarios.

### Solutions:

1. **Propose the Point-PRC Framework**: This framework includes three core components:
   - **Mutual Agreement Constraint**: Ensures that the learned prompts are consistent with the general knowledge of the pre-trained model, avoiding the forgetting of task-agnostic knowledge.
   - **Text Diversity Constraint**: Utilizes diverse text descriptions to guide lightweight prompt tuning, enhancing the model's transferability.
   - **Model Ensemble Constraint**: Synthesizes the opinions of different models through weighted voting, avoiding extreme cases and failure scenarios of a single model.
2. **Create New Evaluation Benchmarks**: To more comprehensively evaluate 3D domain generalization ability, the paper creates three new benchmarks:
   - **Base-to-New Generalization**: Evaluates the model's performance on unseen new classes.
   - **Cross-Dataset Generalization**: Assesses the model's generalization ability across different datasets.
   - **Few-Shot Generalization**: Evaluates the model's generalization ability under extremely low data conditions.

### Experimental Results:

- **Base-to-New Generalization**: Experimental results show that after incorporating the Point-PRC framework, the model's recognition accuracy on unseen new classes significantly improves, while performance on base classes also gets better.
- **Cross-Dataset Generalization**: In scenarios such as OOD generalization and data corruption, the Point-PRC framework also performs excellently, enhancing the model's performance in different target domains.
- **Few-Shot Generalization**: Under few-shot conditions, the Point-PRC framework effectively improves the model's generalization ability.

In summary, by proposing the Point-PRC framework, this paper not only addresses the overfitting issue brought by lightweight prompt tuning but also advances the research in the 3D domain generalization field by creating new evaluation benchmarks.
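To make the mutual agreement and model ensemble constraints described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names (`mutual_agreement_penalty`, `weighted_vote`), the choice of a symmetric KL divergence as the agreement measure, and the example weights are all hypothetical. The summary only says that prompted predictions should agree with the pre-trained model's general knowledge and that several models are combined by weighted voting.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_agreement_penalty(prompted_logits, frozen_logits):
    """Hypothetical agreement term (assumption: symmetric KL).

    Small when the prompt-tuned model's prediction matches the frozen
    pre-trained model's prediction, larger as the two drift apart.
    """
    p = softmax(prompted_logits)
    q = softmax(frozen_logits)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def weighted_vote(prob_list, weights):
    """Hypothetical ensemble: weighted average of per-model class probabilities."""
    total_w = sum(weights)
    n_classes = len(prob_list[0])
    return [
        sum(w * probs[c] for probs, w in zip(prob_list, weights)) / total_w
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Identical predictions incur no penalty; diverging ones are penalized.
    print(mutual_agreement_penalty([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(mutual_agreement_penalty([3.0, 0.0, 0.0], [0.0, 0.0, 3.0]))
    # Two models disagree; the more heavily weighted model dominates the vote.
    print(weighted_vote([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0]))
```

In a training loop, a term like this would typically be added to the task loss with a weighting coefficient (also an assumption here), so that prompts are pulled toward the downstream task without drifting far from the frozen model's predictions.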

Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis

Point-to-Pixel Prompting for Point Cloud Analysis With Pre-Trained Image Models

Push-and-Pull: A General Training Framework with Differential Augmentor for Domain Generalized Point Cloud Classification

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training

DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration

PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis

PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model

PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training

PointCG: Self-supervised Point Cloud Learning via Joint Completion and Generation

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models

A Unified Framework for 3D Point Cloud Visual Grounding

Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

RegGeoNet: Learning Regular Representations for Large-Scale 3D Point Clouds

Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

Robust 3D Point Cloud Recognition: Enhancing Robustness with GPT-4 and CLIP Integration

PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning

Once-Training-All-Fine: No-Reference Point Cloud Quality Assessment via Domain-relevance Degradation Description

DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding

Masked Local-Global Representation Learning for 3D Point Cloud Domain Adaptation