Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang
Abstract: Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Despite In-Context Learning (ICL) showcasing multi-task learning capability, it usually relies on high-quality context-rich data and considers a single dataset, and has rarely been studied in point cloud understanding. In this paper, we introduce a novel, practical, multi-domain multi-task setting, handling multiple domains and multiple tasks within one unified model for domain generalized point cloud understanding. To this end, we propose Domain Generalized Point-In-Context Learning (DG-PIC) that boosts the generalizability across various tasks and domains at testing time. In particular, we develop dual-level source prototype estimation that considers both global-level shape contextual and local-level geometrical structures for representing source domains and a dual-level test-time feature shifting mechanism that leverages both macro-level domain semantic information and micro-level patch positional relationships to pull the target data closer to the source ones during the testing. Our DG-PIC does not require any model updates during the testing and can handle unseen domains and multiple tasks, i.e., point cloud reconstruction, denoising, and registration, within one unified model. We also introduce a benchmark for this new setting. Comprehensive experiments demonstrate that DG-PIC outperforms state-of-the-art techniques significantly.
What problem does this paper attempt to address?
The paper addresses the performance drop that point cloud understanding models suffer on unseen data. Because data distributions differ across domains (the domain gap), models trained on one dataset generalize poorly to new ones. In addition, most existing methods are designed for a single task and ignore the potential value of test data. To address these problems, the paper proposes a novel multi-domain, multi-task setting and introduces a framework named DG-PIC (Domain Generalized Point-In-Context Learning).
### Specific description of the problems
1. **Domain gap**: Existing point cloud understanding methods are usually trained and tested on a single dataset, so they struggle with unseen data from a different domain (e.g., synthetic versus real-world data). For example, a model trained on ModelNet40 (synthetic data) may handle the complex, noisy data in ScanObjectNN (real-world data) poorly.
2. **Single-task limitation**: Most existing domain generalization (DG) methods learn domain-invariant features but are designed for a single task. Handling several tasks (such as reconstruction, denoising, and registration) therefore requires training a separate model for each one, which is inefficient and inflexible.
3. **Ignoring the value of test data**: Many existing DG methods mainly focus on learning during training and ignore the potential of test data as a valuable resource, thus affecting the generalization ability of the model.
### DG-PIC solutions
To address the above challenges, DG-PIC proposes the following innovations:
1. **Multi-domain, multi-task setting**: DG-PIC uses one unified model to handle multiple domains and multiple tasks, which improves both generalization and flexibility.
2. **Dual-level source prototype estimation**: DG-PIC represents each source domain with prototypes estimated at two levels: global-level shape context and local-level geometric structure.
3. **Dual-level test-time feature shifting**: This mechanism uses macro-level domain semantic information and micro-level patch positional relationships to pull target data closer to the source domains, improving generalization without updating the model at test time.
4. **New benchmark**: To evaluate this new setting, the authors introduce a benchmark comprising four datasets (two synthetic and two real-world) and generate the corresponding ground truths for three tasks (reconstruction, denoising, and registration).
With these innovations, the authors report that DG-PIC significantly outperforms state-of-the-art methods in the multi-domain, multi-task setting.
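
The two dual-level components described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each sample is already encoded as a fixed number of patch feature vectors, that a global feature is the mean of a sample's patch features, that the macro level simply picks the nearest source domain, and that the micro level interpolates each target patch toward the source prototype with the same patch index. The function names, the `alpha` weight, and the domain labels `"synthetic"` and `"real"` are hypothetical.

```python
from math import dist


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def estimate_prototypes(samples):
    """Estimate dual-level prototypes for one source domain.

    samples: list of samples, each a list of P patch feature vectors.
    Returns (global_prototype, local_prototypes).
    """
    # Global level (assumption): mean-pool each sample, then average over samples.
    global_features = [mean_vector(patches) for patches in samples]
    global_prototype = mean_vector(global_features)
    # Local level (assumption): average patches that share the same index.
    local_prototypes = [mean_vector(same_index) for same_index in zip(*samples)]
    return global_prototype, local_prototypes


def shift_target(target_patches, prototypes, alpha=0.5):
    """Pull target patch features toward the closest source domain.

    prototypes: dict mapping a domain name to the tuple returned by
    estimate_prototypes. No model weights are updated.
    """
    # Macro level (assumption): choose the source domain whose global
    # prototype is nearest to the target's mean-pooled feature.
    target_global = mean_vector(target_patches)
    nearest = min(prototypes, key=lambda name: dist(target_global, prototypes[name][0]))
    _, local_prototypes = prototypes[nearest]
    # Micro level (assumption): interpolate each patch toward the local
    # prototype at the same patch index.
    shifted = [
        [(1 - alpha) * t + alpha * p for t, p in zip(patch, proto)]
        for patch, proto in zip(target_patches, local_prototypes)
    ]
    return nearest, shifted


# Toy usage with two hypothetical source domains and 2-D patch features.
sources = {
    "synthetic": estimate_prototypes(
        [[[0.0, 0.0], [2.0, 0.0]], [[0.0, 2.0], [2.0, 2.0]]]
    ),
    "real": estimate_prototypes([[[10.0, 10.0], [12.0, 10.0]]]),
}
nearest, shifted = shift_target([[1.0, 1.0], [3.0, 1.0]], sources, alpha=0.5)
```

In this toy case the target is matched to the `"synthetic"` domain and each patch moves halfway toward that domain's local prototype. The paper's actual mechanism relies on domain semantic information and patch positional relationships, which this sketch reduces to a nearest-prototype lookup and index matching.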