PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model

Hao Yang, Qianyu Zhou, Haijia Sun, Xiangtai Li, Fengqi Liu, Xuequan Lu, Lizhuang Ma, Shuicheng Yan
2024-08-24
Abstract: Domain Generalization (DG) has been recently explored to improve the generalizability of point cloud classification (PCC) models toward unseen domains. However, they often suffer from limited receptive fields or quadratic complexity due to the use of convolution neural networks or vision Transformers. In this paper, we present the first work that studies the generalizability of state space models (SSMs) in DG PCC and find that directly applying SSMs into DG PCC will encounter several challenges: the inherent topology of the point cloud tends to be disrupted and leads to noise accumulation during the serialization stage. Besides, the lack of designs in domain-agnostic feature learning and data scanning will introduce unanticipated domain-specific information into the 3D sequence data. To this end, we propose a novel framework, PointDGMamba, that excels in strong generalizability toward unseen domains and has the advantages of global receptive fields and efficient linear complexity. PointDGMamba consists of three innovative components: Masked Sequence Denoising (MSD), Sequence-wise Cross-domain Feature Aggregation (SCFA), and Dual-level Domain Scanning (DDS). In particular, MSD selectively masks out the noised point tokens of the point cloud sequences, SCFA introduces cross-domain but same-class point cloud features to encourage the model to learn how to extract more generalized features. DDS includes intra-domain scanning and cross-domain scanning to facilitate information exchange between features. In addition, we propose a new and more challenging benchmark PointDG-3to1 for multi-domain generalization. Extensive experiments demonstrate the effectiveness and state-of-the-art performance of our presented PointDGMamba.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to improve how well point cloud classification (PCC) models generalize to unseen domains. Existing PCC methods usually degrade on unseen domains because of domain shift, caused for example by differences in sensor type, scanning angle, and environmental conditions. The degradation matters most when a model must generalize from multiple source domains to one or more target domains. To address this, the authors propose PointDGMamba, a framework built on the state space model (SSM) and designed to improve domain generalization in PCC. PointDGMamba has three components:
1. **Masked Sequence Denoising (MSD)**: selectively masks noisy point tokens in the point cloud sequence, reducing the noise that accumulates during the serialization stage. The masking preserves the essential features of the point cloud, so the denoised sequence still represents the original structure.
2. **Sequence-wise Cross-domain Feature Aggregation (SCFA)**: aggregates point cloud features that come from different domains but share the same class, encouraging the model to extract more generalized features. A Global Prompt is introduced to further keep unanticipated domain-specific information out of the sequence data.
3. **Dual-level Domain Scanning (DDS)**: combines intra-domain scanning and cross-domain scanning so that information is exchanged sufficiently between different parts of the features. This design helps convert 3D point cloud data into 1D sequence data suitable for the Mamba model, particularly when the unseen domains vary.
With these components, PointDGMamba achieves state-of-the-art results across multiple benchmarks, including PointDG-3to1, the more challenging multi-domain generalization benchmark proposed by the authors.
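To make the serialization, masking, and linear-time scanning ideas more concrete, here is a minimal Python sketch. It is not the authors' implementation: every function name is hypothetical, the scan order (sorting along one axis), the noise score (distance from the centroid), and the fixed-decay recurrence are simplifying assumptions chosen only for illustration, and the learned components of PointDGMamba are not modeled.

```python
from math import dist


def serialize_by_axis(points, axis=0):
    """Order point indices along one coordinate axis (assumed scan order)."""
    return sorted(range(len(points)), key=lambda i: points[i][axis])


def outlier_scores(points):
    """Assumed noise score: distance of each point from the centroid."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    return [dist(p, centroid) for p in points]


def mask_noisy_tokens(tokens, scores, mask_ratio=0.25):
    """Zero out the highest-scoring tokens; the sequence length is unchanged."""
    n_mask = int(len(tokens) * mask_ratio)
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    masked = set(ranked[:n_mask])
    return [[0.0] * len(t) if i in masked else list(t)
            for i, t in enumerate(tokens)]


def linear_scan(tokens, decay=0.5):
    """One-pass recurrence h_t = decay * h_{t-1} + (1 - decay) * x_t.

    Each token is visited once, so the cost grows linearly with sequence
    length, unlike pairwise attention, which grows quadratically.
    """
    state = [0.0] * len(tokens[0])
    outputs = []
    for tok in tokens:
        state = [decay * h + (1.0 - decay) * x for h, x in zip(state, tok)]
        outputs.append(state)
    return outputs


# Toy point cloud with one obvious outlier at (5, 5, 5).
points = [(0.9, 0.1, 0.0), (0.1, 0.0, 0.2), (5.0, 5.0, 5.0), (0.5, 0.2, 0.1)]
order = serialize_by_axis(points)        # 3D points -> 1D scan order
sequence = [points[i] for i in order]    # serialized token sequence
scores = outlier_scores(sequence)
denoised = mask_noisy_tokens(sequence, scores, mask_ratio=0.25)
features = linear_scan(denoised)
```

In this toy run the outlier ends up last in the scan order and is the one token that gets zeroed, so it no longer contributes to the running state. A real model would learn which tokens to mask and would use input-dependent state updates rather than a fixed decay.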