Domain Generalization for 6D Pose Estimation Through NeRF-based Image Synthesis

Antoine Legrand, Renaud Detry, Christophe De Vleeschouwer
2024-07-15
Abstract: This work introduces a novel augmentation method that increases the diversity of a train set to improve the generalization abilities of a 6D pose estimation network. For this purpose, a Neural Radiance Field is trained from synthetic images and exploited to generate an augmented set. Our method enriches the initial set by enabling the synthesis of images with (i) unseen viewpoints, (ii) rich illumination conditions through appearance extrapolation, and (iii) randomized textures. We validate our augmentation method on the challenging use-case of spacecraft pose estimation and show that it significantly improves the pose estimation generalization capabilities. On the SPEED+ dataset, our method reduces the error on the pose by 50% on both target domains.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses **domain generalization in 6D pose estimation**, with spacecraft pose estimation as the use case. Because real-world images are costly and complex to obtain, current pose estimation networks are trained mainly on synthetic images. Synthetic images cannot perfectly capture real-world lighting conditions or the real textures of objects. The result is a domain shift: when the network makes predictions on the target domain (such as real images), its accuracy drops significantly. To address this, the paper proposes a NeRF-based (Neural Radiance Field) image synthesis method that increases the diversity of the training set and thereby improves the generalization ability of the pose estimation network. The generated images cover more diverse viewpoints, lighting conditions and textures, which helps the network learn more robust features.

### Main contributions of the paper

1. **A new data augmentation method**: A NeRF is trained on synthetic images and used to generate more diverse training images, including unseen viewpoints, rich lighting conditions and randomized textures.
2. **Validation of the method**: Experiments on the SPEED+ dataset show that the method significantly improves pose estimation generalization and reduces the pose error by 50% on both target domains.
3. **Application to spacecraft pose estimation**: The method is shown to be effective on the challenging domain generalization task of spacecraft pose estimation.

### Method overview

- **Train NeRF**: First, train a NeRF model on the synthetic images so that it can render new images.
- **Generate augmented images**:
  - **Viewpoint coverage**: Render images from different viewpoints with the NeRF to increase the coverage of the SE(3) space.
  - **Appearance extrapolation**: Generate more diverse lighting conditions by interpolating or extrapolating the appearance embedding vectors of the NeRF.
  - **Texture randomization**: Generate more diverse textures by adding random noise to the weights of the color MLP.
- **Train pose estimation network**: Train the pose estimation network jointly on the original synthetic images and the augmented images generated by the NeRF.

### Experimental results

The paper evaluates the method on the SPEED+ dataset. The results show that the NeRF-generated augmented images significantly improve the generalization ability of the pose estimation network, especially on real images.

In conclusion, the NeRF-based image synthesis method mitigates the domain generalization problem in 6D pose estimation: on SPEED+, it reduces the pose error by 50% on both target domains.
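
The three augmentations listed in the method overview can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the uniform rotation sampling, the distance range, the linear blend of two embeddings, and the Gaussian noise model are all assumptions made for illustration, since the summary does not specify them. The sketch only produces the inputs that a NeRF renderer would consume (a pose, an appearance embedding, perturbed color-MLP weights); it does not render anything.

```python
import math
import random


def random_unit_quaternion(rng):
    """Draw a rotation uniformly at random as a unit quaternion (w, x, y, z).

    Normalizing four independent Gaussian samples gives a point that is
    uniformly distributed on the unit 3-sphere.
    """
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def sample_pose(rng, min_dist=3.0, max_dist=10.0):
    """Viewpoint coverage: sample a hypothetical camera-to-object pose.

    Assumption: the object sits on the optical axis at a distance drawn
    uniformly from [min_dist, max_dist]; both bounds are placeholders.
    """
    rotation = random_unit_quaternion(rng)
    translation = [0.0, 0.0, rng.uniform(min_dist, max_dist)]
    return rotation, translation


def extrapolate_appearance(emb_a, emb_b, alpha):
    """Appearance extrapolation: blend two appearance embeddings.

    alpha in [0, 1] interpolates between emb_a and emb_b; values outside
    that range extrapolate beyond the embeddings seen during training.
    """
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]


def perturb_weights(weights, sigma, rng):
    """Texture randomization: add zero-mean Gaussian noise to color-MLP weights.

    `weights` is a flat list standing in for one layer of the color MLP;
    the Gaussian noise model and its scale `sigma` are assumptions.
    """
    return [w + rng.gauss(0.0, sigma) for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    rotation, translation = sample_pose(rng)
    embedding = extrapolate_appearance([0.0, 2.0], [1.0, 4.0], alpha=1.5)
    noisy = perturb_weights([0.1, -0.2, 0.3], sigma=0.05, rng=rng)
    print(rotation, translation, embedding, noisy)
```

In a full pipeline, each sampled pose, embedding and perturbed weight set would be passed to the trained NeRF to render one augmented image, and the sampled pose would serve as that image's label.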