Abstract: We address the problem of estimating the relative 6D pose, i.e., position and orientation, of a target spacecraft from a monocular image, a key capability for future autonomous Rendezvous and Proximity Operations. Due to the difficulty of acquiring large sets of real images, spacecraft pose estimation networks are exclusively trained on synthetic ones. However, because those images do not capture the illumination conditions encountered in orbit, pose estimation networks face a domain gap problem, i.e., they do not generalize to real images. Our work introduces a method that bridges this domain gap. It relies on a novel, end-to-end, neural-based architecture as well as a novel learning strategy. This strategy improves the domain generalization abilities of the network through multi-task learning and aggressive data augmentation policies, thereby forcing the network to learn domain-invariant features. We demonstrate that our method effectively closes the domain gap, achieving state-of-the-art accuracy on the widespread SPEED+ dataset. Finally, ablation studies assess the impact of key components of our method on its generalization abilities.

What problem does this paper attempt to address?

This paper addresses 6D pose estimation of in-orbit spacecraft: estimating the position and orientation of a target spacecraft relative to the servicing spacecraft from a monocular image. This is a key capability for future autonomous rendezvous and proximity operations (RPOs).

### Main Challenges

1. **Domain Gap Problem**: Because large sets of real images are difficult to obtain, existing pose estimation networks are trained only on synthetic images. These synthetic images do not capture the illumination conditions in orbit, so the networks generalize poorly to real images.
2. **Complex Illumination Conditions**: Illumination in orbit differs from illumination on Earth. There is no atmospheric diffusion, which produces high image contrast, and spacecraft surface materials are highly reflective, which makes the problem worse.
3. **Symmetry and Subtle Features**: Spacecraft are usually approximately symmetric, so accurate pose information has to be recovered from subtle features.

### Solutions

To overcome these challenges, the paper proposes a method that narrows the domain gap through the following means:

1. **Multi-task Learning**: Segmentation is introduced as an auxiliary task to improve the generalization ability of the Key Point Localization Network (KPN).
2. **Aggressive Data Augmentation Strategy**: A variety of data augmentation techniques (such as Gaussian noise, brightness-contrast adjustment, Hide&Seek, exposure augmentation, and texture augmentation) make the training data more diverse, which improves the robustness of the model to different illumination conditions and textures.
3. **End-to-End Neural Architecture**: A new, fully neural-network-based pose estimation architecture is designed, consisting of the Key Point Localization Network (KPN) and the Pose Estimation Model (PEM). The KPN predicts the coordinates of predefined key points, and the PEM predicts the 6D pose from these key points.

### Experimental Verification

The authors conducted experiments on the SPEED+ dataset, which contains synthetic images and Hardware-in-the-Loop (HIL) images. The results show that the method significantly improves pose estimation accuracy without training on real images, and that it reaches state-of-the-art accuracy on multiple test sets.

### Formula Summary

- **Calculation of Key Point Coordinates**:

\[ x_k^N=\frac{W}{4}\sum_{i = 1}^{W/4}\sum_{j = 1}^{H/4}h_k^N(i,j)\,x(i,j) \]

\[ y_k^N=\frac{H}{4}\sum_{i = 1}^{W/4}\sum_{j = 1}^{H/4}h_k^N(i,j)\,y(i,j) \]

- **Key Point Coordinates in Full-Resolution Images**:

\[ (x_k,y_k)=(x_0 + x_k^N w,\; y_0 + y_k^N h) \]

- **Loss Function**:

\[ L_{K\text{pts}}=\frac{1}{K}\sum_{k = 1}^K\sqrt{(x_k^N-\hat{x}_k^N)^2+(y_k^N-\hat{y}_k^N)^2} \]

\[ L_{PEM}=\frac{\|t-\hat{t}\|_2}{\|\hat{t}\|_2}+\|q_{6D}-\hat{q}_{6D}\|_1 \]

\[ L_{KPN}=\beta_{K\text{pts}}L_{K\text{pts}}+\beta_{\text{Multi}}(L_{SS}+L_{FS}) \]

With these techniques, the paper narrows the domain gap in 6D pose estimation of in-orbit spacecraft and reports state-of-the-art accuracy on the SPEED+ dataset.
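
The formulas above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each heatmap is already normalized to sum to 1, that the coordinate maps x(i,j) and y(i,j) are pixel centres scaled to [0, 1] (so the scale prefactor in the summary is folded into the coordinates), and that (x_0, y_0, w, h) describes a crop box in the full-resolution image. The function names are hypothetical, and the summary does not state which of the hatted and unhatted symbols is the prediction, so the arguments simply mirror the symbols.

```python
import math


def soft_argmax(heatmap):
    """Expected (x, y) under a heatmap, in normalized [0, 1] coordinates.

    Assumption: heatmap[j][i] (row j, column i) sums to 1 and each cell is
    represented by its centre, (i + 0.5) / cols and (j + 0.5) / rows.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    x = y = 0.0
    for j, row in enumerate(heatmap):
        for i, weight in enumerate(row):
            x += weight * (i + 0.5) / cols
            y += weight * (j + 0.5) / rows
    return x, y


def to_full_resolution(keypoint, box):
    """Map a normalized keypoint to full-resolution pixels.

    box = (x0, y0, w, h) is assumed to be the crop origin and size.
    """
    x0, y0, w, h = box
    return x0 + keypoint[0] * w, y0 + keypoint[1] * h


def keypoint_loss(points, points_hat):
    """L_Kpts: mean Euclidean distance over the K keypoints."""
    distances = [
        math.hypot(x - xh, y - yh)
        for (x, y), (xh, yh) in zip(points, points_hat)
    ]
    return sum(distances) / len(distances)


def pem_loss(t, t_hat, q6d, q6d_hat):
    """L_PEM: relative translation error plus L1 error on the 6D rotation."""
    diff_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(t, t_hat)))
    ref_norm = math.sqrt(sum(b * b for b in t_hat))
    rotation_l1 = sum(abs(a - b) for a, b in zip(q6d, q6d_hat))
    return diff_norm / ref_norm + rotation_l1


# Toy example: all heatmap mass in the top-right cell of a 2x2 grid.
kp = soft_argmax([[0.0, 1.0], [0.0, 0.0]])
kp_full = to_full_resolution(kp, (100, 50, 200, 400))
```

In this toy example the keypoint lands at the centre of the top-right cell, and the crop box rescales it to pixel coordinates. Because the coordinate is an expectation over the heatmap rather than a hard argmax, it is differentiable with respect to the heatmap values, which is one common reason to use this kind of formulation in an end-to-end network.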

Domain Generalization for In-Orbit 6D Pose Estimation
