Abstract: In this paper, we study imitation learning under the challenging setting of: (1) only a single demonstration, (2) no further data collection, and (3) no prior task or object knowledge. We show how, with these constraints, imitation learning can be formulated as a combination of trajectory transfer and unseen object pose estimation. To explore this idea, we provide an in-depth study on how state-of-the-art unseen object pose estimators perform for one-shot imitation learning on ten real-world tasks, and we take a deep dive into the effects that camera calibration, pose estimation error, and spatial generalisation have on task success rates. For videos, please visit https://www.robot-learning.uk/pose-estimation-perspective.

What problem does this paper attempt to address?

This paper studies one-shot imitation learning for robot manipulation under three challenging conditions:

1. **Only one demonstration is provided**: The robot must learn the task from a single demonstration.
2. **No further data collection**: After the initial demonstration, no additional data is collected and no further environment interaction takes place.
3. **No prior knowledge of tasks or objects**: The robot has no prior knowledge of the task or of the objects being manipulated.

Specifically, the paper formulates one-shot imitation learning as a combination of trajectory transfer and unseen object pose estimation. The authors explore the following questions through a series of experiments:

#### Sections 4.1 and 4.2: The influence of pose estimation error on task success rate

- Study how camera calibration error and pose estimation error affect the task success rate.
- Show experimentally that pose estimation error has a greater impact on the task success rate than camera calibration error, and that rotation error is more critical than position error.

#### Section 4.3: Benchmarking real-world tasks

- Evaluate the performance of different one-shot unseen object pose estimation methods on ten real-world everyday robot tasks.
- Compare these methods with existing state-of-the-art one-shot imitation learning methods (such as DOME).

#### Section 4.4: Spatial generalization ability

- Explore how robust trajectory transfer is, and how well it generalizes, when the object pose at test time differs from the pose in the demonstration.
- Analyze the task success rate at different object positions and orientations, and show how changes in object pose affect it.

### Key conclusions

1. **The importance of pose estimation**: An accurate pose estimation method is crucial for one-shot imitation learning, and rotation error has a particularly strong influence.
2. **Comparison of multiple methods**: Eight different pose estimation methods are compared in simulation and real-world experiments, and the regression-based method is found to perform best.
3. **Spatial generalization ability**: The task success rate decreases as the object pose moves away from the demonstrated pose, but some methods (such as regression) remain robust.

### Formula summary

- **Trajectory transfer formula**:
  \[ T_{Test}^{RE_t} = T_{Test}^{RO} T_{Demo}^{OE_t} \]
  where \(T_{Test}^{RO}\) is the transformation matrix of the object relative to the robot at test time, and \(T_{Demo}^{OE_t}\) is the transformation matrix of the end-effector relative to the object at timestep \(t\) of the demonstration.
- **Relative pose transformation**:
  \[ R_{\delta}^R = T_{Test}^{RO} T_{Demo}^{OR} \]
  where \(T_{Demo}^{OR} = (T_{Demo}^{RO})^{-1}\). It represents the transformation of the object from the demonstration to the test scene, expressed in the robot coordinate system \(R\).
- **Relative pose transformation in the camera coordinate system**:
  \[ C_{\delta}^C = T_{Test}^{CO} (T_{Demo}^{CO})^{-1} \]
  It represents the transformation of the object from the demonstration to the test scene, expressed in the camera coordinate system \(C\).

Through these studies, the paper provides new insights into one-shot imitation learning and shows its potential in real-world tasks.
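The trajectory transfer formula can be illustrated with a small NumPy sketch. This is not the authors' implementation: the helper names (`make_pose`, `transfer_trajectory`), all numeric poses, and the camera extrinsics `T_RC` are hypothetical values chosen for illustration. The last step additionally assumes a static camera with known extrinsics, so that the camera-frame relative pose can be mapped into the robot frame by conjugation.

```python
import numpy as np


def make_pose(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a yaw angle (about z) and a translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T


def transfer_trajectory(T_test_RO, T_demo_RO, demo_traj_RE):
    """Apply T_Test^{RE_t} = T_Test^{RO} @ T_Demo^{OE_t} to every demo waypoint.

    T_Demo^{OE_t} is recovered as inv(T_Demo^{RO}) @ T_Demo^{RE_t}.
    """
    T_demo_OR = np.linalg.inv(T_demo_RO)
    return [T_test_RO @ T_demo_OR @ T_RE for T_RE in demo_traj_RE]


# Hypothetical object poses in the robot frame (demo and test).
T_demo_RO = make_pose(0.0, [0.5, 0.0, 0.1])
T_test_RO = make_pose(np.pi / 2, [0.4, 0.2, 0.1])

# Hypothetical demonstrated end-effector waypoints in the robot frame.
demo_traj_RE = [
    make_pose(0.0, [0.5, 0.0, 0.3]),
    make_pose(0.0, [0.5, 0.0, 0.15]),
]

test_traj_RE = transfer_trajectory(T_test_RO, T_demo_RO, demo_traj_RE)

# Equivalent form: relative object transform in the robot frame,
# applied directly to the demonstrated end-effector poses.
delta_R = T_test_RO @ np.linalg.inv(T_demo_RO)
alt_traj_RE = [delta_R @ T_RE for T_RE in demo_traj_RE]

# Assumed static camera with extrinsics T_RC (camera pose in the robot frame).
T_RC = make_pose(np.pi, [1.0, 0.0, 0.5])
T_demo_CO = np.linalg.inv(T_RC) @ T_demo_RO
T_test_CO = np.linalg.inv(T_RC) @ T_test_RO

# Relative object transform in the camera frame, then mapped to the robot frame.
delta_C = T_test_CO @ np.linalg.inv(T_demo_CO)
delta_R_from_camera = T_RC @ delta_C @ np.linalg.inv(T_RC)

print(np.round(test_traj_RE[0][:3, 3], 3))
```

In this sketch the end-effector keeps the same pose relative to the object in the demo and at test time, which is the property trajectory transfer relies on. Any error in the estimated object pose or in `T_RC` propagates directly into the transferred waypoints.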

One-Shot Imitation Learning: A Pose Estimation Perspective

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Motion Imitation of a Humanoid Robot Via Pose Estimation

One-Shot Imitation Learning with Invariance Matching for Robotic Manipulation

Learning One-Shot Imitation From Humans Without Humans

One-Shot Visual Imitation Learning via Meta-Learning

Transformers for One-Shot Visual Imitation

You Only Look at One: Category-Level Object Representations for Pose Estimation From a Single Example

One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks

One-shot Imitation Learning via Interaction Warping

Object Detection-Based One-Shot Imitation Learning with an RGB-D Camera

One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill

OnePose: One-Shot Object Pose Estimation Without CAD Models

One-Shot Robust Imitation Learning for Long-Horizon Visuomotor Tasks from Unsegmented Demonstrations

One-Shot Imitation under Mismatched Execution

Humanoid Robot Imitation with Pose Similarity Metric Learning

One-Shot Domain-Adaptive Imitation Learning via Progressive Learning

Zero-Shot 3D Pose Estimation of Unseen Object by Two-Step RGB-D Fusion

PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching

Few-Shot In-Context Imitation Learning via Implicit Graph Alignment

Demonstrate Once, Imitate Immediately (DOME): Learning Visual Servoing for One-Shot Imitation Learning