Learning Interaction Regions and Motion Trajectories Simultaneously from Egocentric Demonstration Videos

Xin Jianjia, Wang Lichun, Xu Kai, Yang Chao, Yin Baocai
DOI: https://doi.org/10.1109/lra.2023.3301307
2024-01-01
Abstract: Learning to interact with objects is important for robots that must integrate into human environments. When the interaction semantics are well defined, manually guiding the manipulator is a common way to teach a robot how to interact with an object. However, the learned results are robot-dependent: mechanical parameters differ between robots, so the learning process must be repeated for each new robot. Moreover, during manual guiding, the operator is responsible for recognizing the region being contacted and for providing expert motion programming, which limits the robot's intelligence. To raise the level of automation in robot-object interaction, this letter proposes IRMT-Net (Interaction Region and Motion Trajectory prediction Network), which predicts the interaction region and the motion trajectory simultaneously from images. IRMT-Net achieves state-of-the-art interaction region prediction results on the EPIC-KITCHENS dataset, generates reasonable motion trajectories, and can support robot interaction in real-world settings.