Ego+X: an Egocentric Vision System for Global 3D Human Pose Estimation and Social Interaction Characterization

Yuxuan Liu, Jianxin Yang, Xiao Gu, Yao Guo, Guang-Zhong Yang
DOI: https://doi.org/10.1109/iros47612.2022.9981710
2022-01-01
Abstract: Egocentric vision is an emerging topic that has demonstrated great potential in assistive healthcare scenarios, ranging from human-centric behavior analysis to personal social assistance. Within this field, due to the heterogeneity of visual perception from first-person views, egocentric pose estimation is one of the most significant prerequisites for enabling various downstream applications. However, existing methods for egocentric pose estimation mainly focus on predicting the pose in camera coordinates from a single image, which ignores latent cues in the temporal domain and reduces accuracy. In this paper, we propose Ego+X, an egocentric-vision-based system for 3D canonical pose estimation and human-centric social interaction characterization. Our system is composed of two head-mounted egocentric cameras, one facing downwards and the other looking outwards. By leveraging the global context provided by visual SLAM, we first propose Ego-Glo for spatially accurate and temporally consistent egocentric 3D pose estimation in the canonical coordinate system. With the help of the outward-looking egocentric camera, we then propose Ego-Soc by extending Ego-Glo to various social interaction tasks, e.g., object detection and human-human interaction. Quantitative and qualitative experiments have been conducted to demonstrate the effectiveness of the proposed Ego+X.