Abstract: Immersive VR telepresence ideally means being able to interact and communicate with digital avatars that are indistinguishable from their real counterparts and precisely reflect their behaviour. The core technical challenge is twofold: creating a digital double that faithfully reflects the real human, and tracking the real human solely from egocentric sensing devices that are lightweight and have low energy consumption, e.g. a single RGB camera. To date, no unified solution to this problem exists, as recent works focus solely on egocentric motion capture, model only the head, or build avatars from multi-view captures. In this work, we propose, for the first time in the literature, a person-specific egocentric telepresence approach that jointly models the photoreal digital avatar and drives it from a single egocentric video. We first present a character model that is animatable, i.e. can be driven solely by skeletal motion, while being capable of modeling geometry and appearance. Then, we introduce a personalized egocentric motion capture component, which recovers full-body motion from an egocentric video. Finally, we apply the recovered pose to our character model and perform a test-time mesh refinement such that the geometry faithfully projects onto the egocentric view. To validate our design choices, we propose a new and challenging benchmark, which provides paired egocentric and dense multi-view videos of real humans performing various motions. Our experiments demonstrate a clear step towards egocentric and photoreal telepresence, as our method outperforms baselines as well as competing methods. For more details, code, and data, we refer to our project page.

EgoAvatar: Egocentric View-Driven and Photorealistic Full-body Avatars
