Context-Aware Head-and-Eye Motion Generation with Diffusion Model
Yuxin Shen,Manjie Xu,Wei Liang
DOI: https://doi.org/10.1109/vr58804.2024.00039
2024-01-01
Abstract:In humanity’s ongoing quest to craft natural and realistic avatars within virtual environments, the generation of authentic eye gaze behaviors stands paramount. Eye gaze not only serves as a primary non-verbal communication cue, but it also reflects cognitive processes, intent, and attentiveness, making it a crucial element in ensuring immersive interactions. However, automatically generating these intricate gaze behaviors presents significant challenges. Traditional methods can be both time-consuming and lack the precision to align gaze behaviors with the intricate nuances of the environment in which the avatar resides. To overcome these challenges, we introduce a novel two-stage approach to generate context-aware head-and-eye motions across diverse scenes. By harnessing the capabilities of advanced diffusion models, our approach adeptly produces contextually appropriate eye gaze points, further leading to the generation of natural head-and-eye movements. Utilizing Head-Mounted Display (HMD) eye-tracking technology, we also present a comprehensive dataset, which captures human eye gaze behaviors in tandem with associated scene features. We show that our approach consistently delivers intuitive and lifelike head-and-eye motions and demonstrates superior performance in terms of motion fluidity, alignment with contextual cues, and overall user satisfaction.