Abstract: Human eye gaze plays a significant role in many virtual and augmented reality (VR/AR) applications, such as gaze-contingent rendering, gaze-based interaction, and eye-based activity recognition. However, prior work on gaze analysis and prediction has only explored eye-head coordination and has been limited to human-object interactions. We first report a comprehensive analysis of eye-body coordination in various human-object and human-human interaction activities, based on four public datasets collected in real-world (MoGaze), VR (ADT), and AR (GIMO and EgoBody) environments. We show that in human-object interactions, e.g. pick and place, eye gaze exhibits strong correlations with full-body motion, while in human-human interactions, e.g. chat and teach, a person's gaze direction is correlated with the body orientation towards the interaction partner. Informed by these analyses, we then present Pose2Gaze, a novel eye-body coordination model that uses a convolutional neural network and a spatio-temporal graph convolutional neural network to extract features from head direction and full-body poses, respectively, and then uses a convolutional neural network to predict eye gaze. We compare our method with state-of-the-art methods that predict eye gaze only from head movements and show that Pose2Gaze outperforms these baselines in mean angular error, with an average improvement of 24.0% on MoGaze, 10.1% on ADT, 21.3% on GIMO, and 28.6% on EgoBody. We also show that our method significantly outperforms prior methods in the sample downstream task of eye-based activity recognition. These results underline the significant information content available in eye-body coordination during daily activities and open up a new direction for gaze prediction.
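
The mean angular error used as the evaluation metric above can be illustrated with a minimal sketch. It assumes that gaze is represented as a 3D direction vector and that the error is the angle between the predicted and ground-truth directions, averaged over samples. The abstract does not spell out the exact evaluation protocol, so the function names and the vector representation here are illustrative assumptions, not the authors' implementation.

```python
import math


def angular_error_deg(pred, truth):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: gaze directions are given as (x, y, z) tuples; they do
    not need to be unit length because both are normalised here.
    """
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_truth = math.sqrt(sum(t * t for t in truth))
    if norm_pred == 0.0 or norm_truth == 0.0:
        raise ValueError("gaze direction vectors must be non-zero")
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_pred * norm_truth)))
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(preds, truths):
    """Average angular error over paired predictions and ground truth."""
    if len(preds) != len(truths) or len(preds) == 0:
        raise ValueError("need equally sized, non-empty sequences")
    total = sum(angular_error_deg(p, t) for p, t in zip(preds, truths))
    return total / len(preds)
```

A lower value means the predicted gaze direction lies closer to the ground truth; identical directions give an error of zero and perpendicular directions give 90 degrees.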

Active Gaze Labeling: Visualization for Trust Building

GazeLabel

Gaze2AOI: Open Source Deep-learning Based System for Automatic Area of Interest Annotation with Eye Tracking Data

Quantifying Gaze Behavior During Real-World Interactions Using Automated Object, Face, and Fixation Detection

Gaze-directed Visual Grounding under Object Referring Uncertainty

A guideline for integrating dynamic areas of interests in existing set-up for capturing eye movement: Looking at moving aircraft

CasualGaze: Towards Modeling and Recognizing Casual Gaze Behavior for Efficient Gaze-based Object Selection

Pose2Gaze: Eye-body Coordination during Daily Activities for Gaze Prediction from Full-body Poses

Gaze Target Estimation inspired by Interactive Attention

Automatic Visual Attention Detection for Mobile Eye Tracking Using Pre-Trained Computer Vision Models and Human Gaze

An Accuracy Enhanced Vision Language Grounding Method Fused with Gaze Intention

Integrating Human Gaze into Attention for Egocentric Activity Recognition

A Ground-Truth Data Set and a Classification Algorithm for Eye Movements in 360-degree Videos

Automatic Gaze Analysis: A Survey of Deep Learning based Approaches

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting

Deep neural networks for collaborative learning analytics: Evaluating team collaborations using student gaze point prediction

Active Gaze Control for Foveal Scene Exploration

Labeling Out-of-View Objects in Immersive Analytics to Support Situated Visual Searching

Eyes on the Game: Deciphering Implicit Human Signals to Infer Human Proficiency, Trust, and Intent

End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context

Gaze Pattern Genius: Gaze-Driven VR Interaction Using Unsupervised Domain Adaption