ELF-UA: Efficient Label-Free User Adaptation in Gaze Estimation

Yong Wu, Yang Wang, Sanqing Qu, Zhijun Li, Guang Chen
2024-06-13
Abstract: We consider the problem of user-adaptive 3D gaze estimation. The performance of person-independent gaze estimation is limited due to interpersonal anatomical differences. Our goal is to provide a personalized gaze estimation model specifically adapted to a target user. Previous work on user-adaptive gaze estimation requires some labeled images of the target person data to fine-tune the model at test time. However, this can be unrealistic in real-world applications, since it is cumbersome for an end-user to provide labeled images. In addition, previous work requires the training data to have both gaze labels and person IDs. This data requirement makes it infeasible to use some of the available data. To tackle these challenges, this paper proposes a new problem called efficient label-free user adaptation in gaze estimation. Our model only needs a few unlabeled images of a target user for the model adaptation. During offline training, we have some labeled source data without person IDs and some unlabeled person-specific data. Our proposed method uses a meta-learning approach to learn how to adapt to a new user with only a few unlabeled images. Our key technical innovation is to use a generalization bound from domain adaptation to define the loss function in meta-learning, so that our method can effectively make use of both the labeled source data and the unlabeled person-specific data during training. Extensive experiments validate the effectiveness of our method on several challenging benchmarks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the labeled-data requirements of user-adaptive 3D gaze estimation. Existing methods need a small number of annotated images of the target user to fine-tune the model at test time. This is unrealistic in practice, because obtaining annotations (especially 3D gaze labels) often requires professional equipment and is very difficult for ordinary users. Existing methods also require training data that contains both gaze labels and user IDs, which limits the range of usable data.

To address these problems, the paper proposes a new problem setting called **ELF-UA: Efficient Label-Free User Adaptation in Gaze Estimation**. Its main contributions are as follows:

1. **New problem setting**: Unlike existing methods, the proposed method needs only a small number of unannotated images of the target user for model adaptation. Training requires only a source dataset with gaze labels but no user IDs, plus some person-specific datasets that have user IDs but no gaze labels.
2. **Innovative method**: The paper introduces a surrogate loss function, derived from a domain adaptation generalization bound, for the outer-loop optimization of the MAML framework. This lets the model adapt to a new user with only a small number of unannotated images.
3. **Experimental verification**: Extensive experiments on multiple challenging benchmark datasets demonstrate the effectiveness of the method. It significantly outperforms alternative methods and achieves performance comparable to current state-of-the-art methods.

### Detailed problem description

#### Limitations of existing methods

- **Label requirements**: Existing methods such as [Park et al., 2019] require a small number of annotated images of the target user for fine-tuning, which is impractical in real applications.
- **Data requirements**: These methods also require training data that contains both gaze labels and user IDs, which limits the range of data that can be used.

#### Characteristics of the new problem setting

- **Unlabeled adaptation**: The method needs only a small number of unannotated images of the target user for model adaptation.
- **Flexible data use**: Training requires only a source dataset with gaze labels but no user IDs, plus some person-specific datasets that have user IDs but no gaze labels.

#### Innovation points of the method

- **Meta-learning framework**: Building on model-agnostic meta-learning (MAML), the paper proposes a new meta-learning framework for learning self-supervised user-adaptive gaze estimation.
- **Surrogate loss function**: A surrogate loss function based on a domain adaptation generalization bound is used for the outer-loop optimization of the MAML framework, so that the model can use unannotated data for rapid adaptation.

Through these innovations, the ELF-UA method can adapt the model to new users with only a small number of unannotated images, addressing the limitations of existing methods in practical applications.
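
To make the surrogate loss idea more concrete, here is a minimal sketch of how such an outer-loop objective could be composed. Classic domain adaptation bounds limit the error on a target domain by the error on the labeled source domain plus a divergence term between the two domains. The summary above does not give the paper's exact loss, so everything below is an assumption for illustration: the function names, the mean absolute error on source labels, and the use of a squared distance between mean feature vectors as a simple stand-in for the divergence term. It needs labels only on the source side and only unlabeled features on the user side.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(rows: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def source_error(preds: Sequence[Vector], labels: Sequence[Vector]) -> float:
    """Mean absolute error over all components of the labeled source samples."""
    total, count = 0.0, 0
    for pred, label in zip(preds, labels):
        for p, y in zip(pred, label):
            total += abs(p - y)
            count += 1
    return total / count


def mean_feature_gap(source_feats: Sequence[Vector],
                     target_feats: Sequence[Vector]) -> float:
    """Squared Euclidean distance between source and target mean features.

    Hypothetical stand-in for the divergence term of a domain adaptation
    bound; the paper may use a different discrepancy measure.
    """
    mu_s = mean_vector(source_feats)
    mu_t = mean_vector(target_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def surrogate_outer_loss(preds: Sequence[Vector],
                         labels: Sequence[Vector],
                         source_feats: Sequence[Vector],
                         target_feats: Sequence[Vector],
                         gap_weight: float = 1.0) -> float:
    """Source error plus weighted feature gap; uses no target-user labels."""
    gap = mean_feature_gap(source_feats, target_feats)
    return source_error(preds, labels) + gap_weight * gap


# Toy values: two labeled source samples and two unlabeled user samples.
source_preds = [(0.1, 0.2), (0.0, -0.1)]
source_labels = [(0.0, 0.2), (0.1, -0.1)]
source_feats = [(1.0, 0.0), (3.0, 2.0)]
user_feats = [(2.0, 2.0), (4.0, 2.0)]

loss = surrogate_outer_loss(source_preds, source_labels,
                            source_feats, user_feats, gap_weight=0.5)
```

In a MAML-style setup, a quantity of this kind would be evaluated after the inner-loop adaptation on a user's unlabeled images and then minimized with respect to the initial model parameters. That differentiation step is omitted here, since it depends on the model and the automatic differentiation library in use.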