Highly efficient gaze estimation method using online convolutional re-parameterization

De Gu, Minghao Lv, Jianchu Liu, Mari Anne Snow
DOI: https://doi.org/10.1007/s11042-024-18941-2
IF: 2.577
2024-03-26
Multimedia Tools and Applications
Abstract: Existing gaze estimation methods with multi-branch structures significantly improve accuracy, but at the cost of extra training overhead and slow inference. In this paper, we propose a hybrid model that combines online re-parameterization structures with improved transformer encoders for precise and efficient gaze estimation, significantly reducing training requirements while accelerating inference. Our multi-branch model employs online re-parameterization structures to extract multi-scale gaze-related features and can be equivalently transformed into a single-branch model for both training and inference, which yields significant savings in cost. Moreover, we employ transformer encoders to enhance the global correlation of gaze-related features. Removing the conventional position embeddings, which slow encoder inference, degrades performance; to offset this, we substitute zero-padding position embeddings, which let the encoders learn absolute position information without introducing additional inference cost. Our experimental results demonstrate that the proposed model achieves improved performance on multiple datasets while reducing training time by 57% and memory usage by 36%, and accelerating inference by 26%.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
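The equivalence between a multi-branch and a single-branch model mentioned in the abstract rests on the linearity of convolution. The following is a minimal sketch of that general identity, not the authors' architecture: it assumes one single-channel 3x3 branch and one parallel 1x1 branch with no normalization layers, and the names conv2d_same and merge_branches are hypothetical helpers introduced only for illustration.

```python
import random


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding ('same' output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    # Positions outside the image count as zeros (zero padding).
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out


def merge_branches(k3, k1):
    """Fold a 1x1 kernel into a 3x3 kernel by adding it at the centre tap."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1[0][0]
    return merged


random.seed(0)
image = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(6)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1 = [[random.uniform(-1, 1)]]

# Multi-branch form: run both branches and sum their outputs.
y3 = conv2d_same(image, k3)
y1 = conv2d_same(image, k1)
multi_branch = [[a + b for a, b in zip(r3, r1)] for r3, r1 in zip(y3, y1)]

# Single-branch form: one convolution with the merged kernel.
single_branch = conv2d_same(image, merge_branches(k3, k1))

# The two outputs agree up to floating-point rounding.
max_diff = max(
    abs(a - b)
    for row_m, row_s in zip(multi_branch, single_branch)
    for a, b in zip(row_m, row_s)
)
print(max_diff < 1e-9)
```

The single-branch form needs one convolution instead of two, which is where the savings come from in this toy setting. How the paper's online variant handles scaling or normalization inside the branches is not stated in the abstract and is not modelled here.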