Improving Driver Gaze Prediction With Reinforced Attention

Kai Lv, Hao Sheng, Zhang Xiong, Wei Li, Liang Zheng
DOI: https://doi.org/10.1109/tmm.2020.3038311
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: We consider the task of driver gaze prediction: estimating where a driver's focus should be, based on a raw video of the outside environment. In practice, we output a probability map that gives the normalized probability of each point in a given scene being the object of the driver's attention. Most existing methods (e.g., Coarse-to-Fine and Multi-branch) take an image or a video as input and directly output the fixation map. While successful, these methods often produce highly scattered predictions, rendering them unreliable for real-world usage. Motivated by this observation, we propose the reinforced attention (RA) model as a regulatory mechanism to increase prediction density. Our method is built directly on top of existing methods, making it complementary to current approaches. Specifically, we first use Multi-branch to obtain an initial fixation map. Then, RA is trained using deep reinforcement learning to learn a location prediction policy, producing a reinforced attention. Finally, to obtain the final gaze prediction result, we combine the fixation map and the reinforced attention by a mask-guided multiplication. Experimental results show that our framework improves the accuracy of gaze prediction and provides state-of-the-art performance on the DR(eye)VE dataset.
computer science, information systems, telecommunications, software engineering
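
The final step in the abstract, combining the fixation map with the reinforced attention by a mask-guided multiplication, can be illustrated with a minimal sketch. The abstract does not specify the mask, so everything here is an assumption made for illustration: the attended location is taken to be a single (row, column) point, the mask is a Gaussian bump with a nonzero floor, and the names `gaussian_mask`, `combine`, `sigma`, and `floor` are hypothetical rather than the authors' implementation.

```python
import math


def normalize(grid):
    """Scale a non-negative 2D grid so that its entries sum to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("map must have positive total mass")
    return [[v / total for v in row] for row in grid]


def gaussian_mask(height, width, center, sigma, floor):
    """Hypothetical soft mask: 1.0 at `center`, decaying toward `floor`.

    The Gaussian shape and the floor value are assumptions; the paper's
    actual mask is not described in the abstract.
    """
    cy, cx = center
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d2 = (y - cy) ** 2 + (x - cx) ** 2
            g = math.exp(-d2 / (2.0 * sigma ** 2))
            row.append(floor + (1.0 - floor) * g)
        mask.append(row)
    return mask


def combine(fixation_map, center, sigma=1.0, floor=0.1):
    """Multiply the fixation map by the mask elementwise, then renormalize."""
    height, width = len(fixation_map), len(fixation_map[0])
    mask = gaussian_mask(height, width, center, sigma, floor)
    product = [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(fixation_map, mask)
    ]
    return normalize(product)


# A maximally scattered (uniform) 4x4 fixation map, and an assumed
# attended location at row 1, column 2.
scattered = normalize([[1.0] * 4 for _ in range(4)])
final_map = combine(scattered, center=(1, 2))
```

Under these assumptions the result is still a normalized probability map, but its mass is concentrated around the attended location instead of spread evenly, which is the kind of density increase the abstract attributes to RA.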