Predicting Human Saccadic Scanpaths Based on Iterative Representation Learning
Chen Xia, Junwei Han, Fei Qi, Guangming Shi
DOI: https://doi.org/10.1109/tip.2019.2897966
IF: 10.6
2019-01-01
IEEE Transactions on Image Processing
Abstract: Visual attention is a dynamic process of scene exploration and information acquisition. However, existing research on attention modeling has concentrated on estimating static salient locations, and the dynamic attributes exhibited by saccades have not been well explored in previous attention models. In this paper, we address the problem of saccadic scanpath prediction by introducing an iterative representation learning framework. Within the framework, a saccade can be interpreted as an iterative process of predicting one fixation according to the current representation and then updating the representation based on the gaze shift. In the predicting phase, we propose a Bayesian definition of a saccade that combines the influence of perceptual residual and spatial location on the selection of fixations. In implementation, we compute the representation error of an autoencoder-based network to measure the perceptual residual of each area. Simultaneously, we integrate saccade amplitude and a center-weighted mechanism to model the influence of spatial location. Based on the estimated influence of these two parts, the final fixation is defined as the point with the largest posterior probability of gaze shift. In the updating phase, we update the representation pattern for the subsequent calculation by retraining the network with samples extracted around the current fixation. In the experiments, the proposed model replicates fundamental psychophysical properties of visual search. In addition, it achieves superior performance on several benchmark eye-tracking data sets.
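The predict-and-update loop described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: a linear PCA basis stands in for the autoencoder-based network, both spatial terms are modeled as isotropic Gaussians, training samples are accumulated across fixations rather than replaced, and all function names and parameter values (patch size, radius, sigmas) are hypothetical. The posterior is left unnormalized, since only its maximum is needed.

```python
import numpy as np


def extract_patches(image, size=8, stride=8):
    """Return flattened patches and their center coordinates (row, col)."""
    h, w = image.shape
    patches, centers = [], []
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            patches.append(image[r:r + size, c:c + size].ravel())
            centers.append((r + size / 2.0, c + size / 2.0))
    return np.asarray(patches, dtype=float), np.asarray(centers, dtype=float)


def fit_basis(samples, k=4):
    """ASSUMPTION: PCA (mean + top-k directions) replaces the autoencoder."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]


def perceptual_residual(patches, mean, basis):
    """Squared reconstruction error of each patch under the current basis."""
    centered = patches - mean
    recon = centered @ basis.T @ basis
    return np.sum((centered - recon) ** 2, axis=1)


def spatial_prior(centers, fixation, image_center, amp_sigma, center_sigma):
    """ASSUMPTION: Gaussian saccade-amplitude term times Gaussian center bias."""
    d_fix = np.sum((centers - fixation) ** 2, axis=1)
    d_ctr = np.sum((centers - image_center) ** 2, axis=1)
    amplitude = np.exp(-d_fix / (2.0 * amp_sigma ** 2))
    center = np.exp(-d_ctr / (2.0 * center_sigma ** 2))
    return amplitude * center


def predict_scanpath(image, n_fixations=5, radius=24.0,
                     amp_sigma=40.0, center_sigma=60.0):
    """Alternate between updating the representation and predicting a fixation."""
    patches, centers = extract_patches(image)
    image_center = np.array(image.shape, dtype=float) / 2.0
    fixation = image_center.copy()  # start at the image center
    seen = np.zeros(len(patches), dtype=bool)
    scanpath = []
    for _ in range(n_fixations):
        # Updating phase: refit on patches sampled around the current fixation
        # (accumulated with earlier samples, which is an assumption here).
        near = np.sum((centers - fixation) ** 2, axis=1) <= radius ** 2
        seen |= near
        mean, basis = fit_basis(patches[seen])
        # Predicting phase: unnormalized posterior = residual * spatial prior.
        residual = perceptual_residual(patches, mean, basis)
        prior = spatial_prior(centers, fixation, image_center,
                              amp_sigma, center_sigma)
        posterior = residual * prior
        fixation = centers[int(np.argmax(posterior))]
        scanpath.append((float(fixation[0]), float(fixation[1])))
    return scanpath
```

Calling predict_scanpath on a 2-D grayscale array returns a list of (row, col) fixation centers. Because already-seen regions are reconstructed well, their residual tends to drop, which gives this sketch a rough analogue of moving on to unexplored areas; how closely that matches the paper's behavior is not something the abstract lets us verify.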