Autofocusing for Synthetic Aperture Imaging Based on Pedestrian Trajectory Prediction

Zhao Pei, Jiaqing Zhang, Wenwen Zhang, Miao Wang, Jianing Wang, Yee-Hong Yang
DOI: https://doi.org/10.1109/tcsvt.2023.3314895
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Occlusions and complex backgrounds are common factors that hinder many computer vision applications. In a street scene, the challenge of accurately predicting pedestrian trajectories comes from the complexity of human behavior and the diversity of the external environment. It is difficult, if not impossible, to extract the information needed to accurately predict pedestrian trajectories in dynamic scenes. Synthetic aperture imaging (SAI) uses an array of cameras to mimic a camera with a large virtual convex lens by projecting images of a scene from different views onto a virtual focal plane. It is commonly used to reconstruct occluded objects and, in a street scene, can reveal pedestrians occluded by other objects and pedestrians. In this paper, we propose a joint prediction method based on autofocusing of SAI to predict pedestrian trajectories in dynamic scenes. The main contributions of this paper include: 1) The task of pedestrian trajectory prediction in dynamic scenarios is redefined as joint pedestrian trajectory prediction and SAI autofocusing, a practical but more challenging formulation. 2) The proposed method builds on an existing SAI-based method to extract information from heavily occluded views, obtaining more accurate results at lower computational cost and without other sensors such as LiDAR or depth cameras. 3) A new pedestrian trajectory prediction model, an attention-based trajectory prediction variational autoencoder (ATP-VAE), is proposed to extract complex human behavior and social interactions in dynamic scenes through a new Intention Attention Unit. Experimental results on multiple public datasets show that the proposed method achieves state-of-the-art results in both first-person and aerial views.
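The projection of multiple views onto a virtual focal plane can be illustrated with the generic shift-and-average form of synthetic aperture refocusing. The sketch below is not the paper's autofocusing method: it assumes a horizontal camera array, grayscale images, and integer pixel shifts, and the names `refocus`, `offsets`, and `disparity` are hypothetical choices made for illustration.

```python
import numpy as np

def refocus(views, offsets, disparity):
    """Shift-and-average refocusing for a horizontal camera array (illustrative).

    views:     (N, H, W) array of grayscale images, one per camera.
    offsets:   N camera positions along the baseline, relative to the
               reference camera (assumed units: one baseline step).
    disparity: pixel shift per baseline step for the chosen focal plane.
    """
    views = np.asarray(views, dtype=float)
    acc = np.zeros_like(views[0])
    for img, off in zip(views, offsets):
        # Shift each view so that points on the focal plane line up.
        # np.roll wraps around at the border, which is acceptable for a toy case.
        shift = int(round(off * disparity))
        acc += np.roll(img, shift, axis=1)
    # Averaging keeps aligned (in-focus) content and blurs everything else.
    return acc / len(views)

# Toy scene: three cameras see a one-pixel-wide vertical target whose
# column moves by `true_disparity` pixels per baseline step.
offsets = [-1, 0, 1]
true_disparity = 2
views = np.zeros((3, 4, 20))
for i, off in enumerate(offsets):
    views[i, :, 10 - off * true_disparity] = 1.0

focused = refocus(views, offsets, true_disparity)  # target aligned at column 10
defocused = refocus(views, offsets, 0)             # target smeared over 3 columns
```

With the correct disparity the target adds up coherently at one column, while a wrong focal plane spreads it across several columns at a third of the intensity. Occluders at other depths are blurred in the same way, which is why a well-chosen focal plane can reveal an occluded pedestrian.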