Where Are They Going? Predicting Human Behaviors in Crowded Scenes

Bo Zhang, Rui Zhang, Niccolo Bisagno, Nicola Conci, Francesco G. B. De Natale, Hongbo Liu
DOI: https://doi.org/10.1145/3449359
IF: 4.094
2021-01-01
ACM Transactions on Multimedia Computing Communications and Applications
Abstract: In this article, we propose a framework for crowd behavior prediction in complicated scenarios. The framework follows the standard encoder-decoder scheme and is built upon long short-term memory (LSTM) modules to capture the temporal evolution of crowd behaviors. To model interactions among humans and their environment, we embed both social and physical attention mechanisms into the LSTM. The social attention component models the interactions among different pedestrians, whereas the physical attention component helps to understand the spatial configuration of the scene. Since pedestrians’ behaviors are multi-modal, we use a generative model to produce multiple acceptable future paths. The proposed framework not only predicts an individual’s trajectory accurately but also forecasts ongoing group behaviors by leveraging the coherent filtering approach. Experiments on standard crowd benchmarks (namely, the ETH, the UCY, the CUHK crowd, and the CrowdFlow datasets) demonstrate that the proposed framework is effective in forecasting crowd behaviors in complex scenarios.