Predicting Crowd Flows via Pyramid Dilated Deeper Spatial-temporal Network

Congcong Miao, Jiajun Fu, Jilong Wang, Heng Yu, Botao Yao, Anqi Zhong, Jie Chen, Zekun He
DOI: https://doi.org/10.1145/3437963.3441785
2021-01-01
Abstract: Predicting crowd flows is crucial for urban planning, traffic management, and public safety. However, it is not trivial because of three challenges: 1) highly heterogeneous mobility data collected by various services; 2) complex spatio-temporal correlations of crowd flows, including multi-scale spatial correlations along with non-linear temporal correlations; and 3) diversity in long-term temporal patterns. To tackle these challenges, we propose an end-to-end architecture, called pyramid dilated spatial-temporal network (PDSTN), to effectively learn spatial-temporal representations of crowd flows with a novel attention mechanism. Specifically, PDSTN employs the ConvLSTM structure to identify complex features that capture spatial and temporal correlations simultaneously, and then stacks multiple ConvLSTM units for deeper feature extraction. To further improve the spatial learning ability, a pyramid dilated residual network is introduced, which adopts several dilated residual ConvLSTM networks to extract multi-scale spatial information. In addition, a novel attention mechanism, which considers both long-term periodicity and the shift in periodicity, is designed to capture diverse temporal patterns. Extensive experiments were conducted on three highly heterogeneous real-world mobility datasets to illustrate the effectiveness of PDSTN over state-of-the-art methods. Moreover, PDSTN provides an intuitive interpretation of its predictions.
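To make the idea of multi-scale extraction through dilation more concrete, the following is a minimal sketch, not the authors' implementation. It uses a plain 1-D dilated convolution instead of a dilated residual ConvLSTM, and the function names and the dilation rates (1, 2, 4) are assumptions chosen for illustration; the abstract does not state the rates PDSTN uses. The sketch shows only how the same small kernel covers a wider span as the dilation rate grows.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated cross-correlation (no padding, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` positions apart in the input.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


# Hypothetical pyramid of dilation rates (an assumption, not from the paper).
PYRAMID_DILATIONS = (1, 2, 4)


def pyramid_features(signal, kernel, dilations=PYRAMID_DILATIONS):
    """Apply the same kernel at several dilation rates, one output per scale."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    flows = [float(v) for v in range(10)]  # toy 1-D "crowd flow" signal
    for d, feat in pyramid_features(flows, [1.0, 0.0, -1.0]).items():
        print(d, effective_kernel_size(3, d), feat)
```

With a kernel of size 3, the three branches span 3, 5, and 9 input positions respectively, which is the sense in which a pyramid of dilation rates can gather information at several spatial scales without enlarging the kernel.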