Human Trajectory Forecasting with Explainable Behavioral Uncertainty

Jiangbei Yue, Dinesh Manocha, He Wang
2023-07-05
Abstract: Human trajectory forecasting helps to understand and predict human behaviors, enabling applications from social robots to self-driving cars, and therefore has been heavily investigated. Most existing methods can be divided into model-free and model-based methods. Model-free methods offer superior prediction accuracy but lack explainability, while model-based methods provide explainability but cannot predict well. Combining both methodologies, we propose a new Bayesian Neural Stochastic Differential Equation model BNSP-SFM, where a behavior SDE model is combined with Bayesian neural networks (BNNs). While the NNs provide superior predictive power, the SDE offers strong explainability with quantifiable uncertainty in behavior and observation. We show that BNSP-SFM achieves up to a 50% improvement in prediction accuracy, compared with 11 state-of-the-art methods. BNSP-SFM also generalizes better to drastically different scenes with different environments and crowd densities (~20 times higher than the testing data). Finally, BNSP-SFM can provide predictions with confidence to better explain potential causes of behaviors. The code will be released upon acceptance.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the challenge of achieving both high prediction accuracy and strong interpretability in human trajectory forecasting. Existing methods fall into two categories: model-free and model-based. Model-free methods, usually deep neural networks, predict accurately but cannot explain their predictions. Model-based methods offer a degree of interpretability but predict less accurately. The paper therefore proposes a new Bayesian Neural Stochastic Differential Equation model (BNSP-SFM) that aims to combine the advantages of both: higher prediction accuracy together with better interpretability.

### Main contributions

1. **New Bayesian Neural Stochastic Differential Equation model**: By combining stochastic differential equations (SDEs) with Bayesian neural networks (BNNs), the model improves prediction accuracy and also quantifies the uncertainty in behavior and observation, which makes it more interpretable.
2. **Fine-grained uncertainty modeling**: Unlike previous methods, BNSP-SFM subdivides uncertainty into interpretable uncertainty (aleatoric uncertainty) caused by behavioral randomness and uninterpretable uncertainty (epistemic uncertainty) caused by unobserved factors. This fine-grained modeling helps to interpret prediction results.
3. **Effectiveness on multiple tasks and datasets**: The paper shows that BNSP-SFM outperforms existing methods on standard trajectory prediction tasks and generalizes better to different scenarios.
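To make the SDE component named in the first contribution more concrete, the following is a minimal sketch of how a second-order SDE for a single pedestrian could be simulated with the Euler-Maruyama scheme. It is not the paper's implementation: the function name `simulate_pedestrian`, the goal-relaxation drift (written in the style of classic social-force models), the constant noise scale `sigma`, and all default parameter values are assumptions chosen for illustration. In BNSP-SFM the drift and diffusion terms are parameterized by neural networks.

```python
import math
import random


def simulate_pedestrian(p0, v0, goal, steps=50, dt=0.1, tau=0.5,
                        desired_speed=1.3, sigma=0.2, seed=0):
    """Euler-Maruyama integration of a 2D second-order SDE:

        dp = v dt
        dv = f(p, v) dt + sigma dW

    The drift f is a hypothetical relaxation toward a desired velocity
    pointing at `goal`; it stands in for a learned drift function.
    """
    rng = random.Random(seed)
    p = list(p0)
    v = list(v0)
    path = [tuple(p)]
    for _ in range(steps):
        # Unit direction from the current position to the goal.
        dx, dy = goal[0] - p[0], goal[1] - p[1]
        dist = math.hypot(dx, dy) or 1.0  # avoid division by zero at the goal
        # Assumed drift: relax the velocity toward the desired velocity.
        drift = ((desired_speed * dx / dist - v[0]) / tau,
                 (desired_speed * dy / dist - v[1]) / tau)
        for i in range(2):
            # Wiener increment: Gaussian with variance dt.
            dW = rng.gauss(0.0, math.sqrt(dt))
            p[i] += v[i] * dt                   # dp = v dt (velocity before update)
            v[i] += drift[i] * dt + sigma * dW  # dv = f dt + sigma dW
        path.append(tuple(p))
    return path
```

Calling `simulate_pedestrian((0.0, 0.0), (0.0, 0.0), (10.0, 0.0))` with several different `seed` values gives a set of sampled trajectories. Their spread reflects the randomness injected by the diffusion term, and setting `sigma=0.0` recovers the deterministic path.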
### Method overview

- **Bayesian Neural Stochastic Differential Equation**: Bayesian neural networks approximate the solution of the stochastic differential equation while jointly estimating its parameters, giving a probabilistic model that captures the dynamics of the system and quantifies the uncertainty of the model parameters.
- **Stochastic social physics model**: Pedestrian movement is modeled as a second-order stochastic differential equation with random forces that come from social interactions and environmental influences.
- **Conditional variational auto-encoder (CVAE)**: Used to model and predict observation noise, capturing the residual uncertainty not explained by the SDE model.

### Formula analysis

- **Stochastic differential equation (SDE)**: \[ dX_t=\mu(t, X_t)\,dt+\sigma(t, X_t)\,dW_t \] where \(X_t\) is the state process of the system, \(\mu\) and \(\sigma\) are two real-valued functions, and \(W_t\) is a Wiener process.
- **Pedestrian movement model**: \[ \begin{cases} dp(t)=\dot{p}(t)\,dt+\epsilon(t, p_{t:t-M})\\ d\dot{p}(t)=f_{\eta,\phi}(t, p(t), \dot{p}(t), \Omega(t), p_T, E)\,dt+\sigma_{\eta,\phi}(t, p(t), \dot{p}(t), \Omega(t), p_T, E)\,dW(t) \end{cases} \] where \(p(t)\) is the position, \(\dot{p}(t)\) is the velocity, \(\epsilon(t, p_{t:t-M})\) is time-dependent observation noise, \(f_{\eta,\phi}\) and \(\sigma_{\eta,\phi}\) are functions parameterized by neural networks, \(\Omega(t)\) is the neighborhood set, and \(E\) is an environmental factor.
- **Target attraction**: \[ F^t_{\text{goal}}=\left(\frac{p_T - p_