Abstract: In this study, we propose a predictive model composed of a recurrent neural network including parametric bias and stochastic elements, and an environmentally adaptive robot control method including variance minimization using the model. Robots which have flexible bodies or whose states can only be partially observed are difficult to model, and their predictive models often have stochastic behaviors. In addition, the physical state of the robot and the surrounding environment change sequentially, and so the predictive model can change online. Therefore, in this study, we construct a learning-based stochastic predictive model implemented in a neural network embedded with such information from the experience of the robot, and develop a control method for the robot to avoid unstable motion with large variance while adapting to the current environment. This method is verified on a mobile robot in simulation and on the actual robot Fetch.
What problem does this paper attempt to address?
This paper attempts to solve the following problems:
1. **Difficulties in Modeling**: Robots with flexible bodies or only partially observable states are difficult to model. Their predictive models usually behave stochastically, which traditional control methods handle poorly.
2. **Models Vary with the Environment**: The physical state of the robot and its surrounding environment are constantly changing, so the prediction model needs to be updated online to adapt to the current environment.
3. **Randomness of the Model**: Stochastic behavior in the predictive model can translate into unstable motion. For example, differences in the wheel angles and floor materials of the mobile robot Fetch introduce randomness into its movement, which degrades its stability and control accuracy.
To solve these problems, the authors propose a predictive model that combines parametric bias with stochastic elements, and develop an environment-adaptive robot control method that includes variance minimization. Specifically:
- **Prediction Model**: A recurrent neural network (RNN) with parametric bias and stochastic elements outputs both the mean and the variance of the predicted state, so the stochastic behavior is represented in the model itself.
- **Environment Adaptation**: The parametric bias implicitly embeds environmental information extracted from motion data collected in different environments, which lets the model adapt when the environment changes.
- **Variance Minimization**: A variance-minimization term is added to the control objective so that the robot avoids unstable, high-variance motion while performing its task.
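The environment-adaptation idea can be illustrated with a deliberately small sketch. It assumes that only the parametric bias is updated online while the network weights stay frozen, which is how parametric bias is commonly used; the summary above does not spell out the update rule. The one-dimensional `predict` function, the learning rate, and the squared-error objective (in place of the likelihood loss) are all hypothetical simplifications.

```python
def predict(s, u, p):
    """Toy mean prediction: the command u is scaled by an environment gain p."""
    return s + p * u


def adapt_parametric_bias(p, data, lr=0.5, steps=100):
    """Gradient descent on the mean squared prediction error with respect to p only."""
    for _ in range(steps):
        grad = 0.0
        for s, u, s_next in data:
            # d/dp of (s_next - predict(s, u, p))^2
            grad += -2.0 * u * (s_next - predict(s, u, p))
        p -= lr * grad / len(data)
    return p


# Hypothetical data from an environment whose true gain is 0.7
# (for instance, a floor on which the wheels slip).
true_gain = 0.7
commands = [0.5, -0.3, 0.8, 0.2, -0.6]
data = []
s = 0.0
for u in commands:
    s_next = s + true_gain * u
    data.append((s, u, s_next))
    s = s_next

p_est = adapt_parametric_bias(p=0.0, data=data)
print(p_est)
```

Because only a low-dimensional `p` is adjusted, adaptation needs few samples and leaves the shared dynamics untouched, which is the appeal of this kind of scheme.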
### Formula Summary
1. **SPNPB Model Expression**:
\[
(s_{t + 1}, v_{t + 1})=h(s_t, u_t, p)
\]
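where \( s_t \) is the robot state at time \( t \), \( u_t \) is the control command, \( p \) is the parametric bias, \( s_{t+1} \) and \( v_{t+1} \) are the predicted mean and variance of the next state, and \( h \) is the SPNPB (Stochastic Predictive Network with Parametric Bias) model.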
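As a rough illustration of this interface, the sketch below implements a stand-in for \( h \) that maps a state, a command, and a parametric bias to a predicted mean and variance. It is not the authors' network: the layer sizes, random weights, and helper names are hypothetical, and the recurrent hidden state is omitted for brevity.

```python
import math
import random


def make_toy_model(n_s, n_u, n_p, n_h=8, seed=0):
    """Random weights for a toy one-hidden-layer stand-in for h (hypothetical sizes)."""
    rng = random.Random(seed)
    n_in = n_s + n_u + n_p

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W1": mat(n_h, n_in), "W_mean": mat(n_s, n_h), "W_logvar": mat(n_s, n_h)}


def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def h(model, s_t, u_t, p):
    """Return (s_next, v_next): predicted mean and variance of the next state."""
    x = list(s_t) + list(u_t) + list(p)  # p enters as an extra network input
    hidden = [math.tanh(a) for a in matvec(model["W1"], x)]
    s_next = matvec(model["W_mean"], hidden)
    # Predicting the log-variance and exponentiating keeps every variance positive.
    v_next = [math.exp(a) for a in matvec(model["W_logvar"], hidden)]
    return s_next, v_next


model = make_toy_model(n_s=2, n_u=2, n_p=2)
s_next, v_next = h(model, s_t=[0.0, 0.0], u_t=[0.3, 0.1], p=[0.0, 0.0])
print(s_next, v_next)
```

Feeding `p` in as an ordinary input is one simple way to let a single set of weights represent several environments; changing `p` alone changes the predicted dynamics.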
2. **Loss Function**:
\[
P(s_{i,t}^k | D_k^{1:t - 1}, W, p_k)=\frac{1}{\sqrt{2\pi\hat{v}_{i,t}^k}}\exp\left(-\frac{(s_{i,t}^k-\hat{s}_{i,t}^k)^2}{2\hat{v}_{i,t}^k}\right)
\]
\[
L_{\text{likelihood}}(W, p_{1:K} | D_{\text{train}})=\prod_{k = 1}^K\prod_{t = 1}^{T_k}\prod_{i = 1}^{N_s}P(s_{i,t}^k | D_k^{1:t - 1}, W, p_k)
\]
\[
L_{\text{train}}=-\log(L_{\text{likelihood}})
\]
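Taking the negative logarithm of the product of Gaussians turns the training loss into a sum of per-element terms, \( \tfrac{1}{2}\log(2\pi\hat{v}) + (s-\hat{s})^2/(2\hat{v}) \). The sketch below evaluates that sum for one trajectory; the function names and the list-based data layout are assumptions made for illustration.

```python
import math


def gaussian_nll(s_true, s_pred, v_pred):
    """Sum of -log N(s_true | s_pred, v_pred) over the state dimensions."""
    total = 0.0
    for s, m, v in zip(s_true, s_pred, v_pred):
        total += 0.5 * math.log(2.0 * math.pi * v) + (s - m) ** 2 / (2.0 * v)
    return total


def trajectory_nll(targets, means, variances):
    """Sum the per-step loss over all time steps of one trajectory."""
    return sum(gaussian_nll(s, m, v) for s, m, v in zip(targets, means, variances))


# Two time steps of a two-dimensional state (made-up numbers).
targets = [[0.1, 0.0], [0.2, 0.1]]
means = [[0.0, 0.0], [0.2, 0.0]]
variances = [[0.5, 0.5], [0.5, 0.5]]
loss = trajectory_nll(targets, means, variances)
print(loss)
```

The log-variance term penalizes a model that hedges with large variances everywhere, while the squared-error term penalizes overconfident predictions that miss the data.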
3. **Control Objective Function**:
\[
L_{\text{control}}=\|s_{\text{ref}}^{\text{seq}}-\hat{s}^{\text{seq}}\|^2 + C_{\text{variance}}\|\hat{v}^{\text{seq}}\|^2
\]
\[
u_{\text{opt}}^{\text{seq}}\leftarrow u_{\text{opt}}^{\text{seq}}-\gamma\frac{\partial L_{\text{control}}}{\partial u_{\text{opt}}^{\text{seq}}}
\]
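The update rule above is plain gradient descent on the command sequence. The following sketch applies it to a hypothetical one-dimensional model in which the variance grows with the command magnitude. The paper's method would differentiate through the learned network; here a central finite difference is swapped in to keep the example dependency-free, and `toy_model`, the step size, and the iteration count are assumptions.

```python
def toy_model(s, u):
    """Hypothetical dynamics: the state integrates u, and variance grows with |u|."""
    return s + u, 0.1 * u * u


def control_loss(u_seq, s0, s_ref_seq, c_variance):
    """Squared tracking error plus C_variance times squared predicted variance."""
    s, loss = s0, 0.0
    for u, s_ref in zip(u_seq, s_ref_seq):
        s, v = toy_model(s, u)
        loss += (s_ref - s) ** 2 + c_variance * v ** 2
    return loss


def numerical_gradient(u_seq, s0, s_ref_seq, c_variance, eps=1e-5):
    """Central finite-difference gradient of the loss with respect to each command."""
    grad = []
    for i in range(len(u_seq)):
        up, down = list(u_seq), list(u_seq)
        up[i] += eps
        down[i] -= eps
        diff = control_loss(up, s0, s_ref_seq, c_variance) - control_loss(
            down, s0, s_ref_seq, c_variance
        )
        grad.append(diff / (2.0 * eps))
    return grad


def optimize_commands(u_seq, s0, s_ref_seq, c_variance, gamma=0.02, iterations=200):
    """Repeat u <- u - gamma * dL/du for a fixed number of iterations."""
    u_seq = list(u_seq)
    for _ in range(iterations):
        grad = numerical_gradient(u_seq, s0, s_ref_seq, c_variance)
        u_seq = [u - gamma * g for u, g in zip(u_seq, grad)]
    return u_seq


s_ref_seq = [1.0] * 5  # reach state 1.0 and hold it
u_init = [0.0] * 5
initial_loss = control_loss(u_init, 0.0, s_ref_seq, c_variance=1.0)
u_opt = optimize_commands(u_init, 0.0, s_ref_seq, c_variance=1.0)
final_loss = control_loss(u_opt, 0.0, s_ref_seq, c_variance=1.0)
print(initial_loss, final_loss, u_opt)
```

Raising `c_variance` shifts the trade-off toward commands with low predicted variance at the cost of slower tracking, which is the role \( C_{\text{variance}} \) plays in the objective.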
Through these methods, the authors aim to improve the stability and adaptability of robots in complex and dynamic environments.