Abstract: Sampling-based Model Predictive Control (MPC) has been a practical and effective approach in many domains, notably model-based reinforcement learning, thanks to its flexibility and parallelizability. Despite its appealing empirical performance, the theoretical understanding, particularly in terms of convergence analysis and hyperparameter tuning, remains absent. In this paper, we characterize the convergence property of a widely used sampling-based MPC method, Model Predictive Path Integral Control (MPPI). We show that MPPI enjoys at least linear convergence rates when the optimization is quadratic, which covers time-varying LQR systems. We then extend to more general nonlinear systems. Our theoretical analysis directly leads to a novel sampling-based MPC algorithm, CoVariance-Optimal MPC (CoVO-MPC), that optimally schedules the sampling covariance to optimize the convergence rate. Empirically, CoVO-MPC significantly outperforms standard MPPI by 43-54% in both simulations and real-world quadrotor agile control tasks. Videos and Appendices are available at \url{
What problem does this paper attempt to address?
The paper mainly aims to address the following issues:
### Research Background and Objectives
- **Research Background**: Sampling-based Model Predictive Control (MPC) is a practical and effective method that has been widely used in many fields, especially in model-based reinforcement learning. Despite its excellent experimental performance, there is still a lack of theoretical understanding, particularly in terms of convergence and hyperparameter tuning.
- **Specific Objectives**:
  - Conduct a convergence analysis of the widely used Model Predictive Path Integral Control (MPPI) method.
  - Based on the above theoretical analysis, design a novel sampling-based model predictive control algorithm, CoVariance-Optimal MPC (CoVO-MPC), which can optimize the sampling covariance matrix to accelerate the convergence process.
  - Validate the effectiveness of the proposed CoVO-MPC algorithm on different robotic systems.
### Issues Addressed
1. **Theoretical Analysis**: For the first time, a convergence analysis of MPPI is conducted. It shows that MPPI converges at an at least linear rate when the optimization is quadratic, which covers time-varying LQR systems, and the analysis is then extended to more general nonlinear systems.
2. **Novel Algorithm**: A new sampling-based MPC algorithm, CoVO-MPC, is proposed. This algorithm optimizes the sampling covariance matrix Σ by utilizing information from the system dynamics and cost function, thereby improving the convergence speed.
3. **Empirical Validation**: Extensive experimental validation is conducted on different robotic systems (including quadrotors in both simulated and real-world environments), showing that CoVO-MPC significantly outperforms the standard MPPI algorithm, with a performance improvement of 43%-54%.
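
To make the role of the sampling covariance concrete, here is a minimal sketch of an MPPI-style update on a toy quadratic cost. It is an illustration under stated assumptions, not the paper's implementation: the function name `mppi_step`, the cost matrix `D`, the minimizer `u_star`, the temperature, the sample count, and the fixed isotropic covariance are all hypothetical choices made for this example, and the covariance schedule that CoVO-MPC actually uses is not reproduced here.

```python
import numpy as np


def mppi_step(mean, cov, cost_fn, num_samples, temperature, rng):
    """One MPPI-style update: sample, weight by exp(-cost / temperature), average."""
    # Draw candidate control vectors around the current mean.
    samples = rng.multivariate_normal(mean, cov, size=num_samples)
    costs = np.array([cost_fn(u) for u in samples])
    # Subtract the minimum cost before exponentiating for numerical stability.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # The new mean is the softmax-weighted average of the samples.
    return weights @ samples


# Hypothetical quadratic cost J(u) = (u - u_star)^T D (u - u_star).
D = np.diag([1.0, 10.0])
u_star = np.array([1.0, -2.0])


def quadratic_cost(u):
    diff = u - u_star
    return diff @ D @ diff


rng = np.random.default_rng(0)
mean = np.zeros(2)
cov = 0.5 * np.eye(2)  # assumed isotropic sampling covariance (a tunable hyperparameter)

errors = [np.linalg.norm(mean - u_star)]
for _ in range(10):
    mean = mppi_step(mean, cov, quadratic_cost, 4096, 1.0, rng)
    errors.append(np.linalg.norm(mean - u_star))

print(errors)
```

For this toy setup, in the infinite-sample limit each update multiplies the error `mean - u_star` by `(I + 2 * cov @ D / temperature)^{-1}`, so the distance to the minimizer shrinks geometrically and the shrink factor depends directly on the chosen covariance. That dependence is the kind of lever the summary describes CoVO-MPC as optimizing.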
In summary, this paper aims to fill the gap in the theoretical foundation of sampling-based MPC algorithms, while proposing a practical and efficient new algorithm, CoVO-MPC, and validating its effectiveness through experiments.