Epersist: A Self Balancing Robot Using PID Controller And Deep Reinforcement Learning

Ghanta Sai Krishna, Dyavat Sumith, Garika Akshay
DOI: https://doi.org/10.48550/arXiv.2207.11431
2022-07-23
Abstract: A two-wheeled self-balancing robot is an example of an inverted pendulum and is an inherently non-linear, unstable system. The fundamental concept of the proposed framework "Epersist" is to overcome the challenge of counterbalancing an initially unstable system by delivering robust control mechanisms: Proportional Integral Derivative (PID) control and Reinforcement Learning (RL). Moreover, the NodeMCU ESP32 micro-controller and inertial sensor in Epersist employ fewer computational procedures to give accurate instructions regarding the spin of the wheels to the motor driver, which helps control the wheels and balance the robot. The framework also comprises the mathematical model of the PID controller and a novel self-trained advantage actor-critic algorithm as the RL agent. After several experiments, control variable calibrations are made as the benchmark values to attain the angle of static equilibrium. The "Epersist" framework proposes PID- and RL-assisted functional prototypes and simulations for better utility.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the design of a Two-Wheeled Self-Balancing Robot (TWSBR) and the control of its inherent nonlinearity and instability. Specifically, it proposes a framework named "Epersist," which aims to control the robot effectively by combining a Proportional-Integral-Derivative (PID) controller with Deep Reinforcement Learning (DRL).

### Main Issues:

1. **Nonlinearity and Instability**: The two-wheeled self-balancing robot is a high-order, multivariable, nonlinear, tightly coupled, and inherently unstable system. Traditional control methods such as PID, fuzzy control, and sliding mode control can achieve basic stability but struggle to reach optimal control.
2. **Optimal Control**: Existing control methods can stabilize the robot but often fall short of optimal performance in practical applications, especially when the controller must be adjusted to the robot's physical characteristics.
3. **Cost-Effectiveness**: Existing solutions typically use comparatively expensive controller boards (such as Raspberry Pi or Arduino Uno), leading to high overall costs.

### Solutions:

1. **PID Controller**: Optimize the parameters of the PID controller through theoretical analysis and experimental calibration to improve system stability and response speed.
2. **Deep Reinforcement Learning**: Introduce a deep reinforcement learning agent based on the Advantage Actor-Critic (A2C) algorithm that learns and optimizes its control strategy through interaction with the environment, achieving smoother and more efficient balance control.
3. **Low-Cost Hardware**: Use the NodeMCU ESP32 as the microcontroller to reduce the overall system cost, and improve the robot's practicality and operability through a Bluetooth-connected mobile application.

### Experimental Results:

- **PID vs. DRL**: Experimental results show that the DRL control mechanism outperforms the PID control mechanism in terms of average stability time and distance covered per unit time.
- **Hardware Prototype**: The theoretical model was validated on a hardware prototype, demonstrating the actual performance of the PID and DRL control mechanisms.

### Conclusion:

The "Epersist" framework not only outperforms traditional methods in control effectiveness but also offers a significant cost advantage, providing new ideas and technical support for the research and application of two-wheeled self-balancing robots.
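To make the PID part of the summary concrete, the following is a minimal sketch of a discrete PID loop balancing a toy inverted pendulum. It is not the paper's implementation: the plant model (a point-mass pendulum whose angular acceleration is `(g / length) * sin(theta) + u`), the pendulum length, the gains, the time step, and the names `PID`, `simulate`, and `FALL_ANGLE` are all assumptions chosen for illustration, and the code targets desktop Python rather than ESP32 firmware.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

G = 9.81           # gravitational acceleration, m/s^2
LENGTH = 0.5       # assumed pendulum length in metres (hypothetical value)
FALL_ANGLE = 0.8   # assumed tilt (rad) beyond which the robot counts as fallen


@dataclass
class PID:
    """Textbook discrete PID controller; the gains are illustrative only."""
    kp: float
    ki: float
    kd: float
    setpoint: float = 0.0            # target tilt angle (upright)
    integral: float = 0.0            # running integral of the error
    prev_error: Optional[float] = None

    def update(self, measurement: float, dt: float) -> float:
        error = self.setpoint - measurement
        self.integral += error * dt
        # Skip the derivative on the first sample to avoid a startup kick.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, theta0: float, steps: int, dt: float) -> List[float]:
    """Integrate the toy pendulum with semi-implicit Euler; return the tilt history."""
    theta, omega = theta0, 0.0
    history = [theta]
    for _ in range(steps):
        u = pid.update(theta, dt)                     # control as angular acceleration
        alpha = (G / LENGTH) * math.sin(theta) + u    # gravity tips it over, u corrects
        omega += alpha * dt
        theta += omega * dt
        history.append(theta)
        if abs(theta) > FALL_ANGLE:                   # stop once the robot has fallen
            break
    return history


if __name__ == "__main__":
    controlled = simulate(PID(kp=60.0, ki=1.0, kd=8.0), theta0=0.1, steps=1000, dt=0.005)
    uncontrolled = simulate(PID(kp=0.0, ki=0.0, kd=0.0), theta0=0.1, steps=1000, dt=0.005)
    print(f"final tilt with PID: {controlled[-1]:.5f} rad")
    print(f"final tilt without control: {uncontrolled[-1]:.5f} rad")
```

In this toy model the proportional gain has to exceed `G / LENGTH` for the upright position to be stabilizable at all, which is one way to see why the summary stresses calibrating the gains against the robot's physical characteristics. On real hardware the measurement would come from the inertial sensor and the output would drive the motor driver, with gains found by experiment.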