Abstract: Deep reinforcement learning (DRL) is an area of machine learning that combines deep learning with reinforcement learning (RL). However, few studies appear to have analyzed the latest DRL algorithms on real-world powertrain control problems. Meanwhile, boost control of a diesel engine equipped with a variable geometry turbocharger (VGT) is difficult, mainly because of its strong coupling with the exhaust gas recirculation (EGR) system and its large lag, which results from time delay and hysteresis between the input and output dynamics of the engine’s gas exchange system. In this context, one of the latest model-free DRL algorithms, the deep deterministic policy gradient (DDPG) algorithm, is used in this paper to develop a strategy that tracks the target boost pressure under transient driving cycles. Using a fine-tuned proportional-integral-derivative (PID) controller as a benchmark, the results show that the proposed DDPG-based strategy can achieve good transient control performance from scratch by autonomously learning from its interaction with the environment, without relying on model supervision or complete environment models. In addition, the proposed strategy is able to adapt to a changing environment and to hardware aging by tuning itself on-line in a self-learning manner, which makes it attractive for real plant control problems in which system consistency may not be strictly guaranteed and the environment may change over time.
What problem does this paper attempt to address?
This paper addresses how to use deep reinforcement learning (DRL) to optimize the transient-response control strategy of diesel engines equipped with a variable geometry turbocharger (VGT), in particular how to track the target boost pressure over transient driving cycles. Specifically, the paper focuses on the following points:
1. **VGT control challenges**: The interaction between the VGT and the exhaust gas recirculation (EGR) system increases the complexity of control. In addition, the time delay and hysteresis between the input and output dynamics of the diesel engine's gas exchange system make it difficult to control the VGT precisely.
2. **Limitations of existing control methods**: The traditional fixed-parameter proportional-integral-derivative (PID) controller is widely used in industry, but tuning its parameters is a complex process, and it is difficult to obtain satisfactory results once the state of the control loop changes. PID variants such as expert PID control, fuzzy PID control, and neural-network-based PID control perform better when properly tuned, but they require, respectively, acquiring expert knowledge, constructing fuzzy control decision tables, and adjusting complex neural network parameters, which limits their wide application in VGT boost control.
3. **Control problems of high-order, large-lag, strong-coupling, nonlinear, and time-varying parameter systems**: For complex industrial systems with the above characteristics (such as VGT control systems), the traditional control theory relying on mathematical models is still immature, and some methods are too complex to be directly applied in industrial practice. In addition, it may be infeasible or unrealistic to develop a first-principles model of complex industrial processes.
4. **Application prospects of model-free intelligent algorithms**: In view of the limitations of traditional control methods, the paper proposes to apply "model-free" intelligent algorithms (i.e., algorithms that can achieve end-to-end learning and intelligent control without a high-fidelity model) to meet the industry's requirements for simplicity and robustness and provide an attractive alternative.
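To make the fixed-parameter PID baseline and the "large lag" issue concrete, the following is a minimal sketch of a discrete fixed-gain PID loop driving a toy plant with a first-order lag and dead time. Everything in it is an illustrative assumption rather than something taken from the paper: the class `PID`, the function `simulate`, the gains, the plant parameters (`tau`, `gain`, `delay_steps`), and the use of a normalized actuator command in [0, 1]. It is not an engine model.

```python
from collections import deque


class PID:
    """Fixed-gain discrete PID controller (illustrative gains, no anti-windup)."""

    def __init__(self, kp, ki, kd, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, target, measured):
        error = target - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Clamp the command to the (assumed) normalized actuator range.
        return min(self.u_max, max(self.u_min, u))


def simulate(pid, target=1.5, steps=400, tau=0.2, gain=2.0, delay_steps=10):
    """Toy plant (hypothetical): first-order lag plus dead time.

    y stands in for boost pressure and starts at an ambient value of 1.0.
    The command reaches the plant only after `delay_steps` samples.
    """
    y = 1.0
    pending = deque([0.0] * delay_steps)  # commands still "in transit"
    history = []
    for _ in range(steps):
        u = pid.step(target, y)
        pending.append(u)
        u_delayed = pending.popleft()
        # Forward-Euler step of dy/dt = (-(y - 1.0) + gain * u_delayed) / tau
        y += pid.dt * (-(y - 1.0) + gain * u_delayed) / tau
        history.append(y)
    return history


if __name__ == "__main__":
    trace = simulate(PID(kp=0.5, ki=2.0, kd=0.0, dt=0.01), steps=1000)
    print(f"output after dead time: {trace[10]:.4f}, final: {trace[-1]:.4f}")
```

In this sketch the gains are fixed once and never adapt. If `tau`, `gain`, or `delay_steps` drift, as they might with hardware aging, the same gains can give a slower or more oscillatory response, which is the limitation item 2 describes.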
In summary, the main objective of this paper is to use the deep deterministic policy gradient (DDPG) algorithm to establish a control strategy that learns autonomously from its interaction with the environment, without model supervision or a complete environment model, and thereby tracks the target boost pressure of a VGT-equipped diesel engine under transient driving conditions. Benchmarked against a fine-tuned PID controller, the proposed DDPG strategy is shown to achieve good transient control performance.
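For readers unfamiliar with DDPG, the sketch below shows two building blocks of the standard algorithm, the bootstrapped critic target and the soft (Polyak) update of the target networks, together with one possible tracking reward. The reward shape (negative absolute boost-pressure error) and the function names `tracking_reward`, `td_target`, and `soft_update` are assumptions made for illustration; the summary above does not state the paper's actual reward or hyperparameters.

```python
import numpy as np


def tracking_reward(target_boost, actual_boost):
    """Assumed reward: penalize the absolute boost-pressure tracking error."""
    return -abs(target_boost - actual_boost)


def td_target(reward, done, next_q, gamma=0.99):
    """Critic regression target: y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    `next_q` is the target critic's value of the target actor's action
    in the next state; `done` is 1.0 for terminal transitions, else 0.0.
    """
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: theta_target <- (1 - tau) * theta_target + tau * theta."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    rewards = np.array([tracking_reward(1.5, 1.2), tracking_reward(1.5, 1.5)])
    dones = np.array([0.0, 1.0])
    next_q = np.array([-2.0, -2.0])
    print(td_target(rewards, dones, next_q, gamma=0.9))
```

A full DDPG agent would add actor and critic networks, a replay buffer, and exploration noise on the actor's output. The pieces above only show how the learning signal is formed from logged interaction, without any model of the engine.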