A Data-Driven Reinforcement Learning Based Energy Management Strategy via Bridging Offline Initialization and Online Fine-Tuning for a Hybrid Electric Vehicle

Bo Hu, Bocheng Liu, Sunan Zhang
DOI: https://doi.org/10.1109/tie.2024.3357870
IF: 7.7
2024-01-01
IEEE Transactions on Industrial Electronics
Abstract: Given the simulation-to-real gap and the fact that data-driven learning methods are often suboptimal, an effective offline-to-online learning paradigm that can further improve the initialized offline policy in a real environment is necessary. However, for most industrial applications, this two-stage learning paradigm is often challenging to implement, as the initialized policy frequently deviates from the optimal improvement during fine-tuning. To address this, an effective offline-to-online reinforcement learning (RL) based training framework that bridges offline initialization and online fine-tuning is proposed in this work. In contrast to most RL algorithms, which evaluate unseen actions from the latest policy and extract policies corresponding only to the maximum value function, the proposed method uses two implicit constraints, namely, quantile regression and advantage-weighted regression. To assess the efficacy of the proposed algorithm with real controllers, a hardware-in-the-loop test is conducted. The findings demonstrate that, by leveraging the generalization capability of quantile regression to estimate the value of the best available state-action pair and deriving the policy using the advantage-weighted form of behavioral cloning to favor actions that receive higher advantage, the proposed data-driven RL can not only learn effectively from a static offline dataset but also exhibit robust policy improvement during the subsequent fine-tuning process.
Automation & Control Systems; Engineering, Electrical & Electronic; Instruments & Instrumentation
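
The two implicit constraints named in the abstract, quantile regression for value estimation and advantage-weighted behavioral cloning for policy extraction, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the quantile level `tau`, the temperature `beta`, and the clipping bound `max_weight` are all assumptions chosen for illustration, and the abstract does not state the values or the exact loss forms used in the paper.

```python
import math


def pinball_loss(residuals, tau=0.9):
    """Mean quantile (pinball) loss over residuals u = target - prediction.

    With tau > 0.5, under-estimates (u > 0) are penalized more than
    over-estimates, so a value estimate trained on this loss moves toward
    an upper quantile of the targets present in the dataset.
    tau is an assumed hyperparameter, not a value from the paper.
    """
    total = 0.0
    for u in residuals:
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(residuals)


def awr_weights(advantages, beta=1.0, max_weight=100.0):
    """Advantage-weighted regression weights exp(A / beta), capped at max_weight.

    The exponent is clipped before calling exp to avoid overflow.
    beta and max_weight are assumed hyperparameters.
    """
    log_cap = math.log(max_weight)
    return [math.exp(min(a / beta, log_cap)) for a in advantages]


def weighted_bc_loss(log_probs, advantages, beta=1.0):
    """Advantage-weighted behavioral cloning loss.

    log_probs[i] is the policy's log-likelihood of the i-th dataset action;
    actions with higher advantage receive a larger weight.
    """
    weights = awr_weights(advantages, beta)
    weighted = sum(w * lp for w, lp in zip(weights, log_probs))
    return -weighted / len(log_probs)


if __name__ == "__main__":
    # Under-estimating by 1 costs more than over-estimating by 1 when tau = 0.9.
    print(pinball_loss([1.0], tau=0.9), pinball_loss([-1.0], tau=0.9))
    # The second action has the higher advantage, so it dominates the loss.
    print(weighted_bc_loss([-1.0, -2.0], [0.0, 1.0]))
```

Because both losses are computed only on state-action pairs that appear in the dataset, neither step needs to evaluate actions the policy has not seen, which is the contrast the abstract draws with methods that maximize over the value function directly.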