Deep reinforcement learning-based energy management of hybrid battery systems in electric vehicles

Weihan Li, Han Cui, Thomas Nemeth, Jonathan Jansen, Cem Ünlübayir, Zhongbao Wei, Lei Zhang, Zhenpo Wang, Jiageng Ruan, Haifeng Dai, Xuezhe Wei, Dirk Uwe Sauer
DOI: https://doi.org/10.1016/j.est.2021.102355
IF: 9.4
2021-04-01
Journal of Energy Storage
Abstract: In this paper, we propose an energy management strategy based on deep reinforcement learning for a hybrid battery system in electric vehicles consisting of a high-energy and a high-power battery pack. The energy management strategy of the hybrid battery system was developed based on the electrical and thermal characterization of the battery cells, aiming at minimizing the energy loss and increasing both the electrical and thermal safety level of the whole system. Primarily, we designed a novel reward term to explore the optimal operating range of the high-power pack without imposing a rigid constraint of state of charge. Furthermore, various load profiles were randomly combined to train the deep Q-learning model, which avoided the overfitting problem. The training and validation results showed both the effectiveness and reliability of the proposed strategy in loss reduction and safety enhancement. The proposed energy management strategy has demonstrated its superiority over the reinforcement learning-based methods in both computation time and energy loss reduction of the hybrid battery system, highlighting the use of such an approach in future energy management systems.
energy & fuels
What problem does this paper attempt to address?
This paper addresses how to optimize the energy management strategy of a hybrid battery system in electric vehicles using deep reinforcement learning (DRL). Specifically, it designs a strategy based on deep Q-learning (DQL) that reduces energy loss and improves the electrical and thermal safety of the whole system. The strategy governs the power distribution between the high-energy (HE) and high-power (HP) battery packs. A new reward term lets the agent explore the optimal operating range of the HP pack without a rigid constraint on its state of charge (SoC). To avoid overfitting, the DQL model is trained on random combinations of different load profiles.
The main contributions of the paper are as follows:
1. Developed a DQL-based energy management strategy that aims to reduce energy loss and enhance electrical and thermal safety.
2. Designed a novel reward term that automatically determines the optimal operating range of the HP battery pack.
3. Established and parameterized a coupled electro-thermal battery model that simulates the electrical and thermal dynamics of the HE and HP battery cells.
4. Designed a new training scenario that limits overfitting by randomly combining different driving profiles within each training cycle.
5. Validated the proposed DQL-based strategy on new driving profiles and compared it with a Q-learning (QL)-based strategy, showing the advantage of the proposed method in both computation time and energy loss reduction.
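The training setup described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The profile names and power values, the discrete action set `HP_SHARES`, and the helper names `build_episode`, `split_power`, and `td_target` are all assumptions made for illustration. The sketch shows three pieces: randomly combining load profiles into one training episode, splitting the demanded power between the HE and HP packs according to a chosen action, and computing the standard Q-learning target that a DQL network is trained toward. The paper's reward term and electro-thermal model are not reproduced here.

```python
import random

# Hypothetical load profiles: power demand in kW per time step.
# The names and values are illustrative only, not data from the paper.
PROFILES = {
    "urban": [5.0, 12.0, -4.0, 8.0],
    "highway": [30.0, 35.0, 28.0],
    "rural": [15.0, -6.0, 20.0],
}

# Assumed discrete action set: fraction of the demand routed to the HP pack.
HP_SHARES = [0.0, 0.25, 0.5, 0.75, 1.0]


def build_episode(profiles, n_segments, rng):
    """Concatenate randomly chosen profiles into one training load sequence."""
    episode = []
    for _ in range(n_segments):
        name = rng.choice(sorted(profiles))  # sorted for reproducible order
        episode.extend(profiles[name])
    return episode


def split_power(p_demand, hp_share):
    """Split the demanded power between the HE and HP packs."""
    p_hp = hp_share * p_demand
    p_he = p_demand - p_hp
    return p_he, p_hp


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Standard Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
episode = build_episode(PROFILES, n_segments=3, rng=rng)
p_he, p_hp = split_power(episode[0], HP_SHARES[2])
```

Because each episode is assembled from a fresh random draw of profiles, the agent rarely sees the same load sequence twice, which is the idea behind the overfitting countermeasure in contribution 4.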