Multi-step backtrack Q-learning based dynamic optimal algorithm for auto generation control order dispatch

Liu Qian-jin
2011-01-01
Abstract: This paper applies a multi-step backtrack Q(λ) learning methodology to the dynamic dispatch of control performance standard (CPS) orders. The proposed Q(λ) learning effectively addresses the long time-delay problem that one-step Q-learning faces when assessing action strategies in thermal-dominated power systems. For the multi-objective dynamic optimal dispatch of CPS orders, the automatic generation control (AGC) production cost is formulated as the reward function of a Markov decision process (MDP) by means of a linear weighted aggregation approach. Simulation on the Institute of Electrical and Electronics Engineers (IEEE) two-area load frequency control (LFC) model shows that the convergence time of the Q(λ) algorithm is more than 50% shorter than that of Q-learning. Statistical experiments with Q(λ) on the China Southern Power Grid show that the proposed method effectively enhances the robustness and dynamic performance of AGC systems under CPS assessment and reduces AGC production cost by more than 5% while CPS compliance is ensured.
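To make the two ideas in the abstract concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes Watkins's variant of Q(λ) with accumulating eligibility traces, which may differ from the paper's exact backtrack rule, and it assumes the reward is the negative of a weighted sum of cost terms. The function names, the toy two-state problem, and the values of `alpha`, `gamma`, and `lam` are all hypothetical.

```python
import numpy as np


def weighted_reward(objectives, weights):
    """Linear weighted aggregation (assumed form): negative weighted sum of costs."""
    objectives = np.asarray(objectives, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return -float(np.dot(weights, objectives))


def q_lambda_step(Q, E, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9, lam=0.8):
    """One Watkins-style Q(lambda) update, applied in place to Q and E.

    Q and E are (n_states, n_actions) arrays; E holds the eligibility traces
    that carry the current TD error back to earlier state-action pairs.
    """
    best_next = Q[s_next].max()
    # Decide whether the next action is greedy before Q is modified.
    next_is_greedy = Q[s_next, a_next] == best_next
    delta = r + gamma * best_next - Q[s, a]  # one-step TD error
    E[s, a] += 1.0                           # accumulating trace
    Q += alpha * delta * E                   # multi-step backtrack update
    if next_is_greedy:
        E *= gamma * lam                     # decay traces along the greedy path
    else:
        E[:] = 0.0                           # cut traces after an exploratory action
    return delta


# Toy run: two states, two actions, two cost terms per step (all hypothetical).
Q = np.zeros((2, 2))
E = np.zeros((2, 2))
r = weighted_reward([1.0, 2.0], [0.5, 0.25])
q_lambda_step(Q, E, s=0, a=1, r=r, s_next=1, a_next=0)
q_lambda_step(Q, E, s=1, a=0, r=r, s_next=0, a_next=1)
```

In the toy run, the second step updates `Q[0, 1]` again through its decayed trace even though that pair was visited one step earlier. One-step Q-learning would not propagate the delayed cost back in this way, which is the behaviour the abstract credits for faster convergence.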