LwDeepStack: A lightweight DeepStack based on knowledge distillation

Jiajia Zhang, Zengyue Guo, Huale Li, Xuan Wang, Dongming Liu, Shuhan Qi
DOI: https://doi.org/10.1109/ICDIS55630.2022.00072
2022-01-01
Abstract: Counterfactual regret minimization (CFR) is a classical method for finding Nash equilibria in two-player games and has achieved great success in recent years. Methods that combine CFR with deep neural networks, such as DeepStack, show strong performance in solving two-player games. However, when such methods are deployed on small devices, their decision time and model complexity greatly limit their application in many scenarios. To address this problem, we propose an improved variant of DeepStack based on knowledge distillation, which we call lightweight DeepStack (LwDeepStack). In LwDeepStack, we design a specific objective function to distill knowledge from the counterfactual value networks of DeepStack. Experimental results show that the proposed method simplifies the model and reduces decision time effectively, while game performance does not decrease compared with the original DeepStack.
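The abstract does not state the form of the distillation objective. As a rough illustration of the general idea only, the following Python sketch combines two regression terms: one pulling a small student network's counterfactual value predictions toward the teacher network's outputs, and one pulling them toward solved target values. The Huber loss, the weighting parameter `alpha`, and the names `huber` and `distillation_loss` are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' objective.

```python
from typing import Sequence


def huber(pred: Sequence[float], target: Sequence[float], delta: float = 1.0) -> float:
    """Mean Huber loss between two equally sized, non-empty vectors."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err            # quadratic region
        else:
            total += delta * (err - 0.5 * delta)  # linear region
    return total / len(pred)


def distillation_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    solved: Sequence[float],
    alpha: float = 0.5,
    delta: float = 1.0,
) -> float:
    """Hypothetical distillation objective for value regression.

    student: counterfactual values predicted by the small network
    teacher: counterfactual values predicted by the large network
    solved:  target values from the training data
    alpha:   assumed weight on the teacher-matching term (0 to 1)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    soft = huber(student, teacher, delta)  # imitate the teacher
    hard = huber(student, solved, delta)   # fit the solved targets
    return alpha * soft + (1.0 - alpha) * hard
```

With `alpha = 1.0` the student only imitates the teacher, and with `alpha = 0.0` it trains as an ordinary value network; intermediate values trade off the two signals.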