Measuring Structural Similarities in Finite MDPs.

Hao Wang, Shaokang Dong, Ling Shao
DOI: https://doi.org/10.24963/ijcai.2019/511
2019-01-01
Abstract: In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures for state similarity and action similarity in a mutual reinforcement manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the proposed measures are effective.
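To make the graph view in the abstract concrete, the following is a minimal sketch of one way a finite MDP could be laid out as a directed bipartite graph with two node types. The function name `mdp_to_bipartite`, the dictionary layout of `transitions`, the edge naming, and the toy MDP are all illustrative assumptions, not the authors' construction. The similarity measures themselves are not reproduced, since the abstract does not define them.

```python
def mdp_to_bipartite(transitions):
    """Build a two-type directed graph from a finite MDP.

    `transitions` (assumed layout) maps (state, action) to a list of
    (next_state, probability, reward) triples.
    """
    state_nodes = set()
    action_nodes = set()      # one node per (state, action) pair
    decision_edges = []       # state node -> action node
    transition_edges = []     # action node -> state node, with weights

    for (s, a), outcomes in transitions.items():
        state_nodes.add(s)
        action_nodes.add((s, a))
        decision_edges.append((s, (s, a)))
        for s_next, prob, reward in outcomes:
            state_nodes.add(s_next)
            transition_edges.append(((s, a), s_next, prob, reward))

    return state_nodes, action_nodes, decision_edges, transition_edges


# Hypothetical toy MDP with two states and three state-action pairs.
toy_mdp = {
    ("s0", "go"): [("s1", 0.8, 1.0), ("s0", 0.2, 0.0)],
    ("s0", "stay"): [("s0", 1.0, 0.0)],
    ("s1", "stay"): [("s1", 1.0, 0.0)],
}

states, actions, decisions, moves = mdp_to_bipartite(toy_mdp)
print(len(states), len(actions), len(decisions), len(moves))
```

Every edge in this layout runs between a state node and an action node, never between two nodes of the same type, which is what makes the graph bipartite. Under this assumed layout, a mutually reinforcing definition would compare two states through their outgoing action nodes and two actions through their successor states.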