Measuring Structural Similarities in Finite MDPs

Hao Wang, Shaokang Dong, Ling Shao

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 3684-3690. https://doi.org/10.24963/ijcai.2019/511

In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures for state similarity and action similarity in a mutual reinforcement manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the proposed measures are effective.
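The bipartite-graph view mentioned above can be illustrated with a minimal sketch. It is an assumption about one reasonable encoding, not the authors' construction: state nodes point to action nodes (one per available state-action pair), and each action node points to its possible successor states, with the edge carrying the transition probability and reward. The function name `mdp_to_bipartite_graph` and the input layout are hypothetical, and the similarity measures themselves are not implemented here.

```python
def mdp_to_bipartite_graph(transitions):
    """Encode a finite MDP as a directed bipartite graph (illustrative only).

    transitions: dict mapping (state, action_label) to a list of
                 (next_state, probability, reward) tuples.
    Returns two edge maps:
      state_edges:  state node -> list of action nodes
      action_edges: action node -> {next_state: (probability, reward)}
    """
    state_edges = {}
    action_edges = {}
    for (state, label), outcomes in transitions.items():
        total = sum(prob for _, prob, _ in outcomes)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {(state, label)} sum to {total}")
        # Assumed encoding: one action node per (state, action_label) pair.
        action_node = (state, label)
        state_edges.setdefault(state, []).append(action_node)
        action_edges[action_node] = {
            nxt: (prob, reward) for nxt, prob, reward in outcomes
        }
        # Successor states are state nodes too, even with no outgoing actions.
        for nxt, _, _ in outcomes:
            state_edges.setdefault(nxt, [])
    return state_edges, action_edges


# A toy three-state MDP, invented purely for illustration.
toy_transitions = {
    ("s0", "a"): [("s1", 0.5, 1.0), ("s2", 0.5, 0.0)],
    ("s0", "b"): [("s2", 1.0, 0.0)],
    ("s1", "a"): [("s2", 1.0, 2.0)],
}
state_edges, action_edges = mdp_to_bipartite_graph(toy_transitions)
```

In this encoding every edge goes either from a state node to an action node or from an action node to a state node, which is what makes the graph bipartite; `s2` has no outgoing edges and acts as an absorbing state.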
Keywords:
Machine Learning: Reinforcement Learning
Machine Learning: Transfer, Adaptation, Multi-task Learning
Planning and Scheduling: Markov Decision Processes
Planning and Scheduling: Planning under Uncertainty