Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions

Aijun Bai, Stuart Russell

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1418-1424. https://doi.org/10.24963/ijcai.2017/196

In the context of hierarchical reinforcement learning, the idea of hierarchies of abstract machines (HAMs) is to write a partial policy as a set of hierarchical finite state machines with unspecified choice states, and to use reinforcement learning to learn an optimal completion of this partial policy. Given a HAM with a potentially deep hierarchical structure, there often exist many internal transitions, where a machine calls another machine while the environment state remains unchanged. In this paper, we propose a new hierarchical reinforcement learning algorithm that discovers such internal transitions automatically and short-circuits them recursively in the computation of Q values. The resulting HAMQ-INT algorithm significantly outperforms the state of the art on the benchmark Taxi domain and on a much more complex RoboCup Keepaway domain.
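
To make the short-circuiting idea concrete, below is a minimal, illustrative sketch (not the authors' implementation) of SMDP Q-learning over a HAM in which internal transitions, i.e., machine transitions during which the environment state does not change, are cached and followed recursively before choices are selected and backed up. All names here (`HAMQInt`, `effective_choice_point`, `record_internal`, and the state/choice-point encoding) are hypothetical and chosen only for exposition.

```python
import random
from collections import defaultdict


class HAMQInt:
    """Sketch of SMDP Q-learning over a HAM with a cache of internal
    transitions that are short-circuited when evaluating choices."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)   # Q[(env_state, choice_point, choice)]
        self.internal = {}            # (env_state, choice_point) -> next choice_point
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def effective_choice_point(self, env_state, choice_point):
        # Follow cached internal transitions recursively; this is valid
        # because the environment state is unchanged along such transitions.
        seen = set()
        while (env_state, choice_point) in self.internal and choice_point not in seen:
            seen.add(choice_point)
            choice_point = self.internal[(env_state, choice_point)]
        return choice_point

    def select(self, env_state, choice_point, choices):
        # Epsilon-greedy choice at the (short-circuited) choice point.
        cp = self.effective_choice_point(env_state, choice_point)
        if random.random() < self.epsilon:
            return cp, random.choice(choices)
        return cp, max(choices, key=lambda c: self.q[(env_state, cp, c)])

    def update(self, env_state, cp, choice, reward, tau,
               next_env_state, next_cp, next_choices):
        # Standard SMDP Q-learning backup between consecutive choice points,
        # where tau is the number of primitive steps elapsed in between.
        target = reward + (self.gamma ** tau) * max(
            (self.q[(next_env_state, next_cp, c)] for c in next_choices),
            default=0.0)
        key = (env_state, cp, choice)
        self.q[key] += self.alpha * (target - self.q[key])

    def record_internal(self, env_state, cp_from, cp_to):
        # Called when a machine transition is observed with no change to the
        # environment state; later look-ups short-circuit it.
        self.internal[(env_state, cp_from)] = cp_to
```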
Keywords:
Machine Learning: Reinforcement Learning
Robotics and Vision: Multi-Robot Systems
Robotics and Vision: Robotics
Planning and Scheduling: Markov Decision Processes