Improving Reinforcement Learning with Confidence-Based Demonstrations

Zhaodong Wang, Matthew E. Taylor

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 3027-3033. https://doi.org/10.24963/ijcai.2017/422

Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent's performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent's learning algorithm or representation. The target agent then estimates the source agent's policy and improves upon it. The key contribution of this work is to show that leveraging the target agent's uncertainty in the source agent's policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.
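
The idea of estimating the source agent's policy from demonstrations and acting on it only when the estimate is trustworthy can be illustrated with a minimal sketch. It is not the method from the paper: the count-based policy estimate, the vote-share confidence measure, the discretized states, the function names, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def fit_source_policy(demonstrations):
    """Count how often the source agent chose each action in each state.

    `demonstrations` is an iterable of (state, action) pairs; states are
    assumed to be hashable (e.g. already discretized).
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return counts


def predict_with_confidence(counts, state):
    """Return (most frequent demonstrated action, its vote share).

    Returns (None, 0.0) for states that never appear in the demonstrations.
    """
    votes = counts.get(state)
    if not votes:
        return None, 0.0
    action, n = votes.most_common(1)[0]
    return action, n / sum(votes.values())


def choose_action(counts, state, own_action, threshold=0.8):
    """Follow the estimated source action only when confidence is high.

    Otherwise fall back to `own_action`, the action proposed by the target
    agent's own learner (hypothetical; any RL algorithm could supply it).
    """
    action, confidence = predict_with_confidence(counts, state)
    if action is not None and confidence >= threshold:
        return action
    return own_action


# Hypothetical demonstrations: consistent in "s0", ambiguous in "s1".
demos = [("s0", "left")] * 9 + [("s0", "right"), ("s1", "left"), ("s1", "right")]
counts = fit_source_policy(demos)
```

With these hypothetical demonstrations, the target agent would follow the source in `"s0"`, where the estimate is consistent, and rely on its own choice in `"s1"` and in unseen states. Because the target agent keeps learning from its own experience in low-confidence states, it is not limited to imitating the source.
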
Keywords:
Machine Learning: Reinforcement Learning
Multidisciplinary Topics and Applications: Human-Computer Interaction