Value Refinement Network (VRN)

Jan Wöhlke, Felix Schmitt, Herke van Hoof

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 3558-3565. https://doi.org/10.24963/ijcai.2022/494

In robotic tasks, we encounter the complementary strengths of (1) reinforcement learning (RL), which can handle high-dimensional observations as well as unknown, complex dynamics, and (2) planning, which can handle sparse and delayed rewards given a dynamics model. Combining these strengths of RL and planning, we propose the Value Refinement Network (VRN) in this work. Our VRN is an RL-trained neural network architecture that learns to locally refine an initial (value-based) plan in a simplified (2D) problem abstraction based on detailed local sensory observations. We evaluate the VRN on simulated robotic (navigation) tasks and demonstrate that it can successfully refine sub-optimal plans to match the performance of more costly planning in the non-simplified problem. Furthermore, in a dynamic environment, the VRN still enables high task completion without global re-planning.
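To make the setting concrete, the sketch below shows how an initial value-based plan in a simplified 2D grid abstraction can be computed with plain value iteration; this is the kind of coarse plan that a VRN-style network would then refine locally from sensory observations. The grid layout, reward values, and discount factor are illustrative assumptions, not the paper's exact formulation, and the refinement network itself is not shown.

```python
import numpy as np

def value_iteration_2d(occupancy, goal, gamma=0.99, iters=200):
    """Compute a coarse value map on a 2D grid abstraction.

    occupancy: 2D array, nonzero entries are obstacles.
    goal: (row, col) index of the goal cell.
    Returns a value map V; greedy ascent on V yields a path to the goal.
    All numeric choices here are illustrative assumptions.
    """
    H, W = occupancy.shape
    V = np.zeros((H, W))
    reward = np.full((H, W), -0.01)   # small per-step cost (assumed)
    reward[goal] = 1.0                # sparse goal reward (assumed)
    for _ in range(iters):
        # Best 4-connected neighbor value; -inf padding keeps moves in-grid.
        padded = np.pad(V, 1, constant_values=-np.inf)
        best = np.max(np.stack([
            padded[:-2, 1:-1], padded[2:, 1:-1],   # up, down
            padded[1:-1, :-2], padded[1:-1, 2:],   # left, right
        ]), axis=0)
        V_new = reward + gamma * best
        V_new[occupancy > 0] = -1.0   # obstacles are unattractive
        V_new[goal] = reward[goal]    # goal is terminal
        V = V_new
    return V

# Usage: empty 5x5 grid, goal in the bottom-right corner.
V = value_iteration_2d(np.zeros((5, 5), dtype=int), goal=(4, 4))
```

Acting greedily with respect to such a value map solves the simplified abstraction; the VRN's contribution, as described in the abstract, is to refine this coarse plan locally rather than re-plan globally when local observations (e.g. obstacles missing from the abstraction) contradict it.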
Keywords:
Machine Learning: Deep Reinforcement Learning
Machine Learning: Reinforcement Learning
Planning and Scheduling: Learning in Planning and Scheduling