How to Learn Domain-Invariant Representations for Visual Reinforcement Learning: An Information-Theoretical Perspective

Shuo Wang, Zhihao Wu, Jinwen Wang, Xiaobo Hu, Youfang Lin, Kai Lv

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 1389-1397. https://doi.org/10.24963/ijcai.2024/154

Despite impressive success on visual control tasks, Visual Reinforcement Learning (VRL) policies struggle to generalize to unseen scenarios. Existing works attempt to improve generalization empirically, without theoretical support. In this work, we explore how to learn domain-invariant representations for VRL from an information-theoretic perspective. Specifically, we identify three Mutual Information (MI) terms, which highlight that a robust representation should preserve domain-invariant information (return and dynamic transition) under significant observation perturbation. We then relax these MI terms to derive three components of a practical Mutual Information-based Invariant Representation (MIIR) algorithm for VRL. Extensive experiments demonstrate that MIIR achieves state-of-the-art generalization performance and the best sample efficiency on the DeepMind Control Suite, Robotic Manipulation, and CARLA.
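The abstract does not spell out the paper's exact objectives, so the following is only a minimal PyTorch sketch of one standard MI surrogate that this style of method commonly builds on: an InfoNCE lower bound on the mutual information between two perturbed views of the same observation, which pushes the encoder to keep view-invariant (domain-invariant) content. The encoder architecture, augmentation, and hyperparameters below are illustrative assumptions, not the authors' MIIR implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy convolutional encoder mapping 84x84 RGB frames to a latent vector."""
    def __init__(self, latent_dim=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.LazyLinear(latent_dim)  # infers the flattened input size

    def forward(self, x):
        return self.fc(self.conv(x))

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE lower bound on I(z_a; z_b): matching rows are positive pairs,
    all other rows in the batch serve as negatives."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature      # (B, B) similarity matrix
    labels = torch.arange(z_a.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Usage: maximizing MI between two perturbed views of the same observation
# encourages the representation to discard view-specific (domain) information.
encoder = Encoder()
obs = torch.rand(16, 3, 84, 84)                    # a batch of observations
view_a = obs + 0.05 * torch.randn_like(obs)        # stand-in for strong augmentations
view_b = obs + 0.05 * torch.randn_like(obs)
loss = info_nce(encoder(view_a), encoder(view_b))
loss.backward()
```

A full method in this vein would pair such an invariance term with objectives that preserve return and dynamics information in the latent (e.g., reward- and transition-prediction heads), matching the abstract's description of what a robust representation should retain.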
Keywords:
Computer Vision: CV: Embodied vision: Active agents, simulation