Efficient Offline Meta-Reinforcement Learning via Robust Task Representations and Adaptive Policy Generation

Zhengwei Li, Zhenyang Lin, Yurou Chen, Zhiyong Liu

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 4524-4532. https://doi.org/10.24963/ijcai.2024/500

Zero-shot adaptation is crucial for agents facing new tasks. Offline Meta-Reinforcement Learning (OMRL), which trains policies from offline multi-task datasets, offers a way to attain this ability. Although most OMRL methods construct task representations via contrastive learning and concatenate them with states as policy input, these methods have inherent problems. Specifically, feeding task representations to the policy only as an additional input limits learning efficiency, because it fails to leverage the similarities among tasks. Moreover, uniformly sampling an equal number of negative samples from different tasks in contrastive learning can hinder the differentiation of more similar tasks, potentially diminishing the robustness of the task representations. In this paper, we introduce an OMRL algorithm that tackles these issues. We design a network structure that learns efficiently by exploiting task similarity: shared lower layers extract features common to all tasks, while a hypernetwork-driven upper layer processes those features according to each task's attributes. Furthermore, to obtain robust task representations for generating task-specific control policies, we employ contrastive learning with a novel method for constructing negative sample pairs based on task similarity. Experimental results show that our method notably improves learning efficiency and zero-shot adaptation to new tasks, surpassing previous methods across multiple challenging domains.
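
A minimal sketch of the policy architecture described above, assuming a standard PyTorch setup: shared lower layers extract task-agnostic features from the state, and a hypernetwork maps the task representation z to the weights of the task-specific top layer. Layer sizes and the single generated layer are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn


class HyperPolicy(nn.Module):
    """Policy with shared lower layers and a hypernetwork-generated top layer."""

    def __init__(self, state_dim, action_dim, z_dim, hidden_dim=256):
        super().__init__()
        # Shared lower layers: feature extraction common to all tasks.
        self.shared = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Hypernetwork: generates the weights and bias of the task-specific
        # top layer from the task representation z.
        self.hyper_w = nn.Linear(z_dim, hidden_dim * action_dim)
        self.hyper_b = nn.Linear(z_dim, action_dim)
        self.hidden_dim, self.action_dim = hidden_dim, action_dim

    def forward(self, state, z):
        h = self.shared(state)                                           # (B, H)
        W = self.hyper_w(z).view(-1, self.action_dim, self.hidden_dim)   # (B, A, H)
        b = self.hyper_b(z)                                              # (B, A)
        # Per-sample linear layer whose parameters depend on the task.
        return torch.einsum("bah,bh->ba", W, h) + b                      # (B, A)
```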
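For the contrastive component, one plausible way to realize similarity-based negative sampling (an assumption for illustration, not the paper's exact procedure) is to allocate negatives across tasks in proportion to their similarity to the anchor task, so more similar tasks contribute more hard negatives. The `similarity` matrix and per-task `buffers` below are hypothetical inputs.

```python
import numpy as np


def sample_negatives(anchor_task, task_ids, similarity, buffers, n_neg, rng=None):
    """Draw n_neg negative samples, weighting tasks by similarity to the anchor.

    similarity[i][j]: precomputed similarity between tasks i and j.
    buffers[j]: list of transitions collected from task j.
    """
    rng = rng or np.random.default_rng()
    others = [t for t in task_ids if t != anchor_task]
    sims = np.array([similarity[anchor_task][t] for t in others], dtype=float)
    probs = sims / sims.sum()                      # more similar -> more negatives
    counts = rng.multinomial(n_neg, probs)
    negatives = []
    for task, count in zip(others, counts):
        idx = rng.choice(len(buffers[task]), size=count, replace=True)
        negatives.extend(buffers[task][i] for i in idx)
    return negatives
```
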
Keywords:
Machine Learning: ML: Reinforcement learning
Machine Learning: ML: Meta-learning
Machine Learning: ML: Offline reinforcement learning
Robotics: ROB: Learning in robotics