Selective Learning for Sample-Efficient Training in Multi-Agent Sparse Reward Tasks (Extended Abstract)

Xinning Chen, Xuan Liu, Yanwen Ba, Shigeng Zhang, Bo Ding, Kenli Li

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Sister Conferences Best Papers. Pages 8384-8388. https://doi.org/10.24963/ijcai.2024/927

Learning effective strategies in sparse-reward tasks is one of the fundamental challenges in reinforcement learning. This becomes extremely difficult in multi-agent environments, as the concurrent learning of multiple agents induces non-stationarity and a sharply increased joint state space. Existing works have attempted to promote multi-agent cooperation through experience sharing. However, learning from a large collection of shared experiences is inefficient, since sparse-reward tasks contain only a few high-value states, and indiscriminate sharing can instead lead to the curse of dimensionality in large-scale multi-agent systems. This paper focuses on sparse-reward multi-agent cooperative tasks and proposes an effective experience-sharing method, Multi-Agent Selective Learning (MASL), which boosts sample-efficient training by reusing valuable experiences from other agents. MASL adopts a retrogression-based selection method to identify high-value traces of agents from team rewards, from which recall traces are generated and shared among agents to motivate effective exploration. Moreover, MASL selectively considers information from other agents to cope with non-stationarity while enabling efficient training with large numbers of agents. Experimental results show that MASL significantly improves sample efficiency compared with state-of-the-art MARL algorithms in cooperative tasks with sparse rewards.
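
The abstract describes MASL's retrogression-based selection only at a high level. The following is a minimal, hypothetical Python sketch of the general idea: filtering episode traces by team return and sharing the suffix that leads into a rewarding state as a "recall trace". All names (Transition, Trace, select_high_value_traces, recall_trace) and the thresholding rule are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of MASL-style trace selection and recall-trace sharing.
# Names and the return-threshold rule are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Transition:
    state: tuple
    joint_action: tuple
    team_reward: float


@dataclass
class Trace:
    transitions: List[Transition] = field(default_factory=list)

    @property
    def team_return(self) -> float:
        # Undiscounted sum of team rewards over the episode.
        return sum(t.team_reward for t in self.transitions)


def select_high_value_traces(traces: List[Trace], threshold: float) -> List[Trace]:
    """Keep only traces whose team return exceeds a threshold
    (a stand-in for the paper's retrogression-based selection)."""
    return [tr for tr in traces if tr.team_return > threshold]


def recall_trace(trace: Trace, length: int) -> List[Transition]:
    """Take the suffix leading into the rewarding state and reverse it,
    so agents can learn the path toward the high-value state backwards."""
    suffix = trace.transitions[-length:]
    return list(reversed(suffix))


# Usage: share recall traces of selected episodes with all agents.
episodes: List[Trace] = []  # filled during environment interaction
shared_buffer: List[Transition] = []
for tr in select_high_value_traces(episodes, threshold=0.0):
    shared_buffer.extend(recall_trace(tr, length=10))
```

In sparse-reward settings this kind of filtering keeps the shared buffer small and focused on the few high-value states, which is the sample-efficiency argument the abstract makes against indiscriminate experience sharing.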
Keywords:
Machine Learning: ML: Multiagent Reinforcement Learning
Agent-based and Multi-agent Systems: MAS: Coordination and cooperation