Enhanced DouDiZhu Card Game Strategy Using Oracle Guiding and Adaptive Deep Monte Carlo Method

Qian Luo, Tien Ping Tan, Daochen Zha, Tianqiao Zhang

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5972-5980. https://doi.org/10.24963/ijcai.2024/660

Deep Reinforcement Learning (DRL) has achieved significant advances in games with both perfect and imperfect information, such as Go, Chess, Texas Hold'em, and Dota2. However, DRL encounters considerable challenges when tackling the card game DouDiZhu because of its imperfect information, large state-action space, and sparse rewards. This paper presents OADMCDou, which combines Oracle Guiding and an Adaptive Deep Monte Carlo Method to address these challenges. Oracle Guiding trains an Oracle agent with both imperfect and perfect information, then gradually reduces its reliance on perfect information so that it transitions to a standard agent. Adaptive Deep Monte Carlo uses gradient weight clipping and constrains the magnitude of updates to prevent extreme policy updates. We conduct extensive experiments to evaluate the effectiveness of the proposed methods, demonstrating OADMCDou's superior performance over the state-of-the-art DouDiZhu AI, DouZero. This superiority over DouZero is reflected in two metrics: a 95% confidence interval of 0.104 ± 0.041 for performance, and a 28.6% reduction in loss.
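The two ideas named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how reliance on oracle information is reduced or how updates are bounded. The sketch assumes a linear decay schedule, random masking of perfect-information features, and a simple clamp on the step size. All function names and parameters (`keep_prob_schedule`, `oracle_mask`, `clipped_update`, `max_delta`) are hypothetical.

```python
import random


def keep_prob_schedule(step, total_steps):
    """Assumed linear decay: 1.0 (full oracle) down to 0.0 (standard agent)."""
    return max(0.0, 1.0 - step / total_steps)


def oracle_mask(perfect_features, keep_prob, rng):
    """Keep each perfect-information feature with probability keep_prob,
    otherwise replace it with 0.0 (hypothetical masking scheme)."""
    return [x if rng.random() < keep_prob else 0.0 for x in perfect_features]


def clipped_update(old_value, target, lr, max_delta):
    """Move a value estimate toward a Monte Carlo return, but clamp the
    step to [-max_delta, max_delta] to avoid extreme updates.
    This is a stand-in for the paper's adaptive constraint, not its formula."""
    delta = lr * (target - old_value)
    delta = max(-max_delta, min(max_delta, delta))
    return old_value + delta


if __name__ == "__main__":
    rng = random.Random(0)
    hidden_cards = [1.0, 1.0, 1.0, 1.0]  # toy stand-in for opponents' hands
    for step in (0, 50, 100):
        p = keep_prob_schedule(step, total_steps=100)
        print(step, p, oracle_mask(hidden_cards, p, rng))
    print(clipped_update(old_value=0.0, target=10.0, lr=0.5, max_delta=1.0))
```

Under these assumptions the agent sees all hidden features at step 0 and none by the final step, while any single value update is limited to `max_delta` regardless of how far the sampled return lies from the current estimate.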
Keywords:
Multidisciplinary Topics and Applications: MTA: Entertainment
Multidisciplinary Topics and Applications: MTA: Computer games
Multidisciplinary Topics and Applications: MTA: Game playing
Agent-based and Multi-agent Systems: MAS: Other