DeltaDou: Expert-level Doudizhu AI through Self-play

Qiqi Jiang, Kuangzheng Li, Boyao Du, Hao Chen, Hai Fang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 1265-1271. https://doi.org/10.24963/ijcai.2019/176

Artificial intelligence has seen several breakthroughs in two-player perfect-information games. Nevertheless, Doudizhu, a three-player imperfect-information game, is still quite challenging. In this paper, we present a Doudizhu AI built by applying deep reinforcement learning to games of self-play. The algorithm combines three components: an asymmetric Monte Carlo tree search (MCTS) over the information-set nodes of each player, a policy-value network that approximates the policy and value at each decision node, and inference of the other players' unobserved hands under a given policy. Our results show that self-play can significantly improve the performance of our agent in this multi-agent imperfect-information game. Even when starting from a weak AI, our agent reaches human expert level after days of self-play and training.
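
The third component, inference of unobserved hands under a given policy, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a uniform prior over a small list of candidate hands and a policy exposed as a probability function, and it simplifies by scoring each observed action against the full candidate hand. The names `posterior_over_hands`, `policy_prob`, and `toy_policy_prob` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Hashable, List, Sequence


def posterior_over_hands(
    candidates: Sequence[Hashable],
    observed_actions: Sequence[Hashable],
    policy_prob: Callable[[Hashable, Hashable], float],
) -> List[float]:
    """Weight each candidate hidden hand by how likely the given policy
    would have produced the observed actions, then normalize.

    Assumes a uniform prior over `candidates` (an illustrative assumption).
    """
    weights = []
    for hand in candidates:
        likelihood = 1.0
        for action in observed_actions:
            # Probability that the policy plays `action` when holding `hand`.
            likelihood *= policy_prob(hand, action)
        weights.append(likelihood)

    total = sum(weights)
    if total == 0.0:
        # No candidate explains the observations: fall back to the prior.
        return [1.0 / len(candidates)] * len(candidates)
    return [w / total for w in weights]


def toy_policy_prob(hand: str, action: str) -> float:
    """Hypothetical policy: a strong hand plays a bomb more often."""
    table = {
        ("strong", "bomb"): 0.8,
        ("strong", "pass"): 0.2,
        ("weak", "bomb"): 0.2,
        ("weak", "pass"): 0.8,
    }
    return table[(hand, action)]


if __name__ == "__main__":
    posterior = posterior_over_hands(["strong", "weak"], ["bomb"], toy_policy_prob)
    print(posterior)
```

In this toy setting, observing a bomb shifts probability mass toward the strong hand. A search procedure could then sample hidden hands from such a posterior instead of sampling them uniformly.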
Keywords:
Heuristic Search and Game Playing: Game Playing and Machine Learning
Machine Learning: Reinforcement Learning
Uncertainty in AI: Approximate Probabilistic Inference