Real-Time Solving of Quantified CSPs Based on Monte-Carlo Game Tree Search

Baba Satomi, Yongjoon Joe, Atsushi Iwasaki, Makoto Yokoo

We develop a real-time algorithm based on a Monte-Carlo game tree search for solving a quantified constraint satisfaction problem (QCSP), which is a CSP where some variables are universally quantified. A universally quantified variable represents a choice of nature or an adversary. The goal of a QCSP is to make a robust plan against an adversary. However, obtaining a complete plan off-line is intractable when the size of the problem becomes large. Thus, we need to develop a real-time algorithm that sequentially selects a promising value at each deadline. Such a problem has been considered in the field of game tree search. In a standard game tree search algorithm, developing a good static evaluation function is crucial. However, developing a good static evaluation function for a QCSP is very difficult since it must estimate the possibility that a partially assigned QCSP is solvable. Thus, we apply a Monte-Carlo game tree search technique called UCT. However, the simple application of the UCT algorithm does not work since the player and the adversary are asymmetric, i.e., finding a game sequence where the player wins is very rare. We overcome this difficulty by introducing constraint propagation techniques. We experimentally compare the winning probability of our UCT-based algorithm and the state-of-the-art alpha-beta search algorithm. Our results show that our algorithm outperforms the state-of-the-art algorithm in large-scale problems.