Online Planning for Ad Hoc Autonomous Agent Teams
Feng Wu, Shlomo Zilberstein, Xiaoping Chen
We propose a novel online planning algorithm for ad hoc team settings — challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.