Online Planning for Ad Hoc Autonomous Agent Teams
Feng Wu, Shlomo Zilberstein and Xiaoping Chen
We propose a novel online planning algorithm for ad hoc team settings: challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach constructs and solves a series of stage games, then uses biased adaptive play to choose actions. The utility function of each stage game is estimated by Monte Carlo tree search using the UCT algorithm. We analytically establish the convergence of the algorithm and show that it performs well in a variety of ad hoc team problems.
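To make the UCT component concrete, the following is a minimal sketch of the UCB1 selection rule that UCT applies at each node of the search tree. It is an illustration under stated assumptions, not the authors' implementation: the function name `ucb1_select`, the list-based statistics, and the exploration constant `c` are all hypothetical choices made for this example.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[a] -- number of times action a has been tried at this node
    values[a] -- mean return observed so far for action a
    c         -- exploration constant (assumed value, not from the paper)
    """
    # Try every untried action once before applying the formula,
    # which would otherwise divide by zero.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    total = sum(counts)
    # UCB1 score: mean return plus an exploration bonus that shrinks
    # as the action is sampled more often.
    scores = [
        values[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the action with the higher mean; a rarely tried action can still win through its larger exploration bonus, which is how the search balances exploration against exploitation when estimating utilities.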