Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion
Matthijs Spaan, Frans Oliehoek and Christopher Amato
Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process (Dec-POMDP). We advance the state of the art in its optimal solution, building on the Multiagent A* (MAA*) heuristic search method. A key insight is that we can avoid the full expansion of a search node, which generates a number of children doubly exponential in the node's depth. Instead, we incrementally expand the children of a node only when a next child might have the highest heuristic value. We target a subsequent bottleneck by introducing a more memory-efficient representation for our heuristic functions. We prove that the resulting algorithm is correct, and experiments demonstrate a significant speedup over the state of the art, allowing for optimal solutions over longer horizons for many benchmark problems.
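The incremental-expansion idea can be illustrated with a small, self-contained sketch (not the paper's implementation): a best-first search keeps a priority queue in which a popped node yields only its single best unexpanded child and is then re-inserted, keyed by an upper bound on its remaining children. The helper names (`incremental_search`, `children_sorted`, `value`, `is_goal`) are illustrative assumptions; in MAA* the nodes would be partial joint policies and `value` an admissible heuristic.

```python
import heapq
import itertools

def incremental_search(root, children_sorted, value, is_goal):
    """Best-first search with incremental node expansion (sketch).

    Rather than expanding all children of a popped node at once, we
    generate only its best unexpanded child and re-insert the parent,
    keyed by an upper bound on its *next* unexpanded child.
    `children_sorted` must yield a node's children in non-increasing
    heuristic-value order; `value` must be admissible (an upper bound
    on achievable value, as in MAA*).
    """
    counter = itertools.count()  # tie-breaker so the heap never compares nodes
    # Heap entries: (negated priority, tie-breaker, node, child iterator or None)
    heap = [(-value(root), next(counter), root, None)]
    while heap:
        neg_v, _, node, pending = heapq.heappop(heap)
        if pending is None:                   # freshly generated node
            if is_goal(node):
                return node, -neg_v           # popped with the highest bound: optimal
            pending = iter(children_sorted(node))
        child = next(pending, None)
        if child is None:
            continue                          # node is now fully expanded
        # Push the new child as a fresh node.
        heapq.heappush(heap, (-value(child), next(counter), child, None))
        # Re-insert the parent: because children arrive in sorted order, the
        # value of the child just generated bounds all remaining children.
        heapq.heappush(heap, (-value(child), next(counter), node, pending))
    return None, float("-inf")
```

On a toy tree with admissible node values, the search returns the best leaf while never materializing all children of a node at once:

```python
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
vals = {"root": 10, "a": 9, "b": 7, "a1": 8, "a2": 5, "b1": 6}
goal, v = incremental_search("root",
                             lambda n: tree.get(n, []),
                             lambda n: vals[n],
                             lambda n: n not in tree)
# goal == "a1", v == 8
```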