A POMDP Approach to Token-Based Team Coordination

Yang Xu, Paul Scerri, Bin Yu, Michael Lewis, Katia Sycara
2005-01-01
Abstract: Efficient coordination among large numbers of heterogeneous agents promises to revolutionize the way in which some complex tasks, such as responding to urban disasters, can be performed. Token-based approaches have been shown to be a novel and promising way to achieve such coordination. However, previous token-based algorithms were built on heuristics and did not explicitly consider utilities related to token movements or changes in team states. In this paper we put forward an algorithm that uses team rewards to improve token routing decisions. The ideal solution of this token movement model is a centralized Markov Decision Process (MDP) with joint activity. Unfortunately, the assumptions underlying this model are not feasible for large team coordination, and we have to make several approximations. First, we decentralize the centralized MDP into a set of standard MDPs with independent individual activities. Each of these MDPs is then approximated by a Partially Observable Markov Decision Process (POMDP), because agents in a large team may not know the exact states of their teammates or that of the environment. A logical team organization is imposed to restrict token passing to an agent and its neighbors. Belief states of the POMDP model are efficiently estimated using a Monte Carlo sampling process.
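The abstract does not spell out how the belief states are sampled. As a rough illustration only, the sketch below shows one common way to estimate a POMDP belief by Monte Carlo sampling (a basic particle filter). The toy model is an assumption and does not come from the paper: the hidden state is which of three hypothetical neighbors ("a", "b", "c") currently needs a token, and the names `update_belief`, `transition`, and `obs_likelihood`, along with the probabilities 0.9, 0.8, and 0.1, are made up for the example.

```python
import random
from collections import Counter

# Hypothetical hidden states: which neighbor currently needs the token.
STATES = ["a", "b", "c"]

def transition(state, rng):
    # Assumed dynamics: the need usually stays put, occasionally re-draws uniformly.
    return state if rng.random() < 0.9 else rng.choice(STATES)

def obs_likelihood(observation, state):
    # Assumed sensor model: a hint names the right neighbor 80% of the time.
    return 0.8 if observation == state else 0.1

def update_belief(particles, observation, rng):
    """One Monte Carlo belief update: propagate, weight, resample."""
    # Propagate every sampled state through the stochastic transition model.
    moved = [transition(s, rng) for s in particles]
    # Weight each particle by how well it explains the observation.
    weights = [obs_likelihood(observation, s) for s in moved]
    if sum(weights) == 0:
        # No particle explains the observation; keep the propagated set.
        return moved
    # Resample with replacement in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))

def belief(particles):
    """Approximate belief: fraction of particles in each state."""
    counts = Counter(particles)
    return {s: counts[s] / len(particles) for s in STATES}

rng = random.Random(0)
# Start from a uniform prior represented by 1000 samples.
particles = [rng.choice(STATES) for _ in range(1000)]
# The agent receives three noisy hints that neighbor "b" needs the token.
for hint in ["b", "b", "b"]:
    particles = update_belief(particles, hint, rng)
estimate = belief(particles)
```

After the three hints, `estimate` concentrates most of its mass on "b". An agent could then use such a belief, rather than the unknown true state, when deciding which neighbor to pass a token to. The number of particles trades estimation accuracy against computation per update.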