Results 1  10
of
63
Exploiting locality of interaction in factored DecPOMDPs
 In Proc. Int. Joint Conf. Autonomous Agents and Multi Agent Systems
, 2008
"... Decentralized partially observable Markov decision processes (DecPOMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents ..."
Abstract

Cited by 46 (20 self)
 Add to MetaCart
(Show Context)
Decentralized partially observable Markov decision processes (DecPOMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents in a factored representation. Factored DecPOMDP representations have been proposed before, but only for DecPOMDPs whose transition and observation models are fully independent. Such strong assumptions simplify the planning problem, but result in models with limited applicability. By contrast, we consider general factored DecPOMDPs for which we analyze the model dependencies over space (locality of interaction) and time (horizon of the problem). We also present a formulation of decomposable value functions. Together, our results allow us to exploit the problem structure as well as heuristics in a single framework that is based on collaborative graphical Bayesian games (CGBGs). A preliminary experiment shows a speedup of two orders of magnitude.
Interactiondriven Markov games for decentralized multiagent planning under uncertainty
 in Proc. AAMAS
, 2008
"... In this paper we propose interactiondriven Markov games (IDMGs), a new model for multiagent decision making under uncertainty. IDMGs aim at describing multiagent decision problems in which interaction among agents is a local phenomenon. To this purpose, we explicitly distinguish between situations ..."
Abstract

Cited by 34 (10 self)
 Add to MetaCart
(Show Context)
In this paper we propose interactiondriven Markov games (IDMGs), a new model for multiagent decision making under uncertainty. IDMGs aim at describing multiagent decision problems in which interaction among agents is a local phenomenon. To this purpose, we explicitly distinguish between situations in which agents should interact and situations in which they can afford to act independently. The agents are coupled through the joint rewards and joint transitions in the states in which they interact. The model combines several fundamental properties from transitionindependent DecMDPs and weakly coupled MDPs while allowing to address, in several aspects, more general problems. We introduce a fast approximate solution method for planning in IDMGs, exploiting their particular structure, and we illustrate its successful application on several large multiagent tasks.
Incremental Policy Generation for FiniteHorizon DECPOMDPs
"... Solving multiagent planning problems modeled as DECPOMDPs is an important challenge. These models are often solved by using dynamic programming, but the high resource usage of current approaches results in limited scalability. To improve the efficiency of dynamic programming algorithms, we propose ..."
Abstract

Cited by 33 (21 self)
 Add to MetaCart
Solving multiagent planning problems modeled as DECPOMDPs is an important challenge. These models are often solved by using dynamic programming, but the high resource usage of current approaches results in limited scalability. To improve the efficiency of dynamic programming algorithms, we propose a new backup algorithm that is based on a reachability analysis of the state space. This method, which we call incremental policy generation, can be used to produce an optimal solution for any possible initial state or further scalability can be achieved by making use of a known start state. When incorporated into the optimal dynamic programming algorithm, our experiments show that planning horizon can be increased due to a marked reduction in resource consumption. This approach also fits nicely with approximate dynamic programming algorithms. To demonstrate this, we incorporate it into the stateoftheart PBIP algorithm and show significant performance gains. The results suggest that the performance of other dynamic programming algorithms for DECPOMDPs could be similarly improved by integrating the incremental policy generation approach.
Lossless clustering of histories in decentralized POMDPs
 In Proc. of the Eighth Int. Joint Conf. on Autonomous Agents and Multiagent Systems
, 2009
"... Decentralized partially observable Markov decision processes (DecPOMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories g ..."
Abstract

Cited by 26 (7 self)
 Add to MetaCart
(Show Context)
Decentralized partially observable Markov decision processes (DecPOMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the criterion, they have the same optimal value and thus can be treated as one. We show how this result can be exploited in optimal policy search and demonstrate empirically that it can provide a speedup of multiple orders of magnitude, allowing the optimal solution of significantly larger problems. We also perform an empirical analysis of the generality of our clustering method, which suggests that it may also be useful in other (approximate) DecPOMDP solution methods.
Pointbased backup for decentralized POMDPs: Complexity and new algorithms
 In AAMAS
, 2010
"... Decentralized POMDPs provide an expressive framework for sequential multiagent decision making. Despite their high complexity, there has been significant progress in scaling up existing algorithms, largely due to the use of pointbased methods. Performing pointbased backup is a fundamental operatio ..."
Abstract

Cited by 21 (1 self)
 Add to MetaCart
(Show Context)
Decentralized POMDPs provide an expressive framework for sequential multiagent decision making. Despite their high complexity, there has been significant progress in scaling up existing algorithms, largely due to the use of pointbased methods. Performing pointbased backup is a fundamental operation in stateoftheart algorithms. We show that even a single backup step in the multiagent setting is NPComplete. Despite this negative worstcase result, we present an efficient and scalable optimal algorithm as well as a principled approximation scheme. The optimal algorithm exploits recent advances in the weighted CSP literature to overcome the complexity of the backup operation. The polytime approximation scheme provides a constant factor approximation guarantee based on the number of belief points. In experiments on standard domains, the optimal approach provides significant speedup (up to 2 orders of magnitude) over the previous best optimal algorithm and is able to increase the number of belief points by more than a factor of 3. The approximation scheme also works well in practice, providing nearoptimal solutions to the backup problem.
Scaling Up Optimal Heuristic Search in DecPOMDPs via Incremental Expansion
"... Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process. We advance the state of the art for optimal solution of this model, building on the Multiagent A * heuristic search method. A key insight is that we can avoid the full ..."
Abstract

Cited by 21 (14 self)
 Add to MetaCart
(Show Context)
Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process. We advance the state of the art for optimal solution of this model, building on the Multiagent A * heuristic search method. A key insight is that we can avoid the full expansion of a search node that generates a number of children that is doubly exponential in the node’s depth. Instead, we incrementally expand the children only when a next child might have the highest heuristic value. We target a subsequent bottleneck by introducing a more memoryefficient representation for our heuristic functions. Proof is given that the resulting algorithm is correct and experiments demonstrate a significant speedup over the state of the art, allowing for optimal solutions over longer horizons for many benchmark problems. 1
Incremental Clustering and Expansion for Faster Optimal Planning in Decentralized POMDPs
, 2013
"... This article presents the stateoftheart in optimal solution methods for decentralized partially observable Markov decision processes (DecPOMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A * (GMAA*) algorithm, which ..."
Abstract

Cited by 19 (12 self)
 Add to MetaCart
(Show Context)
This article presents the stateoftheart in optimal solution methods for decentralized partially observable Markov decision processes (DecPOMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A * (GMAA*) algorithm, which reduces the problem to a tree of oneshot collaborative Bayesian games (CBGs), we describe several advances that greatly expand the range of DecPOMDPs that can be solved optimally. First, we introduce lossless incremental clustering of the CBGs solved by GMAA*, which achieves exponential speedups without sacrificing optimality. Second, we introduce incremental expansion of nodes in the GMAA * search tree, which avoids the need to expand all children, the number of which is in the worst case doubly exponential in the node’s depth. This is particularly beneficial when little clustering is possible. In addition, we introduce new hybrid heuristic representations that are more compact and thereby enable the solution of larger DecPOMDPs. We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent. Finally, we present extensive empirical results demonstrating that GMAA*ICE, an algorithm that synthesizes these advances, can optimally solve DecPOMDPs of unprecedented size.
Scalable Multiagent Planning Using Probabilistic Inference
"... Multiagent planning has seen much progress with the development of formal models such as DecPOMDPs. However, the complexity of these models—NEXPComplete even for two agents— has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a s ..."
Abstract

Cited by 18 (6 self)
 Add to MetaCart
Multiagent planning has seen much progress with the development of formal models such as DecPOMDPs. However, the complexity of these models—NEXPComplete even for two agents— has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a scalable approximation w.r.t. the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the ExpectationMaximization framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability. 1
MultiAgent Online Planning with Communication
"... We propose an online algorithm for planning under uncertainty in multiagent settings modeled as DECPOMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenge is to produce coordinated behavior using little or no communication. When c ..."
Abstract

Cited by 16 (1 self)
 Add to MetaCart
(Show Context)
We propose an online algorithm for planning under uncertainty in multiagent settings modeled as DECPOMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.
Multiagent Planning under Uncertainty with Stochastic Communication Delays
 In Proc. of the 18th Int. Conf. on Automated Planning and Scheduling
, 2008
"... We consider the problem of cooperative multiagent planning under uncertainty, formalized as a decentralized partially observable Markov decision process (DecPOMDP). Unfortunately, in these models optimal planning is provably intractable. By communicating their local observations before they take ac ..."
Abstract

Cited by 16 (8 self)
 Add to MetaCart
We consider the problem of cooperative multiagent planning under uncertainty, formalized as a decentralized partially observable Markov decision process (DecPOMDP). Unfortunately, in these models optimal planning is provably intractable. By communicating their local observations before they take actions, agents synchronize their knowledge of the environment, and the planning problem reduces to a centralized POMDP. As such, relying on communication significantly reduces the complexity of planning. In the real world however, such communication might fail temporarily. We present a step towards more realistic communication models for DecPOMDPs by proposing a model that: (1) allows that communication might be delayed by one or more time steps, and (2) explicitly considers future probabilities of successful communication. For our model, we discuss how to efficiently compute an (approximate) value function and corresponding policies, and we demonstrate our theoretical results with encouraging experiments.