Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion
Cited by 23 (15 self)
Abstract
Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process. We advance the state of the art for optimal solution of this model, building on the Multiagent A* heuristic search method. A key insight is that we can avoid the full expansion of a search node that generates a number of children that is doubly exponential in the node’s depth. Instead, we incrementally expand the children only when a next child might have the highest heuristic value. We target a subsequent bottleneck by introducing a more memory-efficient representation for our heuristic functions. Proof is given that the resulting algorithm is correct, and experiments demonstrate a significant speedup over the state of the art, allowing for optimal solutions over longer horizons for many benchmark problems.
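The core trick described in this abstract can be sketched as a toy (this is an illustration of the idea, not the paper's GMAA* implementation): a parent node sits on the open list in place of its unexpanded children, keyed by the heuristic value of its next-best child, so children are generated one at a time.

```python
import heapq
import itertools

def lazy_best_first(root, value, children):
    """Best-first node generation with incremental expansion -- a toy
    sketch of the idea, not the paper's GMAA* implementation.

    `children[node]` is assumed to list children in decreasing
    heuristic `value`, so a parent can stand in for all of its
    unexpanded children: it sits on the open list keyed by the value
    of its next unseen child. When popped, it generates just that one
    child and is re-inserted keyed by the following child, instead of
    expanding all children at once.
    """
    tie = itertools.count()  # tie-breaker so the heap never compares nodes

    def push(heap, node, i):
        # Enqueue "the i-th child of `node`", keyed by that child's value.
        kids = children.get(node, [])
        if i < len(kids):
            heapq.heappush(heap, (-value[kids[i]], next(tie), node, i))

    generated = [root]
    heap = []
    push(heap, root, 0)
    while heap:
        _, _, node, i = heapq.heappop(heap)
        child = children[node][i]
        generated.append(child)
        push(heap, node, i + 1)   # parent's next-best child, if any
        push(heap, child, 0)      # the new child's own best child
    return generated
```

Because each child's heuristic value is no greater than its parent's, a parent keyed by its next child never blocks a genuinely better node, which is why the search order matches full expansion while generating far fewer nodes.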
Incremental Clustering and Expansion for Faster Optimal Planning in Decentralized POMDPs
, 2013
Cited by 18 (12 self)
Abstract
This article presents the state of the art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty. Building off the generalized multiagent A* (GMAA*) algorithm, which reduces the problem to a tree of one-shot collaborative Bayesian games (CBGs), we describe several advances that greatly expand the range of Dec-POMDPs that can be solved optimally. First, we introduce lossless incremental clustering of the CBGs solved by GMAA*, which achieves exponential speedups without sacrificing optimality. Second, we introduce incremental expansion of nodes in the GMAA* search tree, which avoids the need to expand all children, the number of which is in the worst case doubly exponential in the node’s depth. This is particularly beneficial when little clustering is possible. In addition, we introduce new hybrid heuristic representations that are more compact and thereby enable the solution of larger Dec-POMDPs. We provide theoretical guarantees that, when a suitable heuristic is used, both incremental clustering and incremental expansion yield algorithms that are both complete and search equivalent. Finally, we present extensive empirical results demonstrating that GMAA*-ICE, an algorithm that synthesizes these advances, can optimally solve Dec-POMDPs of unprecedented size.
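The lossless-clustering idea can be illustrated in isolation. A minimal sketch, assuming the usual criterion that two types (histories) of an agent may be merged when they induce the same conditional distribution over everything the agent cannot observe; the `joint_prob` structure and function name here are illustrative, not taken from the article:

```python
from collections import defaultdict

def cluster_types(joint_prob):
    """Sketch of lossless clustering of one agent's types -- an
    illustration of the equivalence criterion, not the article's code.

    `joint_prob[t][e]` is P(t, e): the joint probability of this
    agent's type `t` and everything else `e` (hidden state plus the
    other agents' types). Two types can be merged losslessly when they
    induce the same conditional distribution P(e | t), since the agent
    then faces an identical decision problem under either type.
    """
    signature_to_types = defaultdict(list)
    for t, row in joint_prob.items():
        total = sum(row.values())
        # Conditional distribution P(e | t), rounded to dodge float noise.
        sig = tuple(sorted((e, round(p / total, 12))
                           for e, p in row.items() if p > 0))
        signature_to_types[sig].append(t)
    return sorted(signature_to_types.values())
```

The clustering is "lossless" in the sense that the merged game has the same optimal value: types in one cluster are interchangeable, so the CBG shrinks without discarding any distinguishable situation.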
Bayesian Action-Graph Games
Cited by 9 (1 self)
Abstract
Games of incomplete information, or Bayesian games, are an important game-theoretic model and have many applications in economics. We propose Bayesian action-graph games (BAGGs), a novel graphical representation for Bayesian games. BAGGs can represent arbitrary Bayesian games, and furthermore can compactly express Bayesian games exhibiting commonly encountered types of structure, including symmetry, action- and type-specific utility independence, and probabilistic independence of type distributions. We provide an algorithm for computing expected utility in BAGGs, and discuss conditions under which the algorithm runs in polynomial time. Bayes-Nash equilibria of BAGGs can be computed by adapting existing algorithms for complete-information normal form games and leveraging our expected utility algorithm. We show both theoretically and empirically that our approaches improve significantly on the state of the art.
Exploiting structure in cooperative Bayesian games
 In UAI
, 2012
Cited by 7 (7 self)
Abstract
Cooperative Bayesian games (BGs) can model decision-making problems for teams of agents under imperfect information, but require space and computation time that is exponential in the number of agents. While agent independence has been used to mitigate these problems in perfect information settings, we propose a novel approach for BGs based on the observation that BGs additionally possess a different type of structure, which we call type independence. We propose a factor graph representation that captures both forms of independence and present a theoretical analysis showing that non-serial dynamic programming cannot effectively exploit type independence, while Max-Sum can. Experimental results demonstrate that our approach can tackle cooperative Bayesian games of unprecedented size.
Point-Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
Cited by 5 (0 self)
Abstract
We present four major results towards solving decentralized partially observable Markov decision problems (Dec-POMDPs), culminating in an algorithm that outperforms all existing algorithms on all but one standard infinite-horizon benchmark problem. (1) We give an integer program that solves collaborative Bayesian games (CBGs). The program is notable because its linear relaxation is very often integral. (2) We show that a Dec-POMDP with bounded belief can be converted to a POMDP (albeit with actions exponential in the number of beliefs). These actions correspond to strategies of a CBG. (3) We present a method to transform any Dec-POMDP into a Dec-POMDP with bounded beliefs (the number of beliefs is a free parameter) using optimal (not lossless) belief compression. (4) We show that the combination of these results opens the door for new classes of Dec-POMDP algorithms based on previous POMDP algorithms. We choose one such algorithm, point-based value iteration, and modify it to produce the first tractable value iteration method for Dec-POMDPs that outperforms existing algorithms.
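For intuition about result (1), solving a CBG means finding, for each agent, a mapping from types to actions that maximizes expected team utility. The abstract's method is an integer program; the brute-force enumeration below is only a sketch of the problem being solved, and all names in it are made up.

```python
from itertools import product

def solve_cbg(types, actions, prob, util):
    """Naive brute-force solver for a 2-agent collaborative Bayesian
    game -- a sketch for illustration, not the paper's integer program.

    A joint policy assigns an action to each (agent, type) pair; its
    value is the expected utility over joint types:
        V(pi) = sum over (t1, t2) of  P(t1, t2) * u(t1, t2, pi1(t1), pi2(t2)).
    """
    t1s, t2s = types
    best_value, best_policy = float("-inf"), None
    # Enumerate every deterministic type-to-action mapping per agent.
    for pi1 in product(actions, repeat=len(t1s)):
        m1 = dict(zip(t1s, pi1))
        for pi2 in product(actions, repeat=len(t2s)):
            m2 = dict(zip(t2s, pi2))
            v = sum(prob[(t1, t2)] * util[(t1, t2, m1[t1], m2[t2])]
                    for t1 in t1s for t2 in t2s)
            if v > best_value:
                best_value, best_policy = v, (m1, m2)
    return best_value, best_policy
```

The enumeration is exponential in the number of types, which is exactly why a tighter formulation such as an integer program with an often-integral linear relaxation matters in practice.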
Tree-based solution methods for multiagent POMDPs with delayed communication
 In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence
, 2012
Cited by 2 (1 self)
Abstract
Multiagent Partially Observable Markov Decision Processes (MPOMDPs) provide a powerful framework for optimal decision making under the assumption of instantaneous communication. We focus on a delayed communication setting (MPOMDP-DC), in which broadcasted information is delayed by at most one time step. This model allows agents to act on their most recent (private) observation. Such an assumption is a strict generalization over having agents wait until the global information is available and is more appropriate for applications in which response time is critical. In this setting, however, value function backups are significantly more costly, and naive application of incremental pruning, the core of many state-of-the-art optimal POMDP techniques, is intractable. In this paper, we overcome this problem by demonstrating that computation of the MPOMDP-DC backup can be structured as a tree and by introducing two novel tree-based pruning techniques that exploit this structure in an effective way. We experimentally show that these methods have the potential to outperform naive incremental pruning by orders of magnitude, allowing for the solution of larger problems.
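Incremental pruning, mentioned above, repeatedly discards dominated alpha-vectors. The cheapest such test, pointwise dominance, can be sketched as follows (full pruning additionally needs a linear-program check, and the paper's contribution is the tree-structured organization of the backup, neither of which is shown here):

```python
def prune_pointwise_dominated(vectors):
    """Remove alpha-vectors (tuples of per-state values) that are
    pointwise dominated by some other vector in the set. This is only
    the basic building block of incremental pruning, sketched for
    illustration; it is not the paper's tree-based method.
    """
    kept = []
    for v in vectors:
        # v is dropped if some distinct w is at least as good in every state.
        dominated = any(w != v and all(w[i] >= v[i] for i in range(len(v)))
                        for w in vectors)
        if not dominated:
            kept.append(v)
    return kept
```

Pointwise dominance is sound because a dominated vector can never attain the maximum at any belief, so removing it leaves the value function unchanged.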
Computing Convex Coverage Sets for Faster Multi-Objective Coordination
Cited by 1 (1 self)
Abstract
In this article, we propose new algorithms for multi-objective coordination graphs (MO-CoGs). Key to the efficiency of these algorithms is that they compute a convex coverage set (CCS) instead of a Pareto coverage set (PCS). Not only is a CCS a sufficient solution set for a large class of problems, it also has important characteristics that facilitate more efficient solutions. We propose two main algorithms for computing a CCS in MO-CoGs. Convex multi-objective variable elimination (CMOVE) computes a CCS by performing a series of agent eliminations, which can be seen as solving a series of local multi-objective subproblems. Variable elimination linear support (VELS) iteratively identifies the single weight vector w that can lead to the maximal possible improvement on a partial CCS and calls variable elimination to solve a scalarized instance of the problem for w. VELS is faster than CMOVE for small and medium numbers of objectives and can compute an ε-approximate CCS in a fraction of the runtime. In addition, we propose variants of these methods that employ AND/OR tree search instead of variable elimination to achieve memory efficiency. We analyze the runtime and space complexities of these methods, prove their correctness, and compare them empirically against a naive baseline and an existing PCS method, both in terms of memory usage and runtime. Our results show that, by focusing on the CCS, these methods achieve much better scalability in the number of agents than the current state of the art.
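The CCS definition can be made concrete with a naive sketch: a value vector belongs to the CCS iff it maximizes the scalarized value for some weight. For two objectives, a sweep over a grid of weights yields an approximate CCS (this is neither CMOVE nor VELS, which operate on coordination graphs and use exact methods; the function and its parameters are illustrative only):

```python
def approx_ccs(vectors, steps=100):
    """Approximate convex coverage set for 2-objective value vectors
    via a naive weight sweep -- a sketch of the CCS concept, not the
    article's algorithms.

    A vector is in the CCS iff it maximizes w*v[0] + (1-w)*v[1] for
    some weight w in [0, 1]. Testing only a grid of weights means a
    vector optimal solely on a sliver between grid points can be
    missed, hence "approximate".
    """
    ccs = set()
    for i in range(steps + 1):
        w = i / steps
        # Keep whichever vector wins the scalarized problem at this weight.
        best = max(vectors, key=lambda v: w * v[0] + (1 - w) * v[1])
        ccs.add(best)
    return ccs
```

Note that a vector like (1, 1) below is excluded even though it is not pointwise dominated by any single coordinate pattern: no weight makes it optimal, which is exactly the sense in which a CCS can be smaller than a PCS.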
Error-bounded approximations for infinite-horizon discounted decentralized POMDPs
 In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases
, 2014
Cited by 1 (0 self)
Abstract
We address decentralized stochastic control problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). This formalism provides a general model for decision-making under uncertainty in cooperative, decentralized settings, but its worst-case complexity makes it difficult to solve optimally (NEXP-complete). Recent advances suggest recasting Dec-POMDPs into continuous-state and deterministic MDPs. In this form, however, states and actions are embedded into high-dimensional spaces, making accurate estimation of states and greedy selection of actions intractable for all but trivial-sized problems. The primary contribution of this paper is the first framework for error monitoring during approximate estimation of states and selection of actions. Such a framework permits us to convert state-of-the-art exact methods into error-bounded algorithms, which results in a scalability increase, as demonstrated by experiments over problems of unprecedented sizes.