Adopt: asynchronous distributed constraint optimization with quality guarantees
 ARTIFICIAL INTELLIGENCE LABORATORY, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
, 2005
Taking DCOP to the real world: efficient complete solutions for distributed multievent scheduling
 in AAMAS
, 2004
Cited by 128 (28 self)
Distributed Constraint Optimization (DCOP) is an elegant formalism relevant to many areas in multiagent systems, yet complete algorithms have not been pursued for real-world applications due to perceived complexity. To capably capture a rich class of complex problem domains, we introduce the Distributed MultiEvent Scheduling (DiMES) framework and design congruent DCOP formulations with binary constraints which are proven to yield the optimal solution. To approach real-world efficiency requirements, we obtain immense speedups by improving communication structure and precomputing best-case bounds. Heuristics for generating better communication structures and calculating bounds in a distributed manner are provided and tested on systematically developed domains for meeting scheduling and sensor networks, exemplifying the viability of complete algorithms.
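To make the idea of a DCOP with binary constraints concrete, here is a minimal Python sketch. It is not the DiMES formulation or any algorithm from the paper: it is a centralized brute-force search over a hypothetical three-agent toy problem, and the names `solve_dcop`, `domains`, and `constraints` are assumptions introduced only for illustration. It shows what "the optimal solution" means here: the joint assignment that maximizes the sum of pairwise rewards.

```python
from itertools import product


def solve_dcop(domains, constraints):
    """Exhaustively maximize the sum of binary constraint rewards.

    domains:     dict mapping each variable to its list of values
    constraints: dict mapping a pair (var_a, var_b) to a reward
                 function f(value_a, value_b)
    Centralized and exponential in the number of variables; for
    illustration only.
    """
    variables = sorted(domains)
    best_assignment, best_reward = None, float("-inf")
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        reward = sum(
            f(assignment[a], assignment[b])
            for (a, b), f in constraints.items()
        )
        if reward > best_reward:
            best_assignment, best_reward = assignment, reward
    return best_assignment, best_reward


# Hypothetical toy problem: each agent picks one of three time slots.
domains = {"a1": [0, 1, 2], "a2": [0, 1, 2], "a3": [0, 1, 2]}
constraints = {
    ("a1", "a2"): lambda x, y: 10 if x == y else 0,  # same meeting
    ("a2", "a3"): lambda x, y: 5 if x != y else 0,   # must not collide
    ("a1", "a3"): lambda x, y: x + y,                # prefer later slots
}

best, reward = solve_dcop(domains, constraints)
print(best, reward)
```

A complete distributed algorithm would reach the same optimum through message passing between agents instead of enumerating every joint assignment in one place.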
Networked Distributed POMDPs: A Synthesis of Distributed Constraint Optimization and POMDPs
, 2005
Cited by 91 (20 self)
In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent’s limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. This paper presents a new model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs). Exploiting network structure enables us to present two novel algorithms for ND-POMDPs: a distributed policy generation algorithm that performs local search and a systematic policy search that is guaranteed to reach the global optimum.
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
Cited by 53 (2 self)
In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a), which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and describe a payoff propagation algorithm that computes the individual actions that approximately maximize the global payoff function. The method can be viewed as the decision-making analogue of belief propagation in Bayesian networks. Second, we focus on learning the behavior of the agents in sequential decision-making tasks. We introduce different model-free reinforcement-learning techniques, collectively called Sparse Cooperative Q-learning, which approximate the global action-value function based on the topology of a coordination graph, and perform updates using the contribution of the individual agents to the maximal global action value. The combined use of an edge-based decomposition of the action-value function and the payoff propagation algorithm for efficient action selection results in an approach that scales only linearly in the problem size. We provide experimental evidence that our method outperforms related multiagent reinforcement-learning methods based on temporal differences.
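The payoff propagation idea can be sketched as max-plus message passing on a coordination graph with edge-based payoffs. The code below is a simplified illustration under stated assumptions, not the article's implementation: it uses synchronous updates, no message normalization, and a hypothetical three-agent chain, where the graph is a tree and the result is therefore exact. The function name `max_plus` and the payoff tables are invented for the example.

```python
def max_plus(n_actions, edge_payoffs, iterations=10):
    """Max-plus message passing for a sum of pairwise payoffs.

    n_actions:    dict agent -> number of actions
    edge_payoffs: dict (i, j) -> table with payoff[a_i][a_j]
    Returns a dict agent -> chosen action index.
    """
    agents = sorted(n_actions)
    neighbors = {i: set() for i in agents}
    for i, j in edge_payoffs:
        neighbors[i].add(j)
        neighbors[j].add(i)

    def payoff(i, j, ai, aj):
        # Look up the edge payoff whichever way the edge was stored.
        if (i, j) in edge_payoffs:
            return edge_payoffs[(i, j)][ai][aj]
        return edge_payoffs[(j, i)][aj][ai]

    # msgs[(i, j)][aj]: what agent i tells neighbor j about j's action aj.
    msgs = {(i, j): [0.0] * n_actions[j]
            for i in agents for j in neighbors[i]}

    for _ in range(iterations):
        new_msgs = {}
        for (i, j) in msgs:
            new_msgs[(i, j)] = [
                max(
                    payoff(i, j, ai, aj)
                    + sum(msgs[(k, i)][ai] for k in neighbors[i] if k != j)
                    for ai in range(n_actions[i])
                )
                for aj in range(n_actions[j])
            ]
        msgs = new_msgs  # synchronous update of all messages

    # Each agent picks the action with the largest sum of incoming messages.
    return {
        i: max(range(n_actions[i]),
               key=lambda a: sum(msgs[(k, i)][a] for k in neighbors[i]))
        for i in agents
    }


# Hypothetical chain 0 - 1 - 2 with two actions per agent.
n_actions = {0: 2, 1: 2, 2: 2}
edge_payoffs = {
    (0, 1): [[1, 0], [0, 4]],
    (1, 2): [[3, 0], [0, 2]],
}
joint_action = max_plus(n_actions, edge_payoffs)
print(joint_action)
```

On graphs with cycles the messages can grow without bound and the selected joint action is only approximate, which is why this sketch should be read as the tree-structured special case.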
Letting loose a SPIDER on a network of POMDPs: Generating quality guaranteed policies
 In AAMAS
, 2007
Cited by 36 (5 self)
Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are a popular approach for modeling multiagent systems acting in uncertain domains. Given the significant complexity of solving distributed POMDPs, particularly as we scale up the numbers of agents, one popular approach has focused on approximate solutions. Though this approach is efficient, the algorithms within this approach do not provide any guarantees on solution quality. A second, less popular approach focuses on global optimality, but typical results are available only for two agents, and also at considerable computational cost. This paper overcomes the limitations of both these approaches by providing SPIDER, a novel combination of three key features for policy generation in distributed POMDPs: (i) it exploits agent interaction structure given a network of agents (i.e., allowing easier scale-up to larger numbers of agents); (ii) it uses a combination of heuristics to speed up policy search; and (iii) it allows quality-guaranteed approximations, allowing a systematic tradeoff of solution quality for time. Experimental results show orders of magnitude improvement in performance when compared with previous globally optimal algorithms.
DCOPs Meet the Real World: Exploring Unknown Reward Matrices with Applications to Mobile Sensor Networks
Cited by 27 (5 self)
Buoyed by recent successes in the area of distributed constraint optimization problems (DCOPs), this paper addresses challenges faced when applying DCOPs to real-world domains. Three fundamental challenges must be addressed for a class of real-world domains, requiring novel DCOP algorithms. First, agents may not know the payoff matrix and must explore the environment to determine rewards associated with variable settings. Second, agents may need to maximize total accumulated reward rather than instantaneous final reward. Third, limited time horizons disallow exhaustive exploration of the environment. We propose and implement a set of novel algorithms that combine decision-theoretic exploration approaches with DCOP-mandated coordination. In addition to simulation results, we implement these algorithms on robots, deploying DCOPs on a distributed mobile sensor network.
Not all agents are equal: Scaling up distributed POMDPs for agent networks
 In: Proceedings of the seventh international
, 2008
Cited by 24 (4 self)
Many applications of networks of agents, including mobile sensor networks, unmanned air vehicles, and autonomous underwater vehicles, involve hundreds of agents acting collaboratively under uncertainty. Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are well-suited to address such applications, but so far, only limited scale-ups of up to five agents have been demonstrated. This paper escalates the scale-up, presenting an algorithm called FANS, increasing the number of agents in distributed POMDPs for the first time into double digits. FANS is founded on finite state machines (FSMs) for policy representation and exploits these FSMs to provide three key contributions: (i) Not all agents within an agent network need the same expressivity of policy representation; FANS introduces novel heuristics to automatically vary the FSM size in different agents for scale-up;
On k-optimal distributed constraint optimization algorithms: new bounds and algorithms
 In AAMAS ’08
, 2008
Cited by 16 (8 self)
Distributed constraint optimization (DCOP) is a promising approach to coordination, scheduling and task allocation in multiagent networks. In large-scale or low-bandwidth networks, finding the global optimum is often impractical. k-optimality is a promising new approach: for the first time it provides us a set of locally optimal algorithms with quality guarantees as a fraction of the global optimum. Unfortunately, previous work in k-optimality did not address domains where we may have prior knowledge of reward structure; and it failed to provide quality guarantees or algorithms for domains with hard constraints (such as agents' local resource constraints). This paper addresses these shortcomings with three key contributions. It provides: (i) improved lower bounds on k-optima quality incorporating available prior knowledge of reward structure; (ii) lower bounds on k-optima quality for problems with hard constraints; and (iii) k-optimal algorithms for solving DCOPs with hard constraints and detailed experimental results on large-scale networks.
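As a rough illustration of the k-optimality concept, the sketch below checks whether a joint assignment is k-optimal under the usual reading of the term: no group of k or fewer agents can strictly increase the total reward by changing their values together. This is an assumed definition and a brute-force checker written for this listing, not one of the paper's algorithms; `is_k_optimal`, `total_reward`, and the two-agent coordination game are hypothetical.

```python
from itertools import combinations, product


def total_reward(assignment, constraints):
    """Sum the binary reward tables over all constrained pairs."""
    return sum(
        table[(assignment[a], assignment[b])]
        for (a, b), table in constraints.items()
    )


def is_k_optimal(assignment, domains, constraints, k):
    """Return True if no group of at most k agents can jointly deviate
    and strictly improve the total reward (brute force)."""
    base = total_reward(assignment, constraints)
    agents = sorted(domains)
    for size in range(1, k + 1):
        for group in combinations(agents, size):
            for values in product(*(domains[a] for a in group)):
                candidate = dict(assignment)
                candidate.update(zip(group, values))
                if total_reward(candidate, constraints) > base:
                    return False
    return True


# Hypothetical coordination game: matching on 1 pays more than on 0.
domains = {"x": [0, 1], "y": [0, 1]}
constraints = {
    ("x", "y"): {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 5},
}

low = {"x": 0, "y": 0}
high = {"x": 1, "y": 1}
print(is_k_optimal(low, domains, constraints, 1))   # unilateral moves only
print(is_k_optimal(low, domains, constraints, 2))   # joint moves allowed
print(is_k_optimal(high, domains, constraints, 2))
```

The assignment `low` survives every single-agent deviation but not a coordinated two-agent one, which illustrates why a larger k gives a stronger local-optimality notion.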
Exploiting locality of interaction in networked distributed POMDPs
 In AAAI Spring Symposium on Distributed Planning and Scheduling
, 2006
Cited by 12 (0 self)
In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent’s limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. In previous work, we presented a model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs). Also, we presented LID-JESP (locally interacting distributed joint equilibrium-based search for policies), a distributed policy generation algorithm based on DBA (distributed breakout algorithm). In this paper, we present a stochastic variation of LID-JESP that is based on DSA (distributed stochastic algorithm) and allows neighboring agents to change their policies in the same cycle. Through detailed experiments, we show how this can result in speedups without a large difference in solution quality. We also introduce a technique called hyperlink-based decomposition that allows us to exploit locality of interaction further, resulting in faster run times for both LID-JESP and its stochastic variant without any loss in solution quality.
Conflicts in teamwork: Hybrids to the rescue
 In AAMAS ’05: Proceedings of the fourth international
, 2005
Cited by 11 (3 self)
Today within the AAMAS community, we see at least four competing approaches to building multiagent systems: belief-desire-intention (BDI), distributed constraint optimization (DCOP), distributed POMDPs, and auctions or game-theoretic approaches. While there is exciting progress within each approach, there is a lack of cross-cutting research. This paper highlights hybrid approaches for multiagent teamwork. In particular, for the past decade, the TEAMCORE research group has focused on building agent teams in complex, dynamic domains. While our early work was inspired by BDI, we will present an overview of recent research that uses DCOPs and distributed POMDPs in building agent teams. While DCOP and distributed POMDP algorithms provide promising results, hybrid approaches help us address problems of scalability and expressiveness. For example, in the BDI-POMDP hybrid approach, BDI team plans are exploited to improve POMDP tractability, and POMDPs improve BDI team plan performance. We present some recent results from applying this approach in a Disaster Rescue simulation domain being developed with help from the Los