Exploiting locality of interaction in factored Dec-POMDPs
- In Proc. Int. Joint Conf. Autonomous Agents and Multi Agent Systems, 2008
Cited by 8 (2 self)
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents in a factored representation. Factored Dec-POMDP representations have been proposed before, but only for Dec-POMDPs whose transition and observation models are fully independent. Such strong assumptions simplify the planning problem, but result in models with limited applicability. By contrast, we consider general factored Dec-POMDPs for which we analyze the model dependencies over space (locality of interaction) and time (horizon of the problem). We also present a formulation of decomposable value functions. Together, our results allow us to exploit the problem structure as well as heuristics in a single framework that is based on collaborative graphical Bayesian games (CGBGs). A preliminary experiment shows a speedup of two orders of magnitude.
Automated design of adaptive controllers for modular robots using reinforcement learning
- Accepted for publication in International Journal of Robotics Research, Special Issue on Self-Reconfigurable Modular Robots, 2007
Cited by 5 (4 self)
Designing distributed controllers for self-reconfiguring modular robots has been consistently challenging. We have developed a reinforcement learning approach which can be used both to automate controller design and to adapt robot behavior online. In this paper, we report on our study of reinforcement learning in the domain of self-reconfigurable modular robots: the underlying assumptions, the applicable algorithms, and the issues of partial observability, large search spaces and local optima. We propose and validate experimentally in simulation a number of techniques designed to address these and other scalability issues that arise in applying machine learning to distributed systems such as modular robots. We discuss ways to make learning faster, more robust and amenable to online application by giving scaffolding to the learning agents in the form of policy representation, structured experience and additional information. With enough structure, modular robots can run learning algorithms both to automate the generation of distributed controllers and to adapt to a changing environment, delivering on the self-organization promise with less interference from human designers, programmers and operators.
A Survey on Sensor Networks from a Multi-Agent perspective
Cited by 4 (0 self)
Sensor networks are emerging as one of the most promising technologies for the coming decades. The recent emergence of small and inexpensive sensors based on microelectromechanical systems (MEMS) eases the development and proliferation of this kind of network in a wide range of real-world applications. Multi-Agent systems (MAS) have been identified as one of the most suitable technologies to contribute to this domain due to their appropriateness for modeling autonomous self-aware sensors in a flexible way. Firstly, this survey summarizes the current challenges and research areas concerning sensor networks while identifying the most relevant MAS contributions. Secondly, we propose a taxonomy for sensor networks that classifies them depending on their features (and the research problems they pose). Finally, we identify some open research directions and opportunities for MAS research.
A distributed protocol for safe real-time planning of communicating vehicles with second-order dynamics
- In ROBOCOMM
Cited by 1 (1 self)
This work deals with the problem of planning, in real time, collision-free motions for multiple communicating vehicles that operate in the same partially observable environment. A challenging aspect of this problem is how to utilize communication so that vehicles do not reach states from which collisions cannot be avoided due to second-order motion constraints. This paper provides a distributed communication protocol for real-time planning that guarantees collision avoidance with obstacles and between vehicles. It can also preserve a communication network when the vehicles operate as a networked team. The algorithm is a novel integration of sampling-based motion planners with message-passing protocols for distributed constraint optimization. Each vehicle uses the motion planner to generate candidate feasible trajectories and the message-passing protocol to select a safe and compatible trajectory. The existence of such trajectories is guaranteed by the overall approach. Experiments on a distributed simulator built on a cluster of processors confirm the safety properties of the approach in applications such as coordinated exploration. Furthermore, the distributed protocol has better scalability properties when compared against typical priority-based schemes.
Efficient Distributed Reinforcement Learning Through Agreement
Cited by 1 (1 self)
Distributed robotic systems can benefit from automatic controller design and online adaptation by reinforcement learning (RL), but often suffer from the limitations of partial observability. In this paper, we address the twin problems of limited local experience and locally observed but not necessarily telling reward signals encountered in such systems. We combine direct search in policy space with an agreement algorithm to efficiently exchange local rewards and experience among agents. We demonstrate improved learning ability on the locomotion problem for self-reconfiguring modular robots in simulation, and show that a fully distributed implementation can learn good policies just as fast as the centralized implementation. Our results suggest that prior work on centralized RL algorithms for modular robots may be made effective in practice through the application of agreement algorithms. This approach could be fruitful in many cooperative situations, whenever robots need to learn similar behaviors but have access only to local information.
Leveraging Organizational Guidance Policies with Learning to Self-Tune Multiagent Systems
Cited by 1 (1 self)
As organization-based multiagent systems are applied to more complex problems, configuring and tuning the systems can become nearly as complex as the original problem a system was designed to solve. A robust system should be able to adapt. It should be able to self-configure and self-tune. To this end, we propose a method for self-tuning using the concept of guidance policies, that is, policies that are designed to guide the system without sacrificing its flexibility. Guidance policies allow us to apply traditional learning techniques online without many of the drawbacks associated with a system falling into a local optimum. They also help simplify the learning process. We examine the impact of this learning on various multiagent systems.
Using DCOPs to Balance Exploration and Exploitation in Time-Critical Domains
Cited by 1 (0 self)
Substantial work has investigated balancing exploration and exploitation, but relatively little has addressed this tradeoff in the context of coordinated multi-agent interactions. This paper introduces a class of problems in which agents must maximize their on-line reward, a decomposable function dependent on pairs of agents' decisions. Unlike previous work, agents must both learn the reward function and exploit it on-line, critical properties for a class of physically motivated systems, such as mobile wireless networks. This paper introduces algorithms motivated by the Distributed Constraint Optimization Problem framework and demonstrates when, and at what cost, increasing agents' coordination can improve the global reward on such problems.
Algorithms, Experimentation
We consider the setting of multiple collaborative agents trying to complete a set of tasks as assigned by a centralized controller. We propose a scalable method called "assignment-based decomposition", which is based on decomposing the problem of action selection into an upper assignment level and a lower task execution level. The assignment problem is solved by search, while the task execution is solved through coordinated reinforcement learning. We show that this decomposition of the overall problem into two levels scales well and outperforms state-of-the-art approaches, including pure assignment-level search and pure coordinated reinforcement learning. We also show how this approach enables transfer learning from domains with few agents to domains with many agents.

