Results 1–10 of 27
Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems, 2005
Cited by 182 (8 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Mobilized ad-hoc networks: A reinforcement learning approach
In International Conference on Autonomic Computing, 2004
Cited by 23 (0 self)
With the cost of wireless networking and computational power rapidly dropping, mobile ad-hoc networks will soon become an important part of our society’s computing structures. While there is a great deal of research from the networking community regarding the routing of information over such networks, most of these techniques lack automatic adaptivity. The size and complexity of these networks demand that we apply the principles of autonomic computing to this problem. Reinforcement learning methods can be used to control both packet routing decisions and node mobility, dramatically improving the connectivity of the network. We present two applications of reinforcement learning methods to the mobilized ad-hoc networking domain and demonstrate some promising empirical results under a variety of different scenarios in which the mobile nodes in our ad-hoc network are embedded with these adaptive routing policies and learned movement policies.
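The routing half of this idea can be illustrated with Boyan and Littman's classic Q-routing update, a standard RL formulation for packet routing. This is a hedged stand-in, not the paper's own algorithm; the learning rate and the toy network below are assumptions. `Q[x][(d, y)]` estimates the delivery time to destination `d` when node `x` forwards via neighbour `y`.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed)

def q_routing_update(Q, x, d, y, transit_time, queue_time, neighbours_of_y):
    """One Q-routing update after node x sends a packet bound for d via y.

    The target is the observed local cost (queueing + transit) plus y's own
    best estimate of the remaining delivery time.
    """
    best_from_y = min(Q[y][(d, z)] for z in neighbours_of_y) if neighbours_of_y else 0.0
    target = queue_time + transit_time + best_from_y
    Q[x][(d, y)] += ALPHA * (target - Q[x][(d, y)])
    return Q[x][(d, y)]

# Tiny example: node "a" forwards to "dst" via "b" and updates its estimate.
Q = defaultdict(lambda: defaultdict(float))
q_routing_update(Q, "a", "dst", "b", transit_time=1.0, queue_time=0.5,
                 neighbours_of_y=["dst"])
```

Each node keeps only its own table and learns from locally observable delays, which is what makes the scheme a natural fit for ad-hoc networks.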
Automatic shaping and decomposition of reward functions
, 2007
Cited by 20 (0 self)
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begin by describing a method that learns a shaped reward function given a set of state and temporal abstractions. Next, we consider decomposition of the per-timestep reward in multi-effector problems, in which the overall agent can be decomposed into multiple units that are concurrently carrying out various tasks. We show by example that to find a good reward decomposition, it is often necessary to first shape the rewards appropriately. We then give a function approximation algorithm for solving both problems together. Standard reinforcement learning algorithms can be augmented with our methods, and we show experimentally that in each case, significantly faster learning results.
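The shaping the abstract refers to can be sketched with potential-based reward shaping (Ng, Harada and Russell's formulation, on which shaped-reward methods typically build). The gridworld potential function below is an illustrative assumption, not this paper's learned shaping.

```python
GAMMA = 0.9  # discount factor (assumed)

def phi(state):
    """Potential function: negative Manhattan distance to a goal at (3, 3).
    The goal location is an assumption for this toy example."""
    x, y = state
    return -(abs(3 - x) + abs(3 - y))

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """r' = r + gamma * phi(s') - phi(s).
    Potential-based shaping provably preserves the set of optimal policies
    while densifying the learning signal."""
    return reward + gamma * phi(next_state) - phi(state)

# A step toward the goal earns a positive shaping bonus even if the
# environment reward is zero; a step away earns a penalty.
bonus = shaped_reward(0.0, (0, 0), (1, 0))
penalty = shaped_reward(0.0, (1, 0), (0, 0))
```

Automating the construction of `phi` from state and temporal abstractions is the part this paper addresses; the hand-written potential here just shows the mechanics.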
Decentralized Control of Partially Observable Markov Decision Processes
Cited by 14 (8 self)
Markov decision processes (MDPs) are often used to model sequential decision problems involving uncertainty under the assumption of centralized control. However, many large, distributed systems do not permit centralized control due to communication limitations (such as cost, latency or corruption). This paper surveys recent work on decentralized control of MDPs in which control of each agent depends on a partial view of the world. We focus on a general framework where there may be uncertainty about the state of the environment, represented as a decentralized partially observable MDP (Dec-POMDP), but consider a number of subclasses with different assumptions about uncertainty and agent independence. In these models, a shared objective function is used, but plans of action must be based on a partial view of the environment. We describe the frameworks, along with the complexity of optimal control and important properties. We also provide an overview of exact and approximate solution methods as well as relevant applications. This survey provides an introduction to what has become an active area of research on these models and their solutions.
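A minimal numeric illustration of the shared-objective, partial-view setting the survey describes. The two-agent payoffs and hidden "tiger" states below are assumptions, loosely modelled on the well-known Dec-Tiger benchmark rather than taken from the survey itself.

```python
def expected_joint_reward(belief, joint_action, R):
    """E[R(s, a)]: the team's shared immediate reward for a joint action a,
    averaged over a belief (probability distribution) on the hidden state."""
    return sum(p * R[(s, joint_action)] for s, p in belief.items())

# Assumed shared payoffs for two agents facing a hidden tiger.
R = {
    ("tiger-left",  ("listen", "listen")): -2.0,
    ("tiger-right", ("listen", "listen")): -2.0,
    ("tiger-left",  ("open-right", "open-right")): 10.0,
    ("tiger-right", ("open-right", "open-right")): -50.0,
}
belief = {"tiger-left": 0.5, "tiger-right": 0.5}

listen_value = expected_joint_reward(belief, ("listen", "listen"), R)          # -2.0
open_value = expected_joint_reward(belief, ("open-right", "open-right"), R)   # -20.0
```

Under a uniform belief, jointly listening (-2.0) beats jointly opening a door (-20.0): exactly the kind of information-versus-payoff trade-off that Dec-POMDP planners must reason about, with each agent seeing only its own observations.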
Automated design of adaptive controllers for modular robots using reinforcement learning
In International Journal of Robotics Research, Special Issue on Self-Reconfigurable Modular Robots, 2007
Cited by 10 (4 self)
Designing distributed controllers for self-reconfiguring modular robots has been consistently challenging. We have developed a reinforcement learning approach which can be used both to automate controller design and to adapt robot behavior online. In this paper, we report on our study of reinforcement learning in the domain of self-reconfigurable modular robots: the underlying assumptions, the applicable algorithms, and the issues of partial observability, large search spaces and local optima. We propose, and validate experimentally in simulation, a number of techniques designed to address these and other scalability issues that arise in applying machine learning to distributed systems such as modular robots. We discuss ways to make learning faster, more robust and amenable to online application by giving scaffolding to the learning agents in the form of policy representation, structured experience and additional information. With enough structure, modular robots can run learning algorithms to automate the generation of distributed controllers, adapt to a changing environment, and deliver on the promise of self-organization with less interference from human designers, programmers and operators.
On local rewards and scaling distributed reinforcement learning
In NIPS, 2006
Cited by 7 (0 self)
We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multiagent reinforcement learning, as a function of the number of agents n. We prove a worst-case lower bound showing that algorithms that rely solely on a global reward signal to learn policies confront a fundamental limit: they require a number of real-world examples that scales roughly linearly in the number of agents. For settings of interest with a very large number of agents, this is impractical. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). This makes them applicable even in settings with a very large number of agents n.
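The intuition behind the lower bound can be illustrated with a simple variance argument (a simplified assumption for illustration, not the paper's proof): if each of n agents contributes an independent local reward with unit variance, the global reward any one agent observes has variance roughly n, so the per-agent credit-assignment signal is progressively drowned out as n grows.

```python
import random
import statistics

random.seed(0)

def noisy_local_reward():
    """One agent's own reward: unit-variance noise around an assumed true value."""
    return 1.0 + random.gauss(0.0, 1.0)

def global_reward(n_agents):
    """The global signal is the sum of all agents' local rewards."""
    return sum(noisy_local_reward() for _ in range(n_agents))

n = 50
local_var = statistics.variance(noisy_local_reward() for _ in range(2000))
global_var = statistics.variance(global_reward(n) for _ in range(2000))
# An agent learning from the global signal must average away roughly n times
# more variance than one learning from its own local reward.
```

This is only a variance heuristic; the paper's contribution is a formal sample-complexity bound and an algorithm class achieving the O(log n) rate with local rewards.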
Bayesian Reinforcement Learning for Multiagent Systems with State Uncertainty
Cited by 5 (4 self)
Bayesian methods for reinforcement learning are promising because they allow model uncertainty to be considered explicitly and offer a principled way of dealing with the exploration/exploitation tradeoff. However, for multiagent systems there have been few such approaches, and none of them apply to problems with state uncertainty. In this paper we fill this gap by proposing two frameworks for Bayesian RL for multiagent systems with state uncertainty. This includes a multiagent POMDP model where a team of agents operates in a centralized fashion, but has uncertainty about the model of the environment. We also consider a best response model in which each agent also has uncertainty over the policies of the other agents. In each case, we seek to learn the appropriate models while acting in an online fashion. We transform the resulting problem into a planning problem and prove bounds on the solution quality in different situations. We demonstrate our methods using sample-based planning in several domains with varying levels of uncertainty about the model and the other agents' policies. Experimental results show that overall, the approach is able to significantly decrease uncertainty and increase value when compared to initial models and policies.
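The Bayesian handling of the exploration/exploitation tradeoff can be sketched with Thompson sampling on a Beta-Bernoulli bandit. This single-agent toy is an assumed stand-in (the paper works with multiagent POMDP models), but it shows the core mechanic: maintain a posterior over the unknown model and act greedily with respect to a sample drawn from it.

```python
import random

random.seed(1)

def thompson_step(posteriors, true_probs):
    """One Thompson-sampling step.

    posteriors: [alpha, beta] Beta parameters per action (model uncertainty).
    true_probs: hidden Bernoulli success rates, assumed for this demo.
    """
    samples = [random.betavariate(a, b) for a, b in posteriors]
    action = samples.index(max(samples))        # act on the sampled model
    reward = 1 if random.random() < true_probs[action] else 0
    posteriors[action][0] += reward             # posterior update: successes
    posteriors[action][1] += 1 - reward         # ... and failures
    return action

posteriors = [[1, 1], [1, 1]]                   # uniform priors over both arms
counts = [0, 0]
for _ in range(1000):
    counts[thompson_step(posteriors, [0.2, 0.8])] += 1
```

Early on the wide posteriors make both actions plausible (exploration); as evidence accumulates, the posterior concentrates and the better action dominates (exploitation), with no hand-tuned schedule.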
Multi-Agent Reinforcement Learning for Intrusion Detection
Cited by 5 (1 self)
Intrusion Detection has been investigated for many years and the field has matured. Nevertheless, there are still important challenges, e.g., how an IDS can detect new and complex distributed attacks. To tackle these problems, we propose a distributed Reinforcement Learning (RL) approach in a hierarchical architecture of network sensor agents. Each network sensor agent learns to interpret local state observations, and communicates them to a central agent higher up in the agent hierarchy. These central agents, in turn, learn to send signals up the hierarchy, based on the signals that they receive. Finally, the agent at the top of the hierarchy learns when to signal an intrusion alarm. We evaluate our approach in an abstract network domain.
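The hierarchy the abstract describes can be sketched as three layers of signalling functions. In the paper each mapping is learned with RL; the fixed thresholds and the two-symbol alphabet below are illustrative assumptions standing in for the learned policies.

```python
def sensor_signal(local_obs):
    """Network sensor agent: map a local anomaly score in [0, 1] to a
    discrete signal (threshold assumed; learned in the paper)."""
    return "suspicious" if local_obs > 0.7 else "normal"

def central_signal(child_signals):
    """Central agent: summarize the signals received from its sensors
    (the >= 2 voting rule is an assumption)."""
    n_suspicious = sum(s == "suspicious" for s in child_signals)
    return "escalate" if n_suspicious >= 2 else "quiet"

def top_agent(central_signals):
    """Top of the hierarchy: decide whether to raise an intrusion alarm."""
    return any(s == "escalate" for s in central_signals)

# Two suspicious sensors under one central agent trigger an alarm.
alarm = top_agent([central_signal([sensor_signal(0.9),
                                   sensor_signal(0.8),
                                   sensor_signal(0.1)])])
```

The appeal of the architecture is that each layer only sees compact signals from the layer below, so the top agent never has to process raw observations from every sensor.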
Reinforcement Learning for Vulnerability Assessment in Peer-to-Peer Networks
Cited by 5 (0 self)
Proactive assessment of computer-network vulnerability to unknown future attacks is an important but unsolved computer security problem where AI techniques have significant impact potential. In this paper, we investigate the use of reinforcement learning (RL) for proactive security in the context of denial-of-service (DoS) attacks in peer-to-peer (P2P) networks. Such a tool would be useful for network administrators and designers to assess and compare the vulnerability of various network configurations and security measures in order to optimize those choices for maximum security. We first discuss the various dimensions of the problem and how to formulate it as RL. Next we introduce compact parametric policy representations for both a single attacker and botnets and derive a policy-gradient RL algorithm. We evaluate these algorithms under a variety of network configurations that employ recent fair-use DoS security mechanisms. The results show that our RL-based approach is able to significantly outperform a number of heuristic strategies in terms of the severity of the attacks discovered. The results also suggest some possible network design lessons for reducing the attack potential of an intelligent attacker.
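A minimal sketch of the policy-gradient ingredient: REINFORCE with a softmax policy over attack targets. The three targets, their severity values, and the learning rate are assumptions for illustration, not the paper's parametrization or reward model.

```python
import math
import random

random.seed(0)

# Assumed attack severity obtained when flooding each of three targets.
rewards = {0: 0.1, 1: 1.0, 2: 0.3}

theta = [0.0, 0.0, 0.0]   # softmax policy parameters, one per target
alpha = 0.1               # learning rate (assumed)

def softmax(t):
    m = max(t)
    e = [math.exp(x - m) for x in t]
    z = sum(e)
    return [x / z for x in e]

for _ in range(2000):
    p = softmax(theta)
    a = random.choices(range(3), weights=p)[0]   # sample a target
    r = rewards[a]                               # observe attack severity
    # REINFORCE update: grad log pi(a) w.r.t. theta_i is 1{i==a} - p_i
    for i in range(3):
        theta[i] += alpha * r * ((1 if i == a else 0) - p[i])
```

Over training the policy concentrates on the most damaging target, which is the vulnerability-assessment signal the paper seeks: the learned attacker reveals where the network configuration is weakest.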
Decentralized Reinforcement Learning for the Online Optimization of Distributed Systems
Cited by 3 (0 self)
Distributed reinforcement learning is concerned with what action an agent should take, given its current state and the state of other agents, so as to minimize a system cost function (or maximize a global objective function). In this chapter, we give an overview of distributed reinforcement learning and describe how it can be used to build distributed systems that can ...