Using Collective Intelligence To Route Internet Traffic
In Advances in Neural Information Processing Systems, 1999
Cited by 59 (24 self)
A COllective INtelligence (COIN) is a set of interacting reinforcement learning (RL) algorithms designed in an automated fashion so that their collective behavior optimizes a global utility function. We summarize the theory of COINs, then present experiments using that theory to design COINs to control Internet traffic routing. These experiments indicate that COINs outperform all previously investigated RL-based shortest-path routing algorithms.
Collective Intelligence and Braess’ Paradox
In Proceedings of the Sixteenth National Conference on Artificial Intelligence, 2000
Cited by 55 (23 self)
We consider the use of multiagent systems to control network routing. Conventional approaches to this task are based on the Ideal Shortest Path routing Algorithm (ISPA), under which at each moment each agent in the network sends all of its traffic down the path that will incur the lowest cost to that traffic. We demonstrate in computer experiments that, due to the side effects of one agent's actions on another agent's traffic, use of ISPAs can result in large global cost. In particular, in a simulation of Braess' paradox we see that adding new capacity to a network with ISPA agents can decrease overall throughput. The theory of COllective INtelligence (COIN) design concerns precisely the issue of avoiding such side effects. We use that theory to derive an idealized routing algorithm and show that a practical machine-learning-based version of this algorithm, in which costs are only imprecisely estimated, substantially outperforms the ISPA, despite having access to less information than does the ISPA. In particular, this practical COIN algorithm avoids Braess' paradox.
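The paradox the abstract describes can be reproduced in a few lines. The sketch below is our illustration (not the paper's COIN algorithm): in the classic four-node Braess network, adding a zero-cost shortcut raises the equilibrium travel cost for every driver under selfish shortest-path routing. All numbers are the standard textbook values, assumed here for illustration.

```python
# Illustrative Braess' paradox computation: 4000 drivers travel A -> B.
# Variable edges (A->C, D->B) cost x/100, where x is that edge's traffic;
# fixed edges (A->D, C->B) cost a flat 45.

def braess_costs(n_drivers=4000):
    """Per-driver equilibrium cost before and after adding a
    zero-cost C->D shortcut."""
    # Before: drivers split evenly over A->C->B and A->D->B.
    half = n_drivers / 2
    cost_before = half / 100 + 45        # 2000/100 + 45 = 65

    # After: the free C->D link makes A->C->D->B dominant for every
    # selfish driver, so all traffic piles onto both variable edges.
    cost_after = n_drivers / 100 + 0 + n_drivers / 100   # 40 + 0 + 40 = 80
    return cost_before, cost_after

before, after = braess_costs()
print(before, after)  # 65.0 80.0 -- the added capacity hurt everyone
```

Every driver is worse off after the upgrade, which is exactly the side-effect problem the COIN reward-shaping approach is designed to avoid.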
Analysis of dynamic task allocation in multi-robot systems
The International Journal of Robotics Research, 2006
Cited by 46 (5 self)
Dynamic task allocation is an essential requirement for multi-robot systems operating in unknown dynamic environments. It allows robots to change their behavior in response to environmental changes or actions of other robots in order to improve overall system performance. Emergent coordination algorithms for task allocation that use only local sensing and no direct communication between robots are attractive because they are robust and scalable. However, a lack of formal analysis tools makes emergent coordination algorithms difficult to design. In this paper we present a mathematical model of a general dynamic task allocation mechanism. Robots using this mechanism have to choose between two types of task, and the goal is to achieve a desired task division in the absence of explicit communication and global knowledge. Robots estimate the state of the environment from repeated local observations and decide which task to choose based on these observations. We model the robots and observations as stochastic processes and study the dynamics of the collective behavior. Specifically, we analyze the effect that the number of observations and the choice of the decision function have on the performance of the system. The mathematical models are validated in a multi-robot multi-foraging scenario. The model's predictions agree very closely with experimental results from sensor-based simulations.
Collective intelligence for control of distributed dynamical systems
Europhysics Letters, 2000
Cited by 39 (11 self)
We consider the El Farol bar problem, also known as the minority game (W. B. Arthur, The American Economic Review, 84(2): 406–411 (1994), D. Challet and Y.-C. Zhang, Physica A, 256:514 (1998)). We view it as an instance of the general problem of how to configure the nodal elements of a distributed dynamical system so that they do not “work at cross purposes”, in that their collective dynamics avoids frustration and thereby achieves a provided global goal. We summarize a mathematical theory for such configuration applicable when (as in the bar problem) the global goal can be expressed as minimizing a global energy function and the nodes can be expressed as minimizers of local free energy functions. We show that a system designed with that theory performs nearly optimally for the bar problem.
General principles of learning-based multi-agent systems
In Proceedings of the Third International Conference on Autonomous Agents, 1999
Cited by 36 (6 self)
We consider the problem of how to design large decentralized multi-agent systems (MASs) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a reinforcement learning algorithm. This converts the problem into one of how to automatically set/update the reward functions for each of the agents so that the global goal is achieved. In particular we do not want the agents to “work at cross-purposes” as far as the global goal is concerned. We use the term artificial COllective INtelligence (COIN) to refer to systems that embody solutions to this problem. In this paper we present a summary of a mathematical framework for COINs. We then investigate the real-world applicability of the core concepts of that framework via two computer experiments: we show that our COINs perform near optimally in a difficult variant of Arthur’s bar problem [1] (and in particular avoid the tragedy of the commons for that problem), and we also illustrate optimal performance for our COINs in the leader-follower problem.
Learning Sequences of Actions in Collectives of Autonomous Agents
In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, 2002
Cited by 33 (20 self)
In this paper we focus on the problem of designing a collective of autonomous agents that individually learn sequences of actions such that the resultant sequence of joint actions achieves a predetermined global objective. We are particularly interested in instances of this problem where centralized control is either impossible or impractical. For single agent systems in similar domains, machine learning methods (e.g., reinforcement learners [18]) have been successfully used [1, 2, 3, 31]. However, applying such solutions directly to multiagent systems often proves problematic, as agents may work at cross-purposes, or have difficulty in evaluating their contribution to achievement of the global objective, or both. Accordingly, the crucial design step in multiagent systems centers on determining the private objectives of each agent so that as the agents strive for those objectives, the system reaches a good global solution. In this work we consider a version of this problem involving multiple autonomous agents in a grid world. We use concepts from collective intelligence [19, 27, 30] to design goals for the agents that are "aligned" with the global goal, and are "learnable" in that agents can readily see how their behavior affects their utility. We show that reinforcement learning agents using those goals outperform both "natural" extensions of single agent algorithms and global reinforcement learning solutions based on "team games".
Connections Between Cooperative Control and Potential Games Illustrated on the Consensus Problem
2007
Cited by 31 (12 self)
This paper presents a view of cooperative control using the language of learning in games. We review the game theoretic concepts of potential games and weakly acyclic games and demonstrate how the specific cooperative control problem of consensus can be formulated in these settings. Motivated by this connection, we build upon game theoretic concepts to better accommodate a broader class of cooperative control problems. In particular, we introduce sometimes weakly acyclic games for timevarying objective functions and action sets, and provide distributed algorithms for convergence to an equilibrium. Finally, we illustrate how to implement these algorithms for the consensus problem in a variety of settings, most notably, in an environment with nonconvex obstructions.
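The consensus problem the abstract refers to can be stated concretely with a minimal sketch (our illustration under standard assumptions, not the paper's game-theoretic algorithms): each agent repeatedly moves toward its neighbors' values, and for a small enough step size all values converge to the initial average.

```python
def consensus_step(values, neighbors, step=0.3):
    """One synchronous consensus update over an undirected graph:
    each agent nudges its value toward its neighbors' values."""
    return [
        v + step * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]

# Path graph on four agents: 0 - 1 - 2 - 3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [0.0, 1.0, 5.0, 10.0]
for _ in range(200):
    values = consensus_step(values, neighbors)
print(values)  # every entry approaches the initial average, 4.0
```

Because the update weights are symmetric, the sum of the values is conserved, so the common limit is the average of the initial conditions; step sizes below 1/(max degree) guarantee convergence on a connected graph.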
Ant Colony Optimization and its Application to Adaptive Routing in Telecommunication Networks
2004
Cited by 26 (13 self)
In ant societies, and, more in general, in insect societies, the activities of the individuals, as well as of the society as a whole, are not regulated by any explicit form of centralized control. On the other hand, adaptive and robust behaviors transcending the behavioral repertoire of the single individual can be easily observed at society level. These complex global behaviors are the result of self-organizing dynamics driven by local interactions and communications among a number of relatively simple individuals. The simultaneous presence of these and other fascinating and unique characteristics has made ant societies an attractive and inspiring model for building new algorithms and new multi-agent systems. In the last decade, ant societies have been taken as a reference for an ever-growing body of scientific work, mostly in the fields of robotics, operations research, and telecommunications. Among the different works inspired by ant colonies, the Ant Colony Optimization metaheuristic (ACO) is probably the most successful and popular one. The ACO metaheuristic is a multi-agent framework for combinatorial optimization whose main components are: a set of ant-like agents, the use of memory and of stochastic decisions, and strategies of collective and distributed learning. It finds its roots ...
Revisiting Log-Linear Learning: Asynchrony, Completeness and Payoff-Based Implementation
2008
Cited by 26 (10 self)
Log-linear learning is a learning algorithm with equilibrium selection properties. Log-linear learning provides guarantees on the percentage of time that the joint action profile will be at a potential maximizer in potential games. The traditional analysis of log-linear learning has centered around explicitly computing the stationary distribution. This analysis relied on a highly structured setting: i) players' utility functions constitute a potential game; ii) players update their strategies one at a time, which we refer to as asynchrony; iii) at any stage, a player can select any action in the action set, which we refer to as completeness; and iv) each player is endowed with the ability to assess the utility he would have received for any alternative action provided that the actions of all other players remain fixed. Since the appeal of log-linear learning is not solely the explicit form of the stationary distribution, we seek to address to what degree one can relax the structural assumptions while maintaining that only potential function maximizers are the stochastically stable action profiles. In this paper, we introduce slight variants of log-linear learning to include both synchronous updates and incomplete action sets. In both settings, we prove that only potential function maximizers are stochastically stable. Furthermore, we introduce a payoff-based version of log-linear learning, in which players are only aware of the utility they received and the action that they played. Note that log-linear learning in its original form is not a payoff-based learning algorithm. In payoff-based log-linear learning, we also prove that only potential maximizers are stochastically stable. The key enabler for these results is to change the focus of the analysis away from deriving the explicit form of the stationary distribution of the learning process towards characterizing the stochastically stable states. The resulting analysis uses the theory of resistance trees for regular perturbed Markov decision processes, thereby allowing a relaxation of the aforementioned structural assumptions.
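The baseline algorithm being relaxed here is easy to sketch. The code below is our hedged illustration of asynchronous log-linear learning (not the paper's synchronous/payoff-based variants) on an assumed 2x2 coordination game whose potential maximizer is the profile (1, 1): one revising player at a time chooses action a with probability proportional to exp(utility(a)/tau).

```python
import math
import random

# Coordination (potential) game: matching on action 1 pays 2, matching
# on action 0 pays 1, mismatches pay 0.  Potential maximizer: (1, 1).
def utility(player, profile):
    a, b = profile
    if a == b:
        return 2.0 if a == 1 else 1.0
    return 0.0

def boltzmann(utils, tau):
    """Log-linear choice probabilities over the action set."""
    weights = [math.exp(u / tau) for u in utils]
    z = sum(weights)
    return [w / z for w in weights]

def log_linear_run(steps=20000, tau=0.3, seed=0):
    """Fraction of steps the joint profile spends at (1, 1)."""
    rng = random.Random(seed)
    profile = [0, 0]
    at_max = 0
    for _ in range(steps):
        i = rng.randrange(2)                    # asynchrony: one revising player
        utils = []
        for a in (0, 1):
            trial = list(profile)
            trial[i] = a                        # others' actions held fixed
            utils.append(utility(i, trial))
        p1 = boltzmann(utils, tau)[1]           # completeness: full action set
        profile[i] = 1 if rng.random() < p1 else 0
        at_max += profile == [1, 1]
    return at_max / steps

print(log_linear_run())  # most time is spent at the potential maximizer
```

Even though the run starts at the inferior equilibrium (0, 0), low-temperature log-linear learning spends the bulk of its time at the potential maximizer, which is the equilibrium-selection property the abstract describes.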
Information theory: the bridge connecting bounded rational game theory and statistical physics
Statistical Physics, 2004
Cited by 22 (10 self)
A long-running difficulty with conventional game theory has been how to modify it to accommodate the bounded rationality of all real-world players. A recurring issue in statistical physics is how best to approximate joint probability distributions with decoupled (and therefore far more tractable) distributions. This paper shows that the same information-theoretic mathematical structure, known as Product Distribution (PD) theory, addresses both issues. In this, PD theory not only provides a principled formulation of bounded rationality and a set of new types of mean field theory in statistical physics; it also shows that those topics are fundamentally one and the same.