Results 1–10 of 15
Revisiting Log-Linear Learning: Asynchrony, Completeness and Payoff-Based Implementation
, 2008
Abstract
Cited by 25 (9 self)
Log-linear learning is a learning algorithm with equilibrium selection properties: in potential games, it provides guarantees on the percentage of time that the joint action profile spends at a potential maximizer. The traditional analysis of log-linear learning has centered on explicitly computing the stationary distribution. This analysis relied on a highly structured setting: (i) players' utility functions constitute a potential game; (ii) players update their strategies one at a time, which we refer to as asynchrony; (iii) at any stage, a player can select any action in the action set, which we refer to as completeness; and (iv) each player is endowed with the ability to assess the utility he would have received for any alternative action, provided that the actions of all other players remain fixed. Since the appeal of log-linear learning is not solely the explicit form of the stationary distribution, we ask to what degree these structural assumptions can be relaxed while maintaining that only potential function maximizers are the stochastically stable action profiles. In this paper, we introduce slight variants of log-linear learning that accommodate both synchronous updates and incomplete action sets, and in both settings we prove that only potential function maximizers are stochastically stable. Furthermore, we introduce a payoff-based version of log-linear learning, in which players are only aware of the utility they received and the action that they played; note that log-linear learning in its original form is not a payoff-based learning algorithm. For payoff-based log-linear learning, we also prove that only potential maximizers are stochastically stable. The key enabler for these results is to shift the focus of the analysis away from deriving the explicit form of the stationary distribution of the learning process and towards characterizing the stochastically stable states. The resulting analysis uses the theory of resistance trees for regular perturbed Markov decision processes, thereby allowing a relaxation of the aforementioned structural assumptions.
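As a concrete illustration of the baseline dynamic being relaxed, asynchronous log-linear learning with complete action sets can be sketched as follows. This is a minimal sketch under the standard description of the algorithm; the function and variable names are illustrative, not from the paper.

```python
import math
import random

def log_linear_step(actions, utility, action_sets, temperature=0.1):
    """One step of asynchronous log-linear learning: a single player,
    chosen uniformly at random, revises its action with probabilities
    proportional to exp(utility / temperature)."""
    i = random.randrange(len(actions))       # asynchrony: one updater per step
    weights = []
    for a in action_sets[i]:                 # completeness: full action set
        trial = list(actions)
        trial[i] = a
        weights.append(math.exp(utility(i, trial) / temperature))
    # sample the new action from the Boltzmann distribution over weights
    r, acc = random.uniform(0, sum(weights)), 0.0
    for a, w in zip(action_sets[i], weights):
        acc += w
        if r <= acc:
            actions[i] = a
            break
    return actions
```

At low temperature, the process concentrates on potential maximizers: in a two-player coordination game whose potential is maximized when both players pick the same high-payoff action, long-run play spends almost all of its time there.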
Achieving Pareto Optimality Through Distributed Learning
Abstract
Cited by 12 (4 self)
We propose a simple payoff-based learning rule that is completely decentralized and that leads to an efficient configuration of actions in any n-person finite strategic-form game with generic payoffs. The algorithm follows the theme of exploration versus exploitation and is hence stochastic in nature. We prove that if all agents adhere to this algorithm, then the agents will select the action profile that maximizes the sum of the agents' payoffs a high percentage of the time. The algorithm requires no communication: agents respond solely to changes in their own realized payoffs, which are affected by the actions of other agents in the system in ways that they do not necessarily understand. The method can be applied to the optimization of complex systems with many distributed components, such as the routing of information in networks and the design and control of wind farms. The proof of the proposed learning algorithm relies on the theory of large deviations for perturbed Markov chains.
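The full algorithm in the paper involves mood states and carefully tuned experimentation rates; the following heavily simplified sketch (all names are illustrative, not from the paper) conveys only the exploration-versus-exploitation theme of payoff-based learning: each agent usually repeats a benchmark action and occasionally experiments, keeping an experiment only if its own realized payoff improved.

```python
import random

class PayoffBasedAgent:
    """Simplified payoff-based learner: repeat a benchmark action, but with
    probability `epsilon` experiment with a uniformly random action and adopt
    it only if the realized payoff improved. (The published algorithm is more
    involved; this sketch only illustrates the theme.)"""
    def __init__(self, action_set, epsilon=0.05):
        self.action_set = list(action_set)
        self.epsilon = epsilon
        self.benchmark_action = random.choice(self.action_set)
        self.benchmark_payoff = float("-inf")   # first observation always adopted
        self.current = self.benchmark_action

    def act(self):
        if random.random() < self.epsilon:                   # explore
            self.current = random.choice(self.action_set)
        else:                                                # exploit
            self.current = self.benchmark_action
        return self.current

    def observe(self, payoff):
        # Agents see only their own realized payoff, never others' actions.
        if payoff > self.benchmark_payoff:
            self.benchmark_payoff = payoff
            self.benchmark_action = self.current
```

In a two-agent team game where both agents receive the sum-maximizing payoff only at one joint profile, simultaneous experiments eventually discover that profile, and both agents then lock onto it.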
Multiagent Learning in Large Anonymous Games
Abstract
Cited by 2 (0 self)
In large systems, it is important for agents to learn to act effectively, but sophisticated multiagent learning algorithms generally do not scale. An alternative approach is to find restricted classes of games where simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features that improve convergence are identified. First, rather than making learning more difficult, more agents are actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.
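Stage learning estimates best replies from repeated noisy play; with exact utilities, the underlying best-reply dynamic it tracks can be sketched as follows (names are illustrative):

```python
def best_reply_dynamics(actions, utility, action_sets, max_rounds=100):
    """Iterate best replies until no player can improve (a pure Nash
    equilibrium) or the round budget is exhausted. The paper's stage
    learning estimates these best replies from observations; here the
    utilities are given exactly."""
    for _ in range(max_rounds):
        improved = False
        for i, acts_i in enumerate(action_sets):
            # best response of player i holding everyone else fixed
            best = max(acts_i,
                       key=lambda a: utility(i, actions[:i] + [a] + actions[i + 1:]))
            if utility(i, actions[:i] + [best] + actions[i + 1:]) > utility(i, actions):
                actions[i] = best
                improved = True
        if not improved:
            return actions        # no profitable deviation: pure Nash equilibrium
    return actions
```

In a simple two-player coordination game, the dynamic converges in one pass: whichever player is mismatched switches to match the other.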
Joint channel and power allocation in tactical cognitive networks: Enhanced trial and errors
The Military Communications and Information Systems Conference (MCC), Saint-Malo
, 2013
Abstract
Cited by 1 (1 self)
In tactical networks, the presence of a central controller (e.g., a base station) is made impractical by the unpredictability of the nodes' positions and by the fact that its presence can be exploited by hostile entities. As a consequence, self-configuring networks are sought for military and emergency communication networks. In such networks, the transmission parameters, most notably the transmission channel and the power level, are set by the devices following specific behavioural rules. In this context, an algorithm for self-configuring wireless networks is presented, analysed and enhanced to meet the specific needs of tactical networks. The algorithm, based on the concept of trial and error, is tested under static and mobile conditions, and different metrics are considered to assess its performance. In particular, the stability and performance improvements with respect to previously proposed versions of the algorithm are detailed.
CENTER FOR THE STUDY OF RATIONALITY
, 2012
Abstract
We consider small-influence anonymous games with a large number of players n, where every player has two actions. For this class of games, we present a best-reply dynamic with the following two properties. First, the dynamic reaches approximate Nash equilibria fast (in at most cn log n steps for some constant c > 0). Second, approximate Nash equilibria are played by the dynamic with a limit frequency of at least 1 − e^(−c′n) for some constant c′ > 0.
Simple Adaptive Strategies: From Regret Matching
, 2012
Abstract
The general framework of this volume is that of game theory, with multiple participants ("players") who interact repeatedly over time. ...
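The regret-matching strategy at the heart of this volume can be sketched in the simplified "positive-regret-proportional" form common in the computational literature (class and variable names are illustrative): play each action with probability proportional to its positive cumulative regret, i.e. how much better the player would have done by always switching to that action.

```python
import random

class RegretMatcher:
    """Simplified regret matching: action probabilities proportional to
    positive cumulative regrets; uniform play when no regret is positive."""
    def __init__(self, n_actions):
        self.n = n_actions
        self.regret = [0.0] * n_actions

    def act(self):
        pos = [max(r, 0.0) for r in self.regret]
        total = sum(pos)
        if total <= 0.0:
            return random.randrange(self.n)     # no positive regret yet
        r, acc = random.uniform(0, total), 0.0
        for a, w in enumerate(pos):
            acc += w
            if r <= acc:
                return a
        return self.n - 1

    def update(self, played, payoffs):
        # payoffs[a] is the counterfactual payoff of action a this round
        for a in range(self.n):
            self.regret[a] += payoffs[a] - payoffs[played]
```

Against a fixed environment where one action strictly dominates, the regret for that action accumulates and play quickly locks onto it.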
Distributed Learning in Hierarchical Networks
Author manuscript, published in ValueTools 2012, Cargèse, France
, 2012
Abstract
In this article, we propose distributed-learning-based approaches to study the evolution of a decentralized hierarchical system, an illustration of which is the smart grid. Smart grid management requires the control of non-renewable energy production and the integration of renewable energies, which may be highly unpredictable: their production levels rely on uncontrollable factors such as sunshine, wind strength, etc. First, we derive optimal control strategies for non-renewable energy production and compare competitive learning algorithms that forecast the energy needs of the end users. Second, we introduce an online learning algorithm based on regret minimization that enables the agents to forecast the production of renewable energies. Additionally, we define organizations of the market that promote collaborative learning and generate higher performance for the whole smart grid than full competition.
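The abstract does not specify the regret-minimizing forecaster; a standard instance of the idea is the exponentially weighted average forecaster, sketched below. The function name, the choice of squared-error loss, and the learning rate are assumptions for illustration, not details from the paper.

```python
import math

def hedge_forecast(expert_predictions, outcomes, eta=0.5):
    """Exponentially weighted average forecaster: combine expert predictions
    with weights that decay exponentially in each expert's cumulative loss,
    a standard regret-minimizing scheme for sequential prediction."""
    n = len(expert_predictions[0])
    weights = [1.0] * n
    forecasts = []
    for preds, y in zip(expert_predictions, outcomes):
        total = sum(weights)
        forecasts.append(sum(w * p for w, p in zip(weights, preds)) / total)
        # multiplicative update: penalize each expert by its squared error
        weights = [w * math.exp(-eta * (p - y) ** 2)
                   for w, p in zip(weights, preds)]
    return forecasts
```

With one expert that always predicts the true production level and one that is always wrong, the combined forecast starts at their average and converges rapidly to the accurate expert's prediction.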
Learning in a Black Box
, 2013
Abstract
Many interactive environments can be represented as games, but they are so large and complex that individual players are in the dark about what others are doing and how their own payoffs are affected. This paper analyzes learning behavior in such 'black box' environments, where players' only source of information is their own history of actions taken and payoffs received. Specifically, we study repeated public goods games, where players must decide how much to contribute at each stage but do not know how much others have contributed or how others' contributions affect their own payoffs. We identify two key features of the players' learning dynamics. First, if a player's realized payoff increases, he is less inclined to change his strategy, whereas if his realized payoff decreases, he is more inclined to change his strategy. Second, if increasing his own contribution results in higher payoffs, he will tend to increase his contribution still further, whereas the reverse holds if an increase in contribution leads to lower payoffs. These two effects are clearly present when players have no information about the game; moreover, they are still present even when players have full information.
Game Theory and Distributed Control
, 2012
Abstract
Game theory has traditionally been employed as a modeling tool for describing and influencing behavior in societal systems. Recently, it has also emerged as a valuable tool for controlling or prescribing behavior in distributed engineered systems. The rationale for this new perspective stems from the parallels between the underlying decision-making architectures in both societal systems and distributed engineered systems: both settings involve an interconnection of decision-making elements whose collective behavior depends on a compilation of local decisions that are based on partial information about each other and the state of the world. Accordingly, there is extensive work in game theory that is relevant to the engineering agenda. Similarities notwithstanding, there remain important differences between the constraints and objectives in societal and engineered systems that require looking at game-theoretic methods from a new perspective. This chapter provides an overview of selected recent developments of game-theoretic methods in this role as a framework for distributed control in engineered systems.
Coarse Resistance Tree Methods For Stochastic Stability Analysis
Abstract
Emergent behavior in natural and man-made systems can often be characterized by the limiting distribution of a special class of Markov processes termed regular perturbed processes. Resistance trees have gained popularity as a computationally efficient way to characterize the stochastically stable states (i.e., the support of the limiting distribution); however, this approach has three main limitations. First, it often requires finding a minimum-weight spanning tree for each state in a potentially large state space. Second, perturbations to transition probabilities must decay at an exponentially smooth rate. Lastly, the approach has been shown to hold only in the context of finite Markov chains. In this paper, we seek to address these limitations by developing new tools for characterizing the stochastically stable states. First, we provide necessary conditions for stochastic stability via a coarse, and less computationally intensive, state-space analysis. Next, we identify necessary conditions for stochastic stability when the smooth convergence requirements are relaxed. Lastly, we establish similar tools for stochastic stability analysis in Markov chains over a continuous state space.
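For very small finite chains, the resistance-tree characterization that this paper coarsens can be checked by brute force: the stochastic potential of a state is the minimum total resistance over spanning trees rooted at it, and the stochastically stable states are those minimizing it. The sketch below is illustrative only; it assumes pairwise transition resistances are given directly and treats every state as recurrent.

```python
from itertools import product

def stochastic_potential(states, resistance):
    """For each state s, minimize total resistance over all spanning trees
    rooted at s: every other state picks one outgoing edge, and following
    edges must lead to s. Brute force; feasible only for tiny state spaces."""
    def reaches(s, parent, root):
        seen = set()
        while s != root:
            if s in seen:                 # cycle: not a tree rooted at `root`
                return False
            seen.add(s)
            s = parent[s]
        return True

    def tree_costs(root):
        others = [s for s in states if s != root]
        choices = [[t for t in states if t != s] for s in others]
        for targets in product(*choices):
            parent = dict(zip(others, targets))
            if all(reaches(s, parent, root) for s in others):
                yield sum(resistance[s][t] for s, t in parent.items())

    return {s: min(tree_costs(s)) for s in states}

def stochastically_stable(states, resistance):
    """The stochastically stable states minimize the stochastic potential."""
    phi = stochastic_potential(states, resistance)
    m = min(phi.values())
    return [s for s in states if phi[s] == m]
```

Even this toy version makes the first limitation above concrete: the enumeration over rooted spanning trees grows explosively with the number of states, which is exactly what coarse methods aim to avoid.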