Results 1 - 10
of
86
Evolutionary Game Theory
, 1995
"... Abstract. Experimentalists frequently claim that human subjects in the laboratory violate game-theoretic predictions. It is here argued that this claim is usually premature. The paper elaborates on this theme by way of raising some conceptual and methodological issues in connection with the very def ..."
Abstract
-
Cited by 412 (3 self)
- Add to MetaCart
Abstract. Experimentalists frequently claim that human subjects in the laboratory violate game-theoretic predictions. It is here argued that this claim is usually premature. The paper elaborates on this theme by way of raising some conceptual and methodological issues in connection with the very definition of a game and of players ’ preferences, in particular with respect to potential context dependence, interpersonal preference dependence, backward induction and incomplete information.
Representations and Solutions for Game-Theoretic Problems
- Artificial Intelligence
, 1997
"... A system with multiple interacting agents (whether artificial or human) is often best analyzed using game-theoretic tools. Unfortunately, while the formal foundations are well-established, standard computational techniques for game-theoretic reasoning are inadequate for dealing with realistic games. ..."
Abstract
-
Cited by 105 (0 self)
- Add to MetaCart
A system with multiple interacting agents (whether artificial or human) is often best analyzed using game-theoretic tools. Unfortunately, while the formal foundations are well-established, standard computational techniques for game-theoretic reasoning are inadequate for dealing with realistic games. This paper describes the Gala system, an implemented system that allows the specification and efficient solution of large imperfect information games. The system contains the first implementation of a recent algorithm, due to Koller, Megiddo, and von Stengel. Experimental results from the system demonstrate that the algorithm is exponentially faster than the standard algorithm in practice, not just in theory. It therefore allows the solution of games that are orders of magnitude larger than were previously possible. The system also provides a new declarative language for compactly and naturally representing games by their rules. As a whole, the Gala system provides the capability for automa...
Dynamic Programming for Partially Observable Stochastic Games
- IN PROCEEDINGS OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 2004
"... We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterated elimination of dominated strategies in normal form games. ..."
Abstract
-
Cited by 89 (18 self)
- Add to MetaCart
We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterated elimination of dominated strategies in normal form games.
Fast Algorithms for Finding Randomized Strategies in Game Trees
, 1994
"... Interactions among agents can be conveniently described by game trees. In order to analyze a game, it is important to derive optimal (or equilibrium) strategies for the different players. The standard approach to finding such strategies in games with imperfect information is, in general, computation ..."
Abstract
-
Cited by 76 (14 self)
- Add to MetaCart
Interactions among agents can be conveniently described by game trees. In order to analyze a game, it is important to derive optimal (or equilibrium) strategies for the different players. The standard approach to finding such strategies in games with imperfect information is, in general, computationally intractable. The approach is to generate the normal form of the game (the matrix containing the payoff for each strategy combination), and then solve a linear program (LP) or a linear complementarity problem (LCP). The size of the normal form, however, is typically exponential in the size of the game tree, thus making this method impractical in all but the simplest cases. This paper describes a new representation of strategies which results in a practical linear formulation of the problem of two-player games with perfect recall (i.e., games where players never forget anything, which is a standard assumption). Standard LP or LCP solvers can then be applied to find optimal randomized strategies. The resulting algorithms are, in general, exponentially better than the standard ones, both in terms of time and in terms of space.
Constrained Markov Decision Processes
, 1995
"... This report presents a unified approach for the study of constrained Markov decision processes with a countable state space and unbounded costs. We consider a single controller having several objectives; it is desirable to design a controller that minimize one of cost objective, subject to inequalit ..."
Abstract
-
Cited by 69 (5 self)
- Add to MetaCart
This report presents a unified approach for the study of constrained Markov decision processes with a countable state space and unbounded costs. We consider a single controller having several objectives; it is desirable to design a controller that minimize one of cost objective, subject to inequality constraints on other cost objectives. The objectives that we study are both the expected average cost, as well as the expected total cost (of which the discounted cost is a special case). We provide two frameworks: the case were costs are bounded below, as well as the contracting framework. We characterize the set of achievable expected occupation measures as well as performance vectors. This allows us to reduce the original control dynamic problem into an infinite Linear Programming. We present a Lagrangian approach that enables us to obtain sensitivity analysis. In particular, we obtain asymptotical results for the constrained control problem: convergence of both the value and the pol...
Competitive Analysis of Randomized Paging Algorithms
, 2000
"... The paging problem is defined as follows: we are given a two-level memory system, in which one level is a fast memory, called cache, capable of holding k items, and the second level is an unbounded but slow memory. At each given time step, a request to an item is issued. Given a request to an item p ..."
Abstract
-
Cited by 59 (9 self)
- Add to MetaCart
The paging problem is defined as follows: we are given a two-level memory system, in which one level is a fast memory, called cache, capable of holding k items, and the second level is an unbounded but slow memory. At each given time step, a request to an item is issued. Given a request to an item p,amiss occurs if p is not present in the fast memory. In response to a miss, we need to choose an item q in the cache and replace it by p. The choice of q needs to be made on-line, without the knowledge of future requests. The objective is to design a replacement strategy with a small number of misses. In this paper we use competitive analysis to study the performance of randomized on-line paging algorithms. Our goal is to show how the concept of work functions, used previously mostly for the analysis of deterministic algorithms, can also be applied, in a systematic fashion, to the randomized case. We present two results: we first show that the competitive ratio of the marking algorithm is ex...
A continuation method for Nash equilibria in structured games
- In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI
, 2003
"... We describe algorithms for computing Nash equilibria in structured game representations, including both graphical games and multi-agent influence diagrams (MAIDs). The algorithms are derived from a continuation method for normal-form and extensive-form games due to Govindan and Wilson; they follow a ..."
Abstract
-
Cited by 35 (0 self)
- Add to MetaCart
We describe algorithms for computing Nash equilibria in structured game representations, including both graphical games and multi-agent influence diagrams (MAIDs). The algorithms are derived from a continuation method for normal-form and extensive-form games due to Govindan and Wilson; they follow a trajectory through the space of perturbed games and their equilibria. Our algorithms exploit game structure through fast computation of the Jacobian of the game's payoff function. They are guaranteed to find at least one equilibrium of the game and may find more. Our approach provides the first exact algorithm for computing an exact equilibrium in graphical games with arbitrary topology, and the first algorithm to exploit fine-grain structural properties of MAIDs. We present experimental results for our algorithms. The running time for our graphical game algorithm is similar to, and often better than, the running time of previous approximate algorithms. Our algorithm for MAIDs can effectively solve games that arc much larger than those that could be solved using previous methods. 1
Finding equilibria in large sequential games of imperfect information
- In ACM Conference on Electronic Commerce
, 2006
"... Information ∗ ..."
Existence of multiagent equilibria with limited agents
- Journal of Artificial Intelligence Research
, 2002
"... Multiagent learning is a necessary yet challenging problem as multiagent systems become more prevalent and environments become more dynamic. Much of the groundbreaking work in this area draws on notable results from game theory, in particular, the concept of Nash equilibria. Learners that directly l ..."
Abstract
-
Cited by 23 (2 self)
- Add to MetaCart
Multiagent learning is a necessary yet challenging problem as multiagent systems become more prevalent and environments become more dynamic. Much of the groundbreaking work in this area draws on notable results from game theory, in particular, the concept of Nash equilibria. Learners that directly learn an equilibrium obviously rely on their existence. Learners that instead seek to play optimally with respect to the other players also depend upon equilibria since equilibria are fixed points for learning. From another perspective, agents with limitations are real and common. These may be undesired physical limitations as well as self-imposed rational limitations, such as abstraction and approximation techniques, used to make learning tractable. This article explores the interactions of these two important concepts: equilibria and limitations in learning. We introduce the question of whether equilibria continue to exist when agents have limitations. We look at the general effects limitations can have on agent behavior, and define a natural extension of equilibria that accounts for these limitations. Using this formalization, we make three major contributions: (i) a counterexample for the general existence of equilibria with limitations, (ii) sufficient conditions on limitations that preserve their existence, (iii) three general classes of games and limitations that satisfy these conditions. We then present empirical results from a specific multiagent learning algorithm applied to a specific instance of limited agents. These results demonstrate that learning with limitations is feasible, when the conditions outlined by our theoretical analysis hold. 1.
Optimal and approximate Q-value functions for decentralized POMDPs
- J. Artificial Intelligence Research
"... Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value functi ..."
Abstract
-
Cited by 22 (9 self)
- Add to MetaCart
Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q ∗ is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q ∗. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q ∗. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem. 1.

