Results 1–10 of 33
Markov games as a framework for multiagent reinforcement learning
In Proceedings of the Eleventh International Conference on Machine Learning, 1994
Cited by 607 (13 self)
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior. The framework of Markov games allows us to widen this view to include multiple adaptive agents with interacting or competing goals. This paper considers a step in this direction in which exactly two agents with diametrically opposed goals share an environment. It describes a Q-learning-like algorithm for finding optimal policies and demonstrates its application to a simple two-player game in which the optimal policy is probabilistic.
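The probabilistic optimum the abstract alludes to can be seen in miniature. In minimax-style value updates for two-player zero-sum Markov games, the value of a state is that of a zero-sum matrix game over the current Q-values. The sketch below is a hedged illustration using matching pennies (a stand-in game, not an example from the paper): it brute-forces the maximin over mixed strategies on a grid, whereas a full implementation would solve the same maximin with linear programming.

```python
import numpy as np

# The value operator at each state solves a zero-sum matrix game:
# V = max over the agent's *mixed* strategies pi of
#     min over opponent actions o of sum_a pi(a) * Q(a, o).
# Matching pennies (illustrative, not from the paper) shows why the optimum
# must be probabilistic: every deterministic policy is exploitable.
Q = np.array([[+1.0, -1.0],   # payoff to the agent: rows = agent's action,
              [-1.0, +1.0]])  # columns = opponent's action

# Brute-force the maximin over mixed strategies pi = (p, 1-p) on a fine grid.
ps = np.linspace(0.0, 1.0, 10001)
worst_case = np.minimum(Q[0, 0] * ps + Q[1, 0] * (1 - ps),   # opponent plays 0
                        Q[0, 1] * ps + Q[1, 1] * (1 - ps))   # opponent plays 1
best = ps[np.argmax(worst_case)]
value = worst_case.max()
# best = 0.5 and value = 0: mixing 50/50 is the only unexploitable policy.
```

Any deterministic choice (p = 0 or p = 1) has worst-case payoff -1, which is why the grid search lands on the interior mixed strategy.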
Algorithms for Sequential Decision Making
1996
Cited by 212 (8 self)
Sequential decision making is a fundamental task faced by any intelligent agent in an extended interaction with its environment; it is the act of answering the question "What should I do now?" In this thesis, I show how to answer this question when "now" is one of a finite set of states, "do" is one of a finite set of actions, "should" is maximize a long-run measure of reward, and "I" is an automated planning or learning system (agent). In particular,
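The finite-state, finite-action, long-run-reward setting the abstract describes is the classic MDP planning problem, and one standard answer to "What should I do now?" is value iteration. A minimal sketch, assuming discounted reward; the 2-state, 2-action MDP below is invented for illustration:

```python
import numpy as np

# Illustrative value iteration for a finite MDP: "now" = a finite state,
# "do" = a finite action, "should" = maximize discounted long-run reward.
n_states, n_actions, gamma = 2, 2, 0.9

# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
    Q = R + gamma * P @ V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)  # greedy answer to "What should I do now?" per state
```

Because the backup is a gamma-contraction, the loop converges to the unique optimal value function regardless of the starting `V`.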
Solution of Non-Linear Ordinary Differential Equations by Feedforward Neural Networks
Mathematical and Computer Modelling, 1994
Cited by 11 (2 self)
It is demonstrated, through theory and numerical examples, how it is possible to directly construct a feedforward neural network to approximate nonlinear ordinary differential equations without the need for training. The method, utilizing a piecewise linear map as the activation function, is linear in storage, and the L2 norm of the network approximation error decreases monotonically as the number of hidden-layer neurons increases. The construction requires imposing certain constraints on the values of the input, bias, and output weights, and the attribution of certain roles to each of these parameters. All results presented used the piecewise linear activation function. However, the presented approach should also be applicable to hyperbolic tangents, sigmoids, and radial basis functions.
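A minimal sketch of the training-free idea, in the spirit of the abstract though not the paper's exact construction: with a piecewise-linear activation (here relu), the hidden-layer weights and biases can be set directly so the network reproduces the linear interpolant of a target function on a grid — no iterative training involved. The target `sin` stands in for a known ODE solution.

```python
import numpy as np

# Directly-constructed one-hidden-layer network with piecewise-linear
# activation relu(x) = max(x, 0). Weights are *assigned*, not trained:
# output weights encode slope changes of the interpolant at the grid nodes.
f = np.sin                       # stand-in target, e.g. a known ODE solution
nodes = np.linspace(0, np.pi, 9)
y = f(nodes)

slopes = np.diff(y) / np.diff(nodes)            # slope on each interval
c = np.concatenate(([slopes[0]], np.diff(slopes)))  # output weights
biases = -nodes[:-1]                            # unit i activates past node i

def net(x):
    # hidden layer: relu(x + b_i); output layer: y_0 + sum_i c_i * h_i
    h = np.maximum(np.add.outer(x, biases), 0.0)
    return y[0] + h @ c

xs = np.linspace(0, np.pi, 200)
err = np.max(np.abs(net(xs) - f(xs)))  # interpolation error, no training used
```

Adding more hidden units (a finer grid) shrinks the interval width, so the error decreases with the number of hidden-layer neurons, consistent with the monotone-decrease behavior the abstract reports.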
Shape and stress analysis of symmetric CRTS re CUED/DSTRUCT/TR170
1997
Cited by 4 (4 self)
The work presented in this report was carried out under an ESA contract. Responsibility for its contents resides with the authors.
COARSE EMBEDDABILITY INTO BANACH SPACES
2008
Cited by 4 (0 self)
The main purposes of this paper are (1) to survey the area of coarse embeddability of metric spaces into Banach spaces and, in particular, coarse embeddability of different Banach spaces into each other; (2) to present new results on the problems: (a) whether coarse non-embeddability into ℓ2 implies the presence of expander-like structures, and (b) to what extent ℓ2 is the most difficult space to embed into.
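For the reader's convenience, the notion the survey is built around can be stated precisely; this is the standard definition from the literature, not a quotation from the paper:

```latex
A map $f\colon (X,d_X)\to (Y,d_Y)$ between metric spaces is a
\emph{coarse embedding} if there exist non-decreasing functions
$\rho_1,\rho_2\colon [0,\infty)\to[0,\infty)$ with
$\lim_{t\to\infty}\rho_1(t)=\infty$ such that for all $x,y\in X$,
\[
  \rho_1\bigl(d_X(x,y)\bigr)\;\le\;d_Y\bigl(f(x),f(y)\bigr)\;\le\;\rho_2\bigl(d_X(x,y)\bigr).
\]
```

Only the large-scale geometry is constrained: distances may be distorted arbitrarily at small scales, but points far apart in $X$ must stay far apart in $Y$.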
Algorithm-Specific Parallel Processing with Linear Processor Arrays
In Advances in Computers, M. Yovits, Ed., 1994
Cited by 2 (2 self)
This article discusses systematic ways to derive such mappings. The techniques are illustrated by examples involving linear arrays of processors (one-dimensional processor arrays); however, unless otherwise stated, the results can be extended to arrays of arbitrary dimensions. Several linear arrays have been implemented for specific applications as well as for "wide-purpose" computing [ASAP91,92]. They are easier to build and program than arrays of higher dimensions. In particular, the connections among neighboring processors can be made very fast and, therefore, provide large communication bandwidths. For example, physical links ...
Numerical Solution Of A Calculus Of Variations Problem Using The Feedforward Neural Network Architecture
Advances in Engineering Software, 1996
Cited by 2 (0 self)
It is demonstrated, through theory and numerical example, how it is possible to construct directly and non-iteratively a feedforward neural network to solve a calculus of variations problem. The method, using the piecewise linear and cubic sigmoid transfer functions, is linear in storage and processing time. The L2 norm of the network approximation error decreases quadratically with the piecewise linear transfer function and quartically with the piecewise cubic sigmoid as the number of hidden layer neurons increases. The construction requires imposing certain constraints on the values of the input, bias, and output weights, and the attribution of certain roles to each of these parameters. All results presented used the piecewise linear and cubic sigmoid transfer functions. However, the non-iterative approach should also be applicable to the use of hyperbolic tangents and radial basis functions.
TWO CHARACTERIZATIONS OF CONSISTENCY
Cited by 1 (1 self)
Abstract. This paper offers two characterizations of the Kreps-Wilson concept of consistent beliefs. One is primarily of applied interest: beliefs are consistent iff they can be constructed by multiplying together vectors of monomials which induce the strategies. The other is primarily of conceptual interest: beliefs are consistent iff they can be induced by a “product dispersion” whose marginal dispersions induce the strategies (a “dispersion” is defined as a relative probability system, and a “product” dispersion is defined as a joint dispersion whose marginal dispersions are independent). Both these characterizations are derived with linear algebra.