Results 1-10 of 17
Reinforcement Learning by Policy Search
, 2000
"... One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations could be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as ..."
Abstract

Cited by 27 (2 self)
One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations could be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed to obey certain rules, the agent does not know them and must learn. In this dissertation we focus on the agent's adaptation as captured by the reinforcement learning framework. Reinforcement learning means learning a policy, a mapping of observations into actions, based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. The set of policies being searched is constrained by the architecture of the agent's controller. POMDPs require a controller to have a memory. We investigate various architectures for controllers with memory, including controllers with external memory, finite state controllers and distributed controllers for multiagent systems. For these various controllers we work out the details of the algorithms which learn by ascending the gradient of expected cumulative reinforcement. Building on statistical learning theory and experiment design theory, a policy evaluation algorithm is developed for the case of experience reuse. We address the question of sufficient experience for uniform convergence of policy evaluation and obtain sample complexity bounds for various estimators. Finally, we demonstrate the performance of the proposed algorithms on several domains, the most complex of which is simulated adaptive packet routing in a telecommunication network.
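The gradient-ascent idea in the abstract above can be illustrated with a minimal likelihood-ratio (REINFORCE-style) sketch on a two-action problem. The environment, reward values, learning rate, and memoryless softmax controller below are illustrative assumptions, not the dissertation's actual algorithms or domains; the controllers with memory it studies generalize this idea.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-action environment: each action has a fixed mean reward.
REWARDS = np.array([0.2, 1.0])

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

def reinforce_step(theta, lr=0.1):
    """One likelihood-ratio gradient-ascent step on expected reinforcement."""
    p = softmax(theta)
    a = rng.choice(2, p=p)                        # sample an action from the policy
    r = REWARDS[a] + 0.1 * rng.standard_normal()  # noisy reinforcement signal
    grad_log_p = -p
    grad_log_p[a] += 1.0                          # gradient of log pi(a | theta)
    return theta + lr * r * grad_log_p

theta = np.zeros(2)
for _ in range(2000):
    theta = reinforce_step(theta)
print(softmax(theta))  # probability mass should concentrate on the better action
```

Repeatedly ascending this stochastic gradient estimate shifts the policy toward the higher-reward action without the agent ever knowing the environment's dynamics.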
Vapnik-Chervonenkis Dimension of Recurrent Neural Networks
, 1997
"... Most of the work on the VapnikChervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimensi ..."
Abstract

Cited by 23 (5 self)
Most of the work on the Vapnik-Chervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case k ≫ w. Ignoring multiplicative constants, the main results say roughly the following: • For architectures with activation σ = a...
VC Dimension of Neural Networks
 Neural Networks and Machine Learning
, 1998
"... . This paper presents a brief introduction to VapnikChervonenkis (VC) dimension, a quantity which characterizes the difficulty of distributionindependent learning. The paper establishes various elementary results, and discusses how to estimate the VC dimension in several examples of interest in ne ..."
Abstract

Cited by 20 (3 self)
This paper presents a brief introduction to Vapnik-Chervonenkis (VC) dimension, a quantity which characterizes the difficulty of distribution-independent learning. The paper establishes various elementary results, and discusses how to estimate the VC dimension in several examples of interest in neural network theory.
1 Introduction
In this expository paper, we present a brief introduction to the subject of computing and estimating the VC dimension of neural network architectures. We provide precise definitions and prove several basic results, discussing also how one estimates VC dimension in several examples of interest in neural network theory. We do not address the learning- and estimation-theoretic applications of VC dimension. (Roughly, the VC dimension is a number which helps to quantify the difficulty when learning from examples. The sample complexity, that is, the number of "learning instances" that one must be exposed to, in order to be reasonably certain to derive accurate p...
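As a toy illustration of shattering, the combinatorial notion behind VC dimension, here is a brute-force check over a finite sample of threshold classifiers on the real line. The function family and sample points are hypothetical, chosen only to show that one point can be shattered while two cannot (threshold functions have VC dimension 1).

```python
from itertools import product  # noqa: F401  (handy for enumerating labelings)

def shatters(points, classifiers):
    """True if the classifier family realizes every +/- labeling of points."""
    realized = {tuple(f(x) for x in points) for f in classifiers}
    return len(realized) == 2 ** len(points)

# Hypothetical finite sample of the family {x >= t} of threshold functions.
thresholds = [lambda x, t=t: x >= t for t in [-10, 0.5, 1.5, 2.5, 10]]

print(shatters([1.0], thresholds))        # True: one point is shattered
print(shatters([1.0, 2.0], thresholds))   # False: labeling (True, False) impossible
```

No threshold can label the left point positive and the right point negative, so no two-point set is shattered, matching the VC dimension of 1 for this family.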
Neural Systems as Nonlinear Filters
, 2000
"... Experimental data show that biological synapses behave quite differently from the symbolic synapses in all common artificial neural network models. Biological synapses are ..."
Abstract

Cited by 19 (6 self)
Experimental data show that biological synapses behave quite differently from the symbolic synapses in all common artificial neural network models. Biological synapses are
On the Learnability of Recursive Data
 Mathematics of Control, Signals, and Systems
, 1999
"... We establish some general results concerning PAC learning: We find a characterization of the property that any consistent algorithm is PAC. It is shown that the shrinking width property is equivalent to PUAC learnability. By counterexample PAC and PUAC learning are shown to be different concepts ..."
Abstract

Cited by 10 (8 self)
We establish some general results concerning PAC learning: We find a characterization of the property that any consistent algorithm is PAC. It is shown that the shrinking width property is equivalent to PUAC learnability. By counterexample, PAC and PUAC learning are shown to be different concepts. We find conditions ensuring that any nearly consistent algorithm is PAC or PUAC, respectively.
Complete Controllability of Continuous-Time Recurrent Neural Networks
 Systems and Control Letters
, 1997
"... This paper presents a characterization of controllability for the class of control systems commonly called (continuoustime) recurrent neural networks. The characterization involves a simple condition on the input matrix, and is proved when the activation function is the hyperbolic tangent. 1 Introd ..."
Abstract

Cited by 9 (4 self)
This paper presents a characterization of controllability for the class of control systems commonly called (continuous-time) recurrent neural networks. The characterization involves a simple condition on the input matrix, and is proved when the activation function is the hyperbolic tangent.
1 Introduction
This paper continues the study of system-theoretic properties of recurrent networks. Assume given a locally Lipschitz map $\sigma : \mathbb{R} \to \mathbb{R}$. By an $n$-dimensional, $m$-input (recurrent) $\sigma$-net we mean a continuous-time control system of the form
$\dot{x}(t) = \tilde{\sigma}^{(n)}(Ax(t) + Bu(t))$,  (1)
where $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$. Here, for each map $\sigma : \mathbb{R} \to \mathbb{R}$ and each positive integer $n$, we use $\tilde{\sigma}^{(n)}$ to denote the diagonal mapping
$\tilde{\sigma}^{(n)} : \mathbb{R}^n \to \mathbb{R}^n : (x_1, \ldots, x_n)^T \mapsto (\sigma(x_1), \ldots, \sigma(x_n))^T$.  (2)
(Sometimes one includes, in addition, an observation or measurement function $y = Cx$, but this paper will not deal with observation issues.) The spaces $\mathbb{R}^m$ a...
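The σ-net dynamics above, a state derivative given by a componentwise activation of Ax(t) + Bu(t), can be simulated with a simple forward-Euler sketch. The integration scheme, step size, matrices, and input signal are illustrative assumptions; the paper itself concerns controllability of these systems, not their numerical simulation.

```python
import numpy as np

def simulate(A, B, u, x0, dt=0.01, steps=1000, sigma=np.tanh):
    """Forward-Euler integration of x'(t) = sigma(A x(t) + B u(t)),
    with sigma applied componentwise (sigma = tanh, as in the paper's
    controllability result). u is a function of time returning the input vector."""
    x = np.array(x0, dtype=float)
    for k in range(steps):
        x = x + dt * sigma(A @ x + B @ u(k * dt))
    return x

# Hypothetical 2-dimensional, 1-input sigma-net driven by a sinusoid.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
B = np.array([[1.0], [0.0]])
x = simulate(A, B, u=lambda t: np.array([np.sin(t)]), x0=[0.0, 0.0])
print(x.shape)  # (2,)
```

Because tanh is bounded, the state change per step is bounded by dt in each coordinate, so the simulated trajectory stays finite on any fixed horizon.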
Recurrent Networks for Structured Data - a Unifying Approach and Its Properties
 Cognitive Systems Research
, 2002
"... We consider recurrent neural networks which deal with symbolic formulas, terms, or, generally speaking, treestructured data. Approaches like the recursive autoassociative memory, discretetime recurrent networks, folding networks, tensor construction, holographic reduced representations, and recurs ..."
Abstract

Cited by 8 (5 self)
We consider recurrent neural networks which deal with symbolic formulas, terms, or, generally speaking, tree-structured data. Approaches like the recursive autoassociative memory, discrete-time recurrent networks, folding networks, tensor construction, holographic reduced representations, and recursive reduced descriptions fall into this category. They share the basic dynamics of how structured data are processed: the approaches recursively encode symbolic data into a connectionistic representation or decode symbolic data from a connectionistic representation by means of a simple neural function. In this paper, we give an overview of the ability of neural networks with these dynamics to encode and decode tree-structured symbolic data. The correlated tasks, approximating and learning mappings where the input domain or the output domain may consist of structured symbolic data, are examined as well.
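The shared recursive encoding dynamics described above can be sketched as follows: a single simple neural function compresses the encodings of a node's children plus a symbol embedding into one fixed-size vector, applied bottom-up over the tree. The fixed random affine map, embedding dimension, and symbol set here are hypothetical stand-ins for a trained encoder such as a RAAM or folding network.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8  # illustrative encoding dimension

# Hypothetical fixed parameters of the "simple neural function":
# one affine map over (left encoding, right encoding, symbol embedding).
W = rng.standard_normal((DIM, 3 * DIM)) / np.sqrt(3 * DIM)
EMBED = {s: rng.standard_normal(DIM) for s in "abc+*"}
NIL = np.zeros(DIM)  # encoding of an empty child

def encode(tree):
    """Recursively encode a tree; tree is a symbol (leaf) or (symbol, left, right)."""
    if isinstance(tree, str):
        return np.tanh(W @ np.concatenate([NIL, NIL, EMBED[tree]]))
    sym, left, right = tree
    return np.tanh(W @ np.concatenate([encode(left), encode(right), EMBED[sym]]))

# Encode the expression tree for a + (b * c).
vec = encode(("+", "a", ("*", "b", "c")))
print(vec.shape)  # (8,)
```

A matching decoder would run the recursion in reverse, recovering child encodings and symbols from a single vector; the papers surveyed above study when such encoding and decoding can succeed or be learned.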
Generalization Ability of Folding Networks
 IEEE Transactions on Knowledge and Data Engineering
, 2001
"... The information theoretical learnability of folding networks, a very successful approach capable of dealing with tree structured inputs, is examined. We find bounds on the VC, pseudo, and fat shattering dimension of folding networks with various activation functions. As a consequence, valid gen ..."
Abstract

Cited by 8 (8 self)
The information-theoretical learnability of folding networks, a very successful approach capable of dealing with tree-structured inputs, is examined. We find bounds on the VC, pseudo-, and fat-shattering dimension of folding networks with various activation functions. As a consequence, valid generalization of folding networks can be guaranteed. However, distribution-independent bounds on the generalization error cannot exist in principle. We propose two approaches which take the specific distribution into account and allow us to derive explicit bounds on the deviation of the empirical error from the real error of a learning algorithm: The first approach requires the probability of large trees to be limited a priori, the second approach deals with situations where the maximum input height in a concrete learning example is restricted.
Generalization of Elman Networks
 Artificial Neural Networks - ICANN'97
"... The Vapnik Chervonenkis dimension of Elman networks is infinite. Here, we find constructions leading to lower bounds for the fat shattering dimension that are linear resp. of order log in the input length even in the case of limited weights and inputs. Since finiteness of this magnitude is eq ..."
Abstract

Cited by 5 (5 self)
The Vapnik-Chervonenkis dimension of Elman networks is infinite. Here, we find constructions leading to lower bounds for the fat-shattering dimension that are linear, respectively of order log, in the input length, even in the case of limited weights and inputs. Since finiteness of this magnitude is equivalent to learnability, there is no a priori guarantee for the generalization capability of Elman networks.
A Learning Result for Continuous-Time Recurrent Neural Networks
 Systems and Control Letters
, 1998
"... The following learning problem is considered, for continuoustime recurrent neural networks having sigmoidal activation functions. Given a "black box" representing an unknown system, measurements of output derivatives are collected, for a set of randomly generated inputs, and a network is used to ap ..."
Abstract

Cited by 4 (2 self)
The following learning problem is considered, for continuous-time recurrent neural networks having sigmoidal activation functions. Given a "black box" representing an unknown system, measurements of output derivatives are collected, for a set of randomly generated inputs, and a network is used to approximate the observed behavior. It is shown that the number of inputs needed for reliable generalization (the sample complexity of the learning problem) is upper bounded by an expression that grows polynomially with the dimension of the network and logarithmically with the number of output derivatives being matched.
1 Introduction
This paper is concerned with systems defined by equations of the following type:
$\dot{x}(t) = \tilde{\sigma}^{(n)}(Ax(t) + Bu(t))$, $y(t) = Cx(t)$,  (1)
where $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, and $\tilde{\sigma}^{(n)} : \mathbb{R}^n \to \mathbb{R}^n$ is the diagonal map
$\tilde{\sigma}^{(n)} : (x_1, \ldots, x_n)^T \mapsto (\sigma(x_1), \ldots, \sigma(x_n))^T$,  (2)
and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipsc...