Reinforcement learning: a survey
Journal of Artificial Intelligence Research, 1996
Cited by 1298 (23 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
No Free Lunch Theorems for Optimization
1997
Cited by 640 (9 self)
A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. These theorems result in a geometric interpretation of what it means for an algorithm to be well suited to an optimization problem. Applications of the NFL theorems to information-theoretic aspects of optimization and benchmark measures of performance are also presented. Other issues addressed include time-varying optimization problems and a priori “head-to-head” minimax distinctions between optimization algorithms, distinctions that result despite the NFL theorems’ enforcing of a type of uniformity over all algorithms.
A Genetic Algorithm Tutorial
Statistics and Computing, 1994
Cited by 231 (5 self)
This tutorial covers the canonical genetic algorithm as well as more experimental forms of genetic algorithms, including parallel island models and parallel cellular genetic algorithms. The tutorial also illustrates genetic search by hyperplane sampling. The theoretical foundations of genetic algorithms are reviewed, including the schema theorem as well as recently developed exact models of the canonical genetic algorithm.
Evolution of Homing Navigation in a Real Mobile Robot
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 1996
Cited by 210 (26 self)
In this paper we describe the evolution of a discrete-time recurrent neural network to control a real mobile robot. In all our experiments the evolutionary procedure is carried out entirely on the physical robot without human intervention. We show that the autonomous development of a set of behaviors for locating a battery charger and periodically returning to it can be achieved by lifting constraints in the design of the robot/environment interactions that were employed in a preliminary experiment. The emergent homing behavior is based on the autonomous development of an internal neural topographic map (which is not predesigned) that allows the robot to choose the appropriate trajectory as a function of location and remaining energy.
Automatic creation of an autonomous agent: Genetic evolution of a neural-network driven robot
 In
, 1994
Cited by 156 (25 self)
The paper describes the results of the evolutionary development of a real, neural-network driven mobile robot. The evolutionary approach to the development of neural controllers for autonomous agents has been successfully used by many researchers, but most, if not all, studies have been carried out with computer simulations. Instead, in this research the whole evolutionary process takes place entirely on a real robot without human intervention. Although the experiments described here tackle a simple task of navigation and obstacle avoidance, we show a number of emergent phenomena that are characteristic of autonomous agents. The neural controllers of the evolved best individuals display a full exploitation of nonlinear and recurrent connections that make them more efficient than analogous man-designed agents. In order to fully understand and describe the robot behavior, we have also employed quantitative ethological tools [13], and showed that the adaptation dynamics conform to predictions made for animals.
Outlier detection for high dimensional data
2001
Cited by 155 (4 self)
The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are high dimensional domains in which the data can contain hundreds of dimensions. Many recent algorithms use concepts of proximity in order to find outliers based on their relationship to the rest of the data. However, in high dimensional space, the data is sparse and the notion of proximity fails to retain its meaningfulness. In fact, the sparsity of high dimensional data implies that every point is an almost equally good outlier from the perspective of proximity-based definitions. Consequently, for high dimensional data, the notion of finding meaningful outliers becomes substantially more complex and non-obvious. In this paper, we discuss new techniques for outlier detection which find the outliers by studying the behavior of projections from the data set.
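The claim that proximity loses meaning in high dimensions can be illustrated with a small experiment. The sketch below is not the paper's method; it assumes uniformly random points in the unit hypercube and a hypothetical helper `relative_contrast`, which measures how much farther the farthest point is than the nearest one, relative to the nearest distance. As the dimension grows, this contrast tends to shrink, so distance-based outlier scores separate points less and less.

```python
import math
import random


def relative_contrast(n_points, dim, seed=0):
    """Return (d_max - d_min) / d_min for distances from one query point.

    Assumption for illustration only: points are drawn uniformly at random
    from the unit hypercube [0, 1]^dim.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = points[0]
    # Euclidean distances from the query point to every other point.
    dists = [math.dist(query, p) for p in points[1:]]
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(200, dim), 3))
```

A small contrast means the nearest and farthest neighbours are almost equally far away, which is the situation the abstract describes for proximity-based definitions.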
Neuro-fuzzy modeling and control
IEEE Proceedings, 1995
Cited by 147 (1 self)
Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy model under the framework of adaptive networks is called ANFIS (Adaptive-Network-based Fuzzy Inference System), which possesses certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed. Keywords: fuzzy logic, neural networks, fuzzy modeling, neuro-fuzzy modeling, neuro-fuzzy control, ANFIS.
Seeing the Light: Artificial Evolution, Real Vision
1994
Cited by 139 (14 self)
This paper describes results from a specialised piece of visuo-robotic equipment which allows the artificial evolution of control systems for visually guided autonomous agents acting in the real world. Preliminary experiments with the equipment are described in which dynamical recurrent networks and visual sampling morphologies are concurrently evolved to allow agents to robustly perform simple visually guided tasks. Some of these control systems are shown to exhibit a surprising degree of adaptiveness when tested against generalised versions of the task for which they were evolved.
Optimal Mutation Rates in Genetic Search
Cited by 114 (0 self)
The optimization of a single bit string by means of iterated mutation and selection of the best (a (1+1)-Genetic Algorithm) is discussed with respect to three simple fitness functions: the counting ones problem, a standard binary encoded integer, and a Gray coded integer optimization problem. A mutation rate schedule that is optimal with respect to the success probability of mutation is presented for each of the objective functions, and it turns out that the standard binary code can hamper the search process even in case of unimodal objective functions. While normally a mutation rate of 1/l (where l denotes the bit string length) is recommendable, our results indicate that a variation of the mutation rate is useful in cases where the fitness function is a multimodal pseudo-boolean function, where multimodality may be caused by the objective function as well as the encoding mechanism.
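The (1+1) scheme described in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the paper's implementation: it uses the counting ones fitness and a fixed mutation rate of 1/l rather than the optimal rate schedule the paper derives. The names `count_ones` and `one_plus_one_ga`, the tie-accepting selection rule, and the iteration cap are assumptions made for this example.

```python
import random


def count_ones(bits):
    """Counting ones fitness: the number of 1 bits in the string."""
    return sum(bits)


def one_plus_one_ga(length, rate=None, max_iters=100_000, seed=0):
    """Run a (1+1) scheme: one parent, one mutated child per step.

    Assumptions for illustration: the mutation rate is fixed (default 1/l),
    and a child replaces the parent when it is at least as fit.
    Returns the final bit string and the number of steps taken.
    """
    rng = random.Random(seed)
    rate = 1.0 / length if rate is None else rate
    parent = [rng.randint(0, 1) for _ in range(length)]
    best = count_ones(parent)
    for step in range(max_iters):
        if best == length:  # optimum of counting ones reached
            return parent, step
        # Flip each bit independently with probability `rate`.
        child = [b ^ 1 if rng.random() < rate else b for b in parent]
        score = count_ones(child)
        if score >= best:  # selection of the best of parent and child
            parent, best = child, score
    return parent, max_iters


if __name__ == "__main__":
    bits, steps = one_plus_one_ga(20)
    print(count_ones(bits), steps)
```

Passing a different `rate` makes it possible to compare fixed rates against 1/l on this fitness function; a varying schedule, as studied in the paper, would replace the constant `rate` with a function of the current state.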