Reinforcement learning: a survey
- Journal of Artificial Intelligence Research, 1996
Cited by 1134 (21 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
No Free Lunch Theorems for Optimization
- 1997
Cited by 516 (8 self)
A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. These theorems result in a geometric interpretation of what it means for an algorithm to be well suited to an optimization problem. Applications of the NFL theorems to information-theoretic aspects of optimization and benchmark measures of performance are also presented. Other issues addressed include time-varying optimization problems and a priori “head-to-head” minimax distinctions between optimization algorithms, distinctions that result despite the NFL theorems’ enforcing of a type of uniformity over all algorithms.
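The offsetting claim in this abstract can be checked by brute force on a toy problem. The sketch below is an illustration only, not code from the paper: the three-point search space, the binary cost values, and the two search strategies (`fixed_order`, `adaptive`) are all assumptions chosen so that every possible cost function can be enumerated. It counts, over all functions, the sequences of cost values each non-revisiting strategy observes.

```python
from collections import Counter
from itertools import product

X = (0, 1, 2)  # assumed toy search space, small enough to enumerate
Y = (0, 1)     # assumed set of possible cost values


def fixed_order(history):
    """Hypothetical strategy: visit points 0, 1, 2 in order, ignoring observed values."""
    return len(history)


def adaptive(history):
    """Hypothetical strategy: start at point 2, then pick the next point from its value."""
    if not history:
        return 2
    visited = {x for x, _ in history}
    first_value = history[0][1]
    preferred = (0, 1) if first_value == 1 else (1, 0)
    return next(x for x in preferred if x not in visited)


def run(algorithm, f, m):
    """Evaluate m distinct points chosen by `algorithm`; return the observed values in order."""
    history = []
    for _ in range(m):
        x = algorithm(history)
        history.append((x, f[x]))
    return tuple(y for _, y in history)


def value_histogram(algorithm, m):
    """Count observed value sequences over ALL cost functions f: X -> Y."""
    return Counter(
        run(algorithm, dict(zip(X, values)), m)
        for values in product(Y, repeat=len(X))
    )


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, value_histogram(fixed_order, m) == value_histogram(adaptive, m))
```

Summed over every function on this space, both strategies produce the same histogram of observed value sequences, so any performance measure computed from those sequences averages out identically for the two.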
Evolution of Homing Navigation in a Real Mobile Robot
- IEEE Transactions on Systems, Man, and Cybernetics--Part B: Cybernetics, 1996
Cited by 194 (25 self)
In this paper we describe the evolution of a discrete-time recurrent neural network to control a real mobile robot. In all our experiments the evolutionary procedure is carried out entirely on the physical robot without human intervention. We show that the autonomous development of a set of behaviors for locating a battery charger and periodically returning to it can be achieved by lifting constraints in the design of the robot/environment interactions that were employed in a preliminary experiment. The emergent homing behavior is based on the autonomous development of an internal neural topographic map (which is not pre-designed) that allows the robot to choose the appropriate trajectory as a function of location and remaining energy.
Automatic creation of an autonomous agent: Genetic evolution of a neural-network driven robot
- In
, 1994
Cited by 142 (23 self)
The paper describes the results of the evolutionary development of a real, neural-network driven mobile robot. The evolutionary approach to the development of neural controllers for autonomous agents has been successfully used by many researchers, but most, if not all, studies have been carried out with computer simulations. Instead, in this research the whole evolutionary process takes place entirely on a real robot without human intervention. Although the experiments described here tackle a simple task of navigation and obstacle avoidance, we show a number of emergent phenomena that are characteristic of autonomous agents. The neural controllers of the evolved best individuals display a full exploitation of non-linear and recurrent connections that make them more efficient than analogous man-designed agents. In order to fully understand and describe the robot behavior, we have also employed quantitative ethological tools [13], and showed that the adaptation dynamics conform to predictions made for animals.
Seeing the Light: Artificial Evolution, Real Vision
- 1994
Cited by 131 (15 self)
This paper describes results from a specialised piece of visuo-robotic equipment which allows the artificial evolution of control systems for visually guided autonomous agents acting in the real world. Preliminary experiments with the equipment are described in which dynamical recurrent networks and visual sampling morphologies are concurrently evolved to allow agents to robustly perform simple visually guided tasks. Some of these control systems are shown to exhibit a surprising degree of adaptiveness when tested against generalised versions of the task for which they were evolved.
Outlier detection for high dimensional data
- 2001
Cited by 128 (0 self)
The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are high dimensional domains in which the data can contain hundreds of dimensions. Many recent algorithms use concepts of proximity in order to find outliers based on their relationship to the rest of the data. However, in high dimensional space, the data is sparse and the notion of proximity fails to retain its meaningfulness. In fact, the sparsity of high dimensional data implies that every point is an almost equally good outlier from the perspective of proximity-based definitions. Consequently, for high dimensional data, the notion of finding meaningful outliers becomes substantially more complex and non-obvious. In this paper, we discuss new techniques for outlier detection which find the outliers by studying the behavior of projections from the data set.
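The abstract's point that proximity loses meaning in high dimensions can be illustrated numerically. The sketch below is not the paper's method: it assumes uniformly random data in the unit cube, and the function name `relative_contrast` is hypothetical. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest distance from one random query
    point to n_points uniform random points in the unit cube of size `dim`."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for dim in (2, 20, 200, 500):
        print(dim, round(relative_contrast(dim), 3))
```

On such data the contrast shrinks as the dimension grows, so nearest and farthest neighbors become hard to tell apart, which is the situation the abstract describes for proximity-based outlier definitions.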
Evolution in time and space - the parallel genetic algorithm
- FOUNDATIONS OF GENETIC ALGORITHMS, 1991
Cited by 104 (13 self)
The parallel genetic algorithm (PGA) uses two major modifications compared to the genetic algorithm. Firstly, selection for mating is distributed. Individuals live in a 2-D world. Selection of a mate is done by each individual independently in its neighborhood. Secondly, each individual may improve its fitness during its lifetime by, e.g., local hill-climbing. The PGA is totally asynchronous, running with maximal efficiency on MIMD parallel computers. The search strategy of the PGA is based on a small number of active and intelligent individuals, whereas a GA uses a large population of passive individuals. We will investigate the PGA with deceptive problems and the traveling salesman problem. We outline why and when the PGA is successful. Abstractly, a PGA is a parallel search with information exchange between the individuals. If we represent the optimization problem as a fitness landscape in a certain configuration space, we see that a PGA tries to jump from two local minima to a third, still better local minimum by using the crossover operator. This jump is (probabilistically) successful if the fitness landscape has a certain correlation. We show the correlation for the traveling salesman problem by a configuration space analysis. The PGA implicitly explores the above correlation.
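The two modifications named in this abstract, neighborhood-local mate selection on a 2-D grid and lifetime improvement by local hill-climbing, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's implementation: it runs synchronously in one process (the PGA itself is described as asynchronous), uses a toy OneMax objective in place of the deceptive and traveling salesman problems, and the grid size, neighborhood shape, and all function names are hypothetical.

```python
import random

GENOME_LEN = 20  # assumed toy genome length
GRID = 6         # assumed 6x6 torus of individuals


def fitness(genome):
    """OneMax stand-in objective: number of 1-bits."""
    return sum(genome)


def hill_climb(genome, rng, tries=5):
    """Lifetime improvement: keep random single-bit flips that raise fitness."""
    best = genome[:]
    for _ in range(tries):
        candidate = best[:]
        i = rng.randrange(len(candidate))
        candidate[i] ^= 1
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def neighbors(r, c):
    """Four-cell neighborhood on a torus (an assumed neighborhood shape)."""
    return [((r - 1) % GRID, c), ((r + 1) % GRID, c),
            (r, (c - 1) % GRID), (r, (c + 1) % GRID)]


def step(pop, rng):
    """One synchronous sweep: each cell mates with its best neighbor."""
    new_pop = [row[:] for row in pop]
    for r in range(GRID):
        for c in range(GRID):
            mate = max((pop[nr][nc] for nr, nc in neighbors(r, c)), key=fitness)
            cut = rng.randrange(1, GENOME_LEN)  # one-point crossover
            child = hill_climb(pop[r][c][:cut] + mate[cut:], rng)
            if fitness(child) >= fitness(pop[r][c]):  # never accept a worse child
                new_pop[r][c] = child
    return new_pop


def run(generations=30, seed=0):
    """Return (best initial fitness, best final fitness)."""
    rng = random.Random(seed)
    pop = [[[rng.randint(0, 1) for _ in range(GENOME_LEN)]
            for _ in range(GRID)] for _ in range(GRID)]
    start = max(fitness(g) for row in pop for g in row)
    for _ in range(generations):
        pop = step(pop, rng)
    return start, max(fitness(g) for row in pop for g in row)


if __name__ == "__main__":
    print(run())
```

Because each cell only accepts a child that is at least as fit as its current occupant, the best fitness in the grid never decreases; good genetic material spreads only through overlapping neighborhoods, which is the distributed selection the abstract refers to.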
Species Adaptation Genetic Algorithms: A Basis for a Continuing SAGA
- 1992
Cited by 103 (28 self)
For Artificial Life applications it is useful to extend Genetic Algorithms from a finite search space with fixed-length genotypes to open-ended evolution with variable-length genotypes. A new theoretical analysis is required, as Holland's Schema Theorem only applies to fixed lengths. It will be argued, using concepts of epistasis and fitness landscapes drawn from theoretical biology, that in the long run a population must have genotypes of nearly equal length, and this length can only increase slowly. As the length increases, the population will be nearly converged, and hence evolving as a species.
Evolutionary robotics: the Sussex approach
- ROBOTICS AND AUTONOMOUS SYSTEMS, 1997
Cited by 101 (13 self)
... the last 5 years. We explain and justify our distinctive approaches to (artificial) evolution, and to the nature of robot control systems that are evolved. Results are presented from research with evolved controllers for autonomous mobile robots; simulated robots, coevolved animats, real robots with software controllers, and a real robot with a controller directly evolved in hardware.

