Memory-Based Learning for Control
Carnegie Mellon University, 1995
Cited by 25 (3 self)
The central thesis of this article is that memory-based methods provide natural and powerful mechanisms for high-autonomy learning control. This paper takes the form of a survey of the ways in which memory-based methods can be and have been applied to control tasks, with an emphasis on tasks in robotics and manufacturing. We explain the various forms that control tasks can take, and how this affects the choice of learning algorithm. We show a progression of five increasingly complex algorithms, which are applicable to increasingly complex kinds of control tasks. We examine their empirical behavior on robotic and industrial tasks. The final section discusses the interesting impact that explicitly remembering all previous experiences has on the problem of learning control.
Learning to Catch: Applying Nearest Neighbor Algorithms to Dynamic Control Tasks (Extended Abstract)
Selecting Models from Data: Artificial Intelligence and Statistics IV, 1993
Cited by 23 (4 self)
Steven L. Salzberg and David W. Aha. 1 Introduction. Dynamic control problems are the subject of much research in machine learning (e.g., Selfridge, Sutton, & Barto, 1985; Sammut, 1990; Sutton, 1990). Some of these studies investigated the applicability of various k-nearest neighbor methods (Dasarathy, 1990) to solve these tasks by modifying control strategies based on previously gained experience (e.g., Connell & Utgoff, 1987; Atkeson, 1989; Moore, 1990; 1991). However, these previous studies did not highlight the fact that small changes in the design of these algorithms drastically alter their learning behavior. This paper describes a preliminary study that investigates this issue in the context of a difficult dynamic control task: learning to catch a ball moving in a three-dimensional space, an important problem in robotics research (Geng et al., 1991). Our thesis in this paper is that agents can improve substantially at physical tasks by storing experiences without explicitly...
Scalable Techniques from Nonparametric Statistics for Real Time Robot Learning
2000
Cited by 21 (1 self)
Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL: memory-based LWL and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on up to 90-dimensional learning problems. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole balancing by a humanoid robot arm, and inverse-dynamics learning for a seven- and a 30-degree-of-freedom robot. In all these examples, the application of our statistical n...
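To make the memory-based flavor of LWL concrete, here is a minimal sketch of locally weighted linear regression for one-dimensional inputs. It is not taken from the paper: the Gaussian kernel, the `bandwidth` parameter, and the function name `lwr_predict` are assumptions chosen for illustration. All training data is kept, and a separate weighted line is fitted for each query.

```python
import math

def lwr_predict(xs, ys, query, bandwidth=0.5):
    """Locally weighted linear regression at one query point (1-D inputs)."""
    # Gaussian kernel weights (an assumed choice): nearby points count more.
    w = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    # Weighted means of inputs and outputs.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope of the local line through (mx, my).
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (query - mx)

# Stored experience: samples of sin(x) on [0, 6.2].
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]
pred = lwr_predict(xs, ys, 1.0, bandwidth=0.2)
```

A smaller `bandwidth` follows curvature more closely but averages over fewer points, so it is more sensitive to noise.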
An Integrated Instance-Based Learning Algorithm
Computational Intelligence, 2000
Cited by 19 (1 self)
The basic nearest-neighbor rule generalizes well in many domains but has several shortcomings, including inappropriate distance functions, large storage requirements, slow execution time, sensitivity to noise, and an inability to adjust its decision boundaries after storing the training data. This paper proposes methods for overcoming each of these weaknesses and combines these methods into a comprehensive learning system called the Integrated Decremental Instance-Based Learning Algorithm (IDIBL) that seeks to reduce storage, improve execution speed, and increase generalization accuracy, when compared to the basic nearest neighbor algorithm and other learning models. IDIBL tunes its own parameters using a new measure of fitness that combines confidence and cross-validation (CVC) accuracy in order to avoid discretization problems with more traditional leave-one-out cross-validation (LCV). In our experiments IDIBL achieves higher generalization accuracy than other less comprehensive instance-based learning algorithms, while requiring less than one-fourth the storage of the nearest neighbor algorithm and improving execution speed by a corresponding factor. In experiments on 21 datasets, IDIBL also achieves higher generalization accuracy than those reported for 16 major machine learning and neural network models.
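The discretization issue with leave-one-out cross-validation can be seen in a small sketch. The code below is plain k-nearest-neighbor with LCV, not IDIBL or its CVC measure; the function names and the toy data are assumptions made for illustration. With n instances, LCV accuracy can only be a multiple of 1/n, so different parameter settings often tie.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k stored instances closest to `query`."""
    def sq_dist(inst):
        return sum((a - b) ** 2 for a, b in zip(inst[0], query))
    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each instance using all the others."""
    hits = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == label
    return hits / len(data)

# Hypothetical toy data: two well-separated classes.
data = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
        ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b")]
scores = {k: loo_accuracy(data, k) for k in (1, 3)}
```

Here both values of k receive the same score, so LCV alone gives no basis for choosing between them; a confidence-weighted measure such as the CVC described in the abstract is one way to break such ties.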
Case-Based Acquisition of Place Knowledge
Proceedings of the Twelfth International Conference on Machine Learning, 1995
Cited by 15 (3 self)
In this paper we define the task of place learning and describe one approach to this problem. The framework represents distinct places using evidence grids, a probabilistic description of occupancy. Place recognition relies on case-based classification, augmented by a registration process to correct for translations. The learning mechanism is also similar to that in case-based systems, involving the simple storage of inferred evidence grids. Experimental studies with both physical and simulated robots suggest that this approach improves place recognition with experience, that it can handle significant sensor noise, and that it scales well to increasing numbers of places. Previous researchers have studied evidence grids and place learning, but they have not combined these two powerful concepts, nor have they used the experimental methods of machine learning to evaluate their methods' abilities. 1. Introduction and Basic Concepts A physical agent exists in an environment, and knowledge ...
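The combination of case storage, registration, and nearest-case classification might be sketched as follows. The abstract does not specify the similarity measure or the registration procedure, so the squared-difference distance, the exhaustive search over small shifts, the 0.5 fill value for unknown cells, and all function names are assumptions for illustration only.

```python
def shift(grid, dr, dc, fill=0.5):
    """Translate a 2-D occupancy grid by (dr, dc); uncovered cells get `fill`."""
    rows, cols = len(grid), len(grid[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = r - dr, c - dc
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = grid[sr][sc]
    return out

def distance(g1, g2):
    """Sum of squared differences between occupancy probabilities."""
    return sum((a - b) ** 2 for r1, r2 in zip(g1, g2) for a, b in zip(r1, r2))

def registered_distance(stored, observed, max_shift=1):
    """Registration: best distance over all small translations of `observed`."""
    shifts = range(-max_shift, max_shift + 1)
    return min(distance(stored, shift(observed, dr, dc))
               for dr in shifts for dc in shifts)

def recognize(cases, observed):
    """Case-based classification: label of the closest stored grid."""
    return min(cases, key=lambda case: registered_distance(case[1], observed))[0]

def learn(cases, name, grid):
    """Learning is simple storage of the labeled grid."""
    cases.append((name, grid))

cases = []
learn(cases, "hall", [[0.9, 0.9, 0.9], [0.1, 0.1, 0.1], [0.9, 0.9, 0.9]])
corner = [[0.9, 0.9, 0.9], [0.9, 0.1, 0.1], [0.9, 0.1, 0.1]]
learn(cases, "corner", corner)
observed = shift(corner, 0, 1)  # the same place, seen one cell to the right
place = recognize(cases, observed)
```

In this toy example the shifted observation is still matched to the stored corner grid because the registration step undoes the translation before the grids are compared.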
A Teaching Strategy for Memory-Based Control
1997
Cited by 13 (0 self)
Combining different machine learning algorithms in the same system can produce benefits above and beyond what either method could achieve alone. This paper demonstrates that genetic algorithms can be used in conjunction with lazy learning to solve examples of a difficult class of delayed reinforcement learning problems better than either method alone. This class, the class of differential games, includes numerous important control problems that arise in robotics, planning, game playing, and other areas, and solutions for differential games suggest solution strategies for the general class of planning and control problems. We conducted a series of experiments applying three learning approaches---lazy Q-learning, k-nearest neighbor (k-NN), and a genetic algorithm---to a particular differential game called a pursuit game. Our experiments demonstrate that k-NN had great difficulty solving the problem, while a lazy version of Q-learning performed moderately well and the genetic algorithm pe...
Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning
Neural Information Processing Systems 9, 1996
Cited by 12 (2 self)
Model learning combined with dynamic programming has been shown to be effective for learning control of continuous state dynamic systems. The simplest method assumes the learned model is correct and applies dynamic programming to it, but many approximators provide uncertainty estimates on the fit. How can they be exploited? This paper addresses the case where the system must be prevented from having catastrophic failures during learning. We propose a new algorithm adapted from the dual control literature and use Bayesian locally weighted regression models with stochastic dynamic programming. A common reinforcement learning assumption is that aggressive exploration should be encouraged. This paper addresses the converse case in which the system has to rein in exploration. The algorithm is illustrated on a 4-dimensional simulated control problem. 1 Introduction Reinforcement learning and related grid-based dynamic programming techniques are increasingly being applied to dynamic system...
Receptive Field Weighted Regression
1997
Cited by 11 (7 self)
We introduce a constructive, incremental learning system for regression problems that models data by means of spatially localized linear models. In contrast to other approaches, the size and shape of the receptive field of each locally linear model as well as the parameters of the locally linear model itself are learned independently, i.e., without the need for competition or any other kind of communication. This characteristic is accomplished by incrementally minimizing a weighted penalized local cross validation error. As a result, we obtain a learning system that can allocate resources as needed while dealing with the bias-variance dilemma in a principled way. The spatial localization of the linear models increases robustness towards negative interference. Our learning system can be interpreted as a nonparametric adaptive bandwidth smoother, as a mixture of experts where the experts are trained in isolation, and as a learning system which profits from combining independent expert knowledge on the same problem. It illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
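As a rough illustration of prediction with spatially localized linear models, the sketch below blends the outputs of fixed local models by their receptive-field activations. It covers prediction only; the incremental learning of field sizes, shapes, and model parameters described in the abstract is not shown. The Gaussian activation, the normalized weighted average, and the names `LocalModel` and `blend_predict` are assumptions for this example.

```python
import math
from dataclasses import dataclass

@dataclass
class LocalModel:
    center: float     # where the receptive field is centered
    width: float      # size of the receptive field (assumed Gaussian)
    intercept: float  # local linear model: value at the center
    slope: float      # local linear model: slope around the center

    def activation(self, x):
        """How strongly this receptive field responds to input x."""
        return math.exp(-0.5 * ((x - self.center) / self.width) ** 2)

    def predict(self, x):
        """Output of this model's own linear fit at x."""
        return self.intercept + self.slope * (x - self.center)

def blend_predict(models, x):
    """Activation-weighted average of the local predictions."""
    acts = [m.activation(x) for m in models]
    total = sum(acts)
    if total == 0.0:
        raise ValueError("no receptive field responds to this input")
    return sum(a * m.predict(x) for a, m in zip(acts, models)) / total

# Two hypothetical local models that both describe y = 2x near their centers.
models = [LocalModel(0.0, 0.5, 0.0, 2.0), LocalModel(1.0, 0.5, 2.0, 2.0)]
y = blend_predict(models, 0.5)
```

Because each model's output depends only on its own parameters, models can be added or updated without disturbing the others, which is the sense in which the experts are independent.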
Local Dimensionality Reduction for Locally Weighted Learning
1997
Cited by 11 (6 self)
Incremental learning of sensorimotor transformations in high dimensional spaces is one of the basic prerequisites for the success of autonomous robot devices as well as biological movement systems. So far, due to sparsity of data in high dimensional spaces, learning in such settings requires a significant amount of prior knowledge about the learning task, usually provided by a human expert. In this paper we suggest a partial revision of the view. Based on empirical studies, it can be observed that, despite being globally high dimensional and sparse, data distributions from physical movement systems are locally low dimensional and dense. Under this assumption, we derive a learning algorithm, Locally Adaptive Subspace Regression, that exploits this property by combining a local dimensionality reduction as a preprocessing step with a nonparametric learning technique, locally weighted regression. The usefulness of the algorithm and the validity of its assumptions are illustrated for a synthetic data set and data of the inverse dynamics of an actual 7 degree-of-freedom anthropomorphic robot arm.
Applying Online Search Techniques to Continuous-State Reinforcement Learning
1998
Cited by 9 (0 self)
In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boost the power of very approximate value functions discovered by traditional reinforcement learning techniques. We examine local searches, where the agent performs a finite-depth lookahead search, and global searches, where the agent performs a search for a trajectory all the way from the current state to a goal state. The key to the success of the local methods lies in taking a value function, which gives a rough solution to the hard problem of finding good trajectories from every single state, and combining that with online search, which then gives an accurate solution to the easier problem of finding a good trajectory specifically from the current state. The key to the success of the global methods lies in using aggressive state-space search techniques such as uniform-cost search and A*, ...
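The local-search idea, a finite-depth lookahead that falls back on a rough value function at the leaves, might look like the sketch below. It is an illustration under stated assumptions rather than the paper's algorithm: the toy problem uses integer states on a line instead of a continuous state space, and `step`, `cost`, `rough_value`, and the deliberately wrong value at state 4 are all hypothetical.

```python
ACTIONS = (-1, +1)

def is_goal(state):
    return state == 0

def step(state, action):
    """Deterministic toy dynamics: move one unit left or right."""
    return state + action

def cost(state, action):
    """Every move costs one unit."""
    return 1.0

def rough_value(state):
    """Approximate cost-to-go; badly underestimates at state 4 on purpose."""
    return 1.5 if state == 4 else float(abs(state))

def lookahead(state, depth):
    """Return (estimated cost-to-go, best first action) from `state`."""
    if is_goal(state):
        return 0.0, None
    if depth == 0:
        return rough_value(state), None  # leaf: trust the value function
    best_total, best_action = float("inf"), None
    for action in ACTIONS:
        future, _ = lookahead(step(state, action), depth - 1)
        total = cost(state, action) + future
        if total < best_total:
            best_total, best_action = total, action
    return best_total, best_action

greedy = lookahead(3, 1)  # one-step lookahead is misled by the bad estimate
deeper = lookahead(3, 3)  # deeper search looks past the error
```

From state 3, the one-step search moves away from the goal because the value function underestimates state 4, while the three-step search finds the true three-move path to the goal. This is the sense in which online search can compensate for a very approximate value function.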

