Results 1 - 10 of 103
Learning to predict by the methods of temporal differences
- MACHINE LEARNING, 1988
Cited by 1060 (33 self)
This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems of the sort to which temporal-difference methods can be applied to advantage.
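The core idea of assigning credit from the difference between temporally successive predictions can be illustrated with a minimal sketch. It assumes a tabular setting and a small bounded random walk as the task; the function name `td0_random_walk`, the step size, and the episode count are illustrative choices, not taken from the article.

```python
import random


def td0_random_walk(episodes=5000, alpha=0.02, n_states=5, seed=0):
    """Tabular TD(0) prediction on a bounded random walk (illustrative setup).

    States 0..n_states-1 are non-terminal. Stepping left of state 0 ends the
    walk with outcome 0; stepping right of the last state ends it with outcome 1.
    Each value is a prediction of the final outcome from that state.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial predictions
    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            terminal = next_state < 0 or next_state >= n_states
            if next_state < 0:
                target = 0.0
            elif next_state >= n_states:
                target = 1.0
            else:
                target = values[next_state]  # the next prediction, not the final outcome
            # Temporal-difference update: move toward the successive prediction.
            values[state] += alpha * (target - values[state])
            if terminal:
                break
            state = next_state
    return values


estimates = td0_random_walk()
```

For this walk the exact outcome probabilities are (i + 1) / (n_states + 1) for state i, so the estimates can be compared against them directly. Only the current value table is stored; no episode history is kept.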
ANFIS: Adaptive-Network-Based Fuzzy Inference System
- 1993
Cited by 323 (5 self)
This paper presents the architecture and learning procedure underlying ANFIS (Adaptive-Network-based Fuzzy Inference System), a fuzzy inference system implemented in the framework of adaptive networks. By using a hybrid learning procedure, the proposed ANFIS can construct an input-output mapping based on both human knowledge (in the form of fuzzy if-then rules) and stipulated input-output data pairs. In our simulation, we employ the ANFIS architecture to model nonlinear functions, identify nonlinear components on-line in a control system, and predict a chaotic time series, all yielding remarkable results. Comparisons with artificial neural networks and earlier work on fuzzy modeling are listed and discussed. Other extensions of the proposed ANFIS and promising applications to automatic control and signal processing are also suggested.
1 Introduction. System modeling based on conventional mathematical tools (e.g., differential equations) is not well suited for dealing with ill-define...
Exponentiated Gradient Versus Gradient Descent for Linear Predictors
- Information and Computation, 1995
Cited by 196 (11 self)
this paper, we concentrate on linear predictors. To any vector u ∈ R
Learning and Sequential Decision Making
- LEARNING AND COMPUTATIONAL NEUROSCIENCE, 1989
Cited by 185 (10 self)
In this report we show how the class of adaptive prediction methods that Sutton called "temporal difference," or TD, methods is related to the theory of sequential decision making. TD methods have been used as "adaptive critics" in connectionist learning systems, and have been proposed as models of animal learning in classical conditioning experiments. Here we relate TD methods to decision tasks formulated in terms of a stochastic dynamical system whose behavior unfolds over time under the influence of a decision maker's actions. Strategies are sought for selecting actions so as to maximize a measure of long-term payoff gain. Mathematically, tasks such as this can be formulated as Markovian decision problems, and numerous methods have been proposed for learning how to solve such problems. We show how a TD method can be understood as a novel synthesis of concepts from the theory of stochastic dynamic programming, which comprises the standard method for solving such tasks when a model of the dynamical system is available, and the theory of parameter estimation, which provides the appropriate context for studying learning rules in the form of equations for updating associative strengths in behavioral models, or connection weights in connectionist networks. Because this report is oriented primarily toward the non-engineer interested in animal learning, it presents tutorials on stochastic sequential decision tasks, stochastic dynamic programming, and parameter estimation.
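The dynamic-programming side of this synthesis, where a model of the system is available, can be sketched with value iteration on a tiny Markovian decision problem. The two-state problem, the state and action names, and the discount factor below are hypothetical and chosen only so the result can be checked by hand; they do not come from the report.

```python
def value_iteration(transitions, gamma=0.5, tol=1e-10):
    """Compute optimal state values for a small Markovian decision problem.

    transitions[state][action] is a list of (probability, next_state, reward)
    triples describing the known model of the system.
    """
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            # Expected discounted payoff of the best action from this state.
            best = max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values


# Hypothetical model: "go" from A reaches B half the time, with payoff 1.
example_model = {
    "A": {
        "stay": [(1.0, "A", 0.0)],
        "go": [(0.5, "B", 1.0), (0.5, "A", 0.0)],
    },
    "B": {"stay": [(1.0, "B", 2.0)]},
}
optimal_values = value_iteration(example_model)
```

With a discount factor of 0.5, the values work out by hand to 4 for state B and 2 for state A. A TD method estimates values of this kind from experience, without requiring the transition model.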
The Helmholtz Machine
- 1995
Cited by 165 (22 self)
Discovering the structure inherent in a set of patterns is a fundamental aim of statistical inference or learning. One fruitful approach is to build a parameterized stochastic generative model, independent draws from which are likely to produce the patterns. For all but the simplest generative models, each pattern can be generated in exponentially many ways. It is thus intractable to adjust the parameters to maximize the probability of the observed patterns. We describe a way of finessing this combinatorial explosion by maximizing an easily computed lower bound on the probability of the observations. Our method can be viewed as a form of hierarchical self-supervised learning that may relate to the function of bottom-up and top-down cortical processing pathways.
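The lower bound mentioned above can be written in the standard variational form. The symbols here are assumptions for illustration: d is an observed pattern, α ranges over the ways of generating it, θ denotes the generative parameters, and Q is any distribution over α given d.

```latex
\log p(d \mid \theta)
  \;\ge\; \sum_{\alpha} Q(\alpha \mid d)\, \log p(\alpha, d \mid \theta)
  \;-\; \sum_{\alpha} Q(\alpha \mid d)\, \log Q(\alpha \mid d)
```

The gap between the two sides is the Kullback-Leibler divergence from Q(α | d) to the true posterior p(α | d, θ), so the bound is tight exactly when Q matches that posterior.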
A framework for mesencephalic dopamine systems based on predictive Hebbian learning
- J. Neurosci, 1996
Cited by 150 (19 self)
We develop a theoretical framework that shows how mesencephalic dopamine systems could distribute to their targets a signal that represents information about future expectations. In particular, we show how activity in the cerebral cortex can make predictions about future receipt of reward and how fluctuations in the activity levels of neurons in diffuse dopamine systems above and below baseline levels would represent errors in these predictions that are delivered to cortical and subcortical targets. We present a model for how such errors could be constructed in a real brain that is consistent with physiological results for a subset of dopaminergic neurons located in the ventral tegmental area and surrounding dopaminergic neurons. The theory also makes testable predictions about human choice behavior on a simple decision-making task. Furthermore, we show that, through a simple influence on synaptic plasticity, fluctuations in dopamine release can act to change the predictions in an appropriate manner.
Key words: prediction; dopamine; diffuse ascending systems; synaptic plasticity; reinforcement learning; reward
In mammals, mesencephalic dopamine neurons participate in a number of important cognitive and physiological functions including motivational processes (Wise, 1982; Fibiger and Phillips, 1986; Koob and Bloom, 1988) reward processing (Wise, 1982) working
Neuro-Fuzzy Modeling and Control
- PROCEEDINGS OF THE IEEE, 1995
Cited by 110 (1 self)
Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy model under the framework of adaptive networks is called ANFIS (Adaptive-Network-based Fuzzy Inference System), which possesses certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed.
Multi-Microphone Correlation-Based Processing for Robust Automatic Speech Recognition
- IEEE International Conference on Acoustics, Speech, and Signal Processing, 1996
Cited by 27 (3 self)
Acknowledgments
Chapter 1. Introduction
1.1. The Cross-Condition Problem
1.2. Thesis Statement
1.3. Thesis Overview
Chapter 2. Background
2.1. Delay-and-Sum Beamforming
2.1.1. Application of Delay-and-Sum Processing to Speech Recognition
2.2. Traditional Adaptive Arrays
2.2.1. Adaptive Noise Cancelling
2.2.2. Application of Traditional Adaptive Methods to Speech Recognition
2.3. Cross-Correlation Based Arrays
2.3.1. Phenomena ...
Worst-case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent
- 1996
Cited by 23 (3 self)
In this paper we study the performance of gradient descent when applied to the problem of on-line linear prediction in arbitrary inner product spaces. We show worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of on-line gradient descent. Whereas our algorithms always predict using linear functions as hypotheses, none of our results requires the data to be linearly related. In fact, the bounds proved on the total prediction loss are typically expressed as a function of the total loss of the best fixed linear predictor with bounded norm. All the upper bounds are tight to within constants. Matching lower bounds are provided in some cases. Finally, we apply our results to the problem of on-line prediction for classes of smooth functions.
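The on-line setting described above can be sketched as a loop that predicts with a linear function, observes the outcome, accumulates the squared error, and takes a gradient step. This is a plain on-line gradient descent sketch under assumed choices (zero initial weights, a fixed learning rate `eta`, hypothetical data); it is not one of the paper's specific variants, and it does not reproduce their bounds.

```python
def online_gradient_descent(examples, eta=0.5):
    """Run on-line gradient descent for linear prediction with squared loss.

    examples is a sequence of (x, y) pairs, where x is a tuple of floats.
    Returns the final weight vector and the total squared prediction error.
    """
    weights = None
    total_loss = 0.0
    for x, y in examples:
        if weights is None:
            weights = [0.0] * len(x)  # start from the zero vector
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = prediction - y
        total_loss += error ** 2  # loss is charged before the update
        # Step along the negative gradient of half the squared error.
        weights = [w - eta * error * xi for w, xi in zip(weights, x)]
    return weights, total_loss


# Hypothetical sequence generated by the linear rule y = 2*x1 - x2.
sequence = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0)] * 200
final_weights, cumulative_loss = online_gradient_descent(sequence)
```

Here the best fixed linear predictor has zero total loss, and the learner's cumulative loss stays bounded no matter how long the sequence runs, which is the kind of comparison the worst-case bounds formalize.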
Robust Indoor Location Estimation of Stationary and Mobile Users
- 2004
Cited by 23 (1 self)
We present algorithms for estimating the location of stationary and mobile users based on heterogeneous indoor RF technologies. We propose two location algorithms, Selective Fusion Location Estimation (SELFLOC) and Region of Confidence (RoC), which can be used in conjunction with classical location algorithms such as triangulation, or with third-party commercial location estimation systems. The SELFLOC algorithm infers the user location by selectively fusing location information from multiple wireless technologies and/or multiple classical location algorithms in a theoretically optimal manner. The RoC algorithm attempts to overcome the problem of aliasing in the signal domain, where different physical locations have similar RF characteristics, which is particularly acute when users are mobile. We have empirically validated the proposed algorithms using wireless LAN and Bluetooth technology. Our experimental results show that applying SELFLOC for stationary users when using multiple wireless technologies and multiple classical location algorithms can improve location accuracy significantly, with mean distance errors as low as 1.6 m. For mobile users we find that using RoC can allow us to obtain mean errors as low as 3.7 m. Both algorithms can be used in conjunction with a commercial location estimation system and improve its accuracy further.

