Learning to predict by the methods of temporal differences
Machine Learning, 1988
Cited by 1060 (33 self)
This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems of the sort to which temporal-difference methods can be applied to advantage.
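The contrast drawn in this abstract, between crediting the final outcome and crediting the next prediction, can be sketched on a small random walk. The block below is only an illustration under stated assumptions: a five-state walk that ends with outcome 0 on the left and 1 on the right, a table of per-state predictions, and a constant step size. The names `generate_walk`, `td0_update`, `supervised_update`, and `train` are hypothetical, and only the simplest one-step temporal-difference rule is shown, not the full class of procedures the article covers.

```python
import random


def generate_walk(rng, n_states=5):
    """Return (visited_states, outcome) for one walk starting in the middle state."""
    state = n_states // 2
    visited = []
    while 0 <= state < n_states:
        visited.append(state)
        state += rng.choice((-1, 1))
    # Assumption: leaving on the right pays 1, leaving on the left pays 0.
    outcome = 1.0 if state >= n_states else 0.0
    return visited, outcome


def td0_update(values, visited, outcome, alpha):
    """Temporal-difference rule: move each prediction toward the next prediction."""
    for i, s in enumerate(visited):
        # The last prediction in the walk is compared with the actual outcome.
        target = values[visited[i + 1]] if i + 1 < len(visited) else outcome
        values[s] += alpha * (target - values[s])


def supervised_update(values, visited, outcome, alpha):
    """Conventional rule: move every prediction toward the actual final outcome."""
    for s in visited:
        values[s] += alpha * (outcome - values[s])


def train(update, episodes=20000, alpha=0.01, seed=0):
    rng = random.Random(seed)
    values = [0.5] * 5  # initial prediction for every state
    for _ in range(episodes):
        visited, outcome = generate_walk(rng)
        update(values, visited, outcome, alpha)
    return values


# For this walk the probability of ending on the right from state i is (i + 1) / 6.
true_values = [(i + 1) / 6 for i in range(5)]
td_values = train(td0_update)
mc_values = train(supervised_update)
print("true      ", [round(v, 3) for v in true_values])
print("TD(0)     ", [round(v, 3) for v in td_values])
print("supervised", [round(v, 3) for v in mc_values])
```

Note that `td0_update` can be applied step by step while a walk is still in progress, whereas `supervised_update` has to wait for the outcome. This sketch makes no claim about which rule is more accurate in general.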
Achieving Robust Neural Representations:
2004
An important source of evidence concerning rapid adaptation and learning in the brain is the robust phenomenon of repetition suppression: the long-lasting and item-specific decrease in neural activity with repeated exposure to an item, yielding sparser, sharper representations. Existing accounts of repetition suppression are informal and do little more than describe the phenomenon. We explore the hypothesis that repetition suppression arises from an unsupervised learning mechanism that reduces sensitivity to noise by increasing the item-specific gain of neural responses, in conjunction with the assumption that neurons are biased toward infrequent activity. This hypothesis explains key experimental observations concerning changes in neural representation with mere repetition of stimuli, regardless of task relevance. Additionally, this hypothesis explains related data concerning improved discriminability and noise robustness of individual neurons due to practice on a specific task.
Adaptive Filters with H∞ Bounds
2002
The LMS algorithm, which is widely used in the adaptive filtering community, was proved to be H∞-optimal in [4]. In this paper we examine other performance measures in the H∞ setting which are of direct relevance to adaptive filtering and system identification. In particular, we consider the system identification problem and the problem of estimation employing an exponential window. We present explicit algorithms and the achievable bounds in each case.
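For readers unfamiliar with the LMS algorithm named in this abstract, the standard update can be sketched in a system identification setting. This is a minimal illustration only: the unknown three-tap filter, the noise level, the step size `mu`, and the function name `lms_identify` are assumptions made for the example, and nothing here reproduces the paper's H∞ analysis or its bounds.

```python
import random


def lms_identify(inputs, desired, n_taps, mu):
    """Estimate FIR filter taps with the standard LMS update w <- w + mu * e * u."""
    weights = [0.0] * n_taps
    window = [0.0] * n_taps  # most recent input sample first
    for x, d in zip(inputs, desired):
        window = [x] + window[:-1]
        y = sum(w * u for w, u in zip(weights, window))  # filter output
        e = d - y                                        # a priori error
        weights = [w + mu * e * u for w, u in zip(weights, window)]
    return weights


rng = random.Random(0)
true_taps = [0.5, -0.3, 0.1]  # hypothetical unknown system
inputs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
desired = []
for n in range(len(inputs)):
    clean = sum(h * inputs[n - k] for k, h in enumerate(true_taps) if n - k >= 0)
    desired.append(clean + rng.gauss(0.0, 0.01))  # small measurement noise

estimated = lms_identify(inputs, desired, n_taps=3, mu=0.01)
print([round(w, 3) for w in estimated])
```

The step size is kept small relative to the input power so that the recursion stays stable; a larger `mu` would adapt faster at the cost of noisier estimates.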

