Results 1 - 10 of 83
Hierarchical mixtures of experts and the EM algorithm
Neural Computation, 1994
Cited by 634 (19 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
Active Learning with Statistical Models
1995
Cited by 402 (7 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Locally weighted learning
Artificial Intelligence Review, 1997
Cited by 370 (43 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
Cited by 257 (30 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
Supervised learning from incomplete data via an EM approach
Advances in Neural Information Processing Systems 6, 1994
Cited by 157 (2 self)
Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm---EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems. Results from a classification benchmark---the iris data set---are presented. 1 Introduction Adaptive systems generally operate in environments that are fraught with imperfections; nonetheless they must cope with these imperfections and learn to extract as much relevant information as needed for their particular goals. One form of imperfection is incompleteness in sensing information. Inc...
Neural Networks and Statistical Models
1994
Cited by 82 (1 self)
There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software. This paper explains what neural networks are, translates neural network jargon into statistical jargon, and shows the relationships between neural networks and statistical models such as generalized linear models, maximum redundancy analysis, projection pursuit, and cluster analysis.
Learning from incomplete data
1994
Cited by 49 (0 self)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977)---both for the estimation of mixture components and for coping with the missing data.
Computational aspects of motor control and motor learning
Handbook of Perception and Action: Motor Skills, 1996
Cited by 33 (2 self)
This chapter provides a basic introduction to various of the computational issues that arise in the study of motor control and motor learning. A broad set of topics is discussed, including feedback control, feedforward control, the problem of delay, observers, learning algorithms, motor learning, and reference models. The goal of the chapter is to provide a unified discussion of these topics, emphasizing the complementary roles that they play in complex control systems. The choice of topics is motivated by their relevance to problems in motor control and motor learning; however, the chapter is not intended to be a review of specific models. Rather we emphasize basic theoretical issues with broad applicability. Many of the ideas described here are developed more fully in standard textbooks in modern systems theory, particularly textbooks on discrete-time systems (Åström & Wittenmark, 1984), adaptive signal processing (Widrow & Stearns, 1985), and adaptive control systems (Goodwin & Sin, 1984; Åström & Wittenmark, 1989). These texts assume a substantial background in control
Eye and Gaze Tracking for Interactive Graphic Display
Machine Vision and Applications, 2002
Cited by 25 (5 self)
This paper describes preliminary results we have obtained in developing a computer vision system based on active IR illumination for real time gaze tracking for interactive graphic display. Unlike most of the existing gaze tracking techniques, which often require assuming a static head to work well and require a cumbersome calibration process for each person, our gaze tracker can perform robust and accurate gaze estimation without calibration and under rather significant head movement. This is made possible by a new gaze calibration procedure that identifies the mapping from pupil parameters to screen coordinates using the Generalized Regression Neural Networks (GRNN). With GRNN, the mapping does not have to be an analytical function and head movement is explicitly accounted for by the gaze mapping function. Furthermore, the mapping function can generalize to other individuals not used in the training. The effectiveness of our gaze tracker is demonstrated by preliminary experiments that involve gaze-contingent interactive graphic display.
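As a rough illustration of the GRNN mapping mentioned in this abstract, the sketch below uses the textbook GRNN form, which predicts a kernel-weighted average of the stored training targets. It is not the paper's implementation: the function name `grnn_predict`, the parameter `sigma`, and the two-feature "pupil parameter" vectors in the usage example are assumptions made for illustration only.

```python
import numpy as np

def grnn_predict(X_train, Y_train, query, sigma=0.1):
    """Textbook GRNN prediction: a Gaussian-weighted average of training targets.

    X_train: (n, d) input vectors (here standing in for pupil parameters).
    Y_train: (n, k) targets (here standing in for screen coordinates).
    sigma:   kernel width; a hypothetical default, normally tuned on data.
    """
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)
    query = np.asarray(query, dtype=float)

    # Squared Euclidean distance from the query to every stored pattern.
    sq_dist = np.sum((X_train - query) ** 2, axis=1)
    # Subtracting the minimum keeps at least one weight equal to 1,
    # so the normalising sum cannot underflow to zero.
    weights = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * sigma ** 2))
    # Normalised weighted average of the targets.
    return weights @ Y_train / weights.sum()

# Hypothetical usage: two stored calibration patterns, one query halfway between.
X_demo = [[0.0, 0.0], [1.0, 1.0]]
Y_demo = [[0.0, 0.0], [100.0, 100.0]]
midpoint = grnn_predict(X_demo, Y_demo, [0.5, 0.5])
```

Because the prediction is an average over stored examples rather than a fitted formula, no analytical form of the mapping is needed, which matches the point the abstract makes.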
Memory-Based Neural Networks For Robot Learning
Neurocomputing, 1995
Cited by 24 (8 self)
This paper explores a memory-based approach to robot learning, using memory-based neural networks to learn models of the task to be performed. Steinbuch and Taylor presented neural network designs to explicitly store training data and do nearest neighbor lookup in the early 1960s. In this paper their nearest neighbor network is augmented with a local model network, which fits a local model to a set of nearest neighbors. This network design is equivalent to a statistical approach known as locally weighted regression, in which a local model is formed to answer each query, using a weighted regression in which nearby points (similar experiences) are weighted more than distant points (less relevant experiences). We illustrate this approach by describing how it has been used to enable a robot to learn a difficult juggling task. Keywords: memory-based, robot learning, locally weighted regression, nearest neighbor, local models. 1 Introduction An important problem in motor learning is approxim...
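The locally weighted regression idea described in this abstract (fit a local model per query, weighting nearby points more than distant ones) can be sketched as follows. This is a minimal illustration, not the paper's network: the Gaussian weighting function, the linear local model, the name `lwr_predict`, and the `bandwidth` parameter are all assumptions chosen for the example.

```python
import numpy as np

def lwr_predict(X, y, query, bandwidth=0.5):
    """Answer a single query with a locally weighted linear regression.

    X: (n, d) stored inputs, y: (n,) stored outputs, query: (d,) query point.
    bandwidth: width of the (assumed) Gaussian weighting function.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    query = np.asarray(query, dtype=float)

    # Nearby points (similar experiences) get weights close to 1,
    # distant points (less relevant experiences) get weights close to 0.
    sq_dist = np.sum((X - query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Local linear model with an intercept column.
    A = np.hstack([np.ones((X.shape[0], 1)), X])

    # Weighted least squares: scale each row by the square root of its weight.
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)

    # Evaluate the local model at the query point.
    return float(np.concatenate(([1.0], query)) @ beta)

# Hypothetical usage: stored samples of sin(x), queried at x = 1.0.
X_mem = np.linspace(0.0, 3.0, 61).reshape(-1, 1)
y_mem = np.sin(X_mem[:, 0])
estimate = lwr_predict(X_mem, y_mem, [1.0], bandwidth=0.2)
```

A new model is fitted for every query, so all training data stays in memory and nothing is learned ahead of time; this is the "memory-based" aspect the abstract refers to.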

