Results 1 - 10 of 120
Hierarchical mixtures of experts and the EM algorithm
Neural Computation, 1994
Cited by 723 (19 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
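As a rough illustration of the architecture this abstract describes, the sketch below computes the output of a two-level hierarchical mixture in which the gates are softmax models and the experts are linear models. It covers the forward pass only, not the EM fitting procedure. The function names (`softmax`, `dot`, `hme_predict`), the two-level depth, and the choice of linear experts are assumptions made for this example and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def hme_predict(x, top_gate, sub_gates, experts):
    """Output of a two-level hierarchical mixture of linear experts.

    x         : input vector (append a constant 1.0 to get a bias term)
    top_gate  : one weight vector per branch
    sub_gates : for each branch, one weight vector per expert in it
    experts   : for each branch, one weight vector per expert
    All names and shapes here are hypothetical, chosen for illustration.
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g_top):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        for j, g_ij in enumerate(g_sub):
            # Each expert's output is weighted by the product of the
            # gating probabilities along its path from the root.
            y += g_i * g_ij * dot(experts[i][j], x)
    return y
```

With all gate weights set to zero, every gate is uniform, so the output reduces to the plain average of the expert outputs; non-zero gate weights let the blend vary with the input.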
Active Learning with Statistical Models
1995
Cited by 529 (10 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Locally weighted learning
Artificial Intelligence Review, 1997
Cited by 448 (52 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
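To make the central technique of this survey concrete, here is a minimal sketch of locally weighted linear regression in one dimension: for each query, a straight line is fitted by weighted least squares, with weights that decay with distance from the query. The Gaussian weighting function, the single `bandwidth` smoothing parameter, and the name `lwr_predict` are assumptions chosen for the example; the survey discusses many alternatives for each.

```python
import math

def lwr_predict(xs, ys, query, bandwidth):
    """Locally weighted linear regression in one dimension.

    Fits y ~ a + b*x by weighted least squares, using Gaussian weights
    centred on the query point, and returns the fit at the query.
    """
    w = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw  # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw  # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    # Fall back to a locally constant fit if all weight sits on one x.
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (query - mx)
```

Nothing is fitted ahead of time: the stored data are consulted afresh for every query, which is the "lazy" or memory-based aspect the abstract refers to. A smaller `bandwidth` makes the fit more local and less smooth.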
Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
Cited by 309 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
Supervised learning from incomplete data via an EM approach
Advances in Neural Information Processing Systems 6, 1994
Cited by 184 (2 self)
Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems. Results from a classification benchmark, the iris data set, are presented.
1 Introduction Adaptive systems generally operate in environments that are fraught with imperfections; nonetheless they must cope with these imperfections and learn to extract as much relevant information as needed for their particular goals. One form of imperfection is incompleteness in sensing information. Inc...
Neural Networks and Statistical Models
1994
Cited by 99 (1 self)
There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software. This paper explains what neural networks are, translates neural network jargon into statistical jargon, and shows the relationships between neural networks and statistical models such as generalized linear models, maximum redundancy analysis, projection pursuit, and cluster analysis.
Learning from incomplete data
1994
Cited by 58 (0 self)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives: the likelihood-based and the Bayesian. The goal is twofold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977), both for the estimation of mixture components and for coping with the missing data.
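The first of the two appeals to EM mentioned above, estimating mixture components, can be sketched for the simplest case: a one-dimensional Gaussian mixture with fully observed data. This example does not handle missing features, which is the second use of EM described in the abstract. The function names (`normal_pdf`, `em_step`) and the variance floor `min_var` are assumptions introduced for illustration.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances, min_var=1e-6):
    """One EM iteration for a 1-D Gaussian mixture.

    Returns the updated (weights, means, variances) together with the
    log-likelihood of the data under the parameters passed in.
    """
    k = len(weights)
    # E-step: posterior responsibility of each component for each point.
    resp = []
    log_lik = 0.0
    for x in data:
        joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        log_lik += math.log(total)
        resp.append([p / total for p in joint])
    # M-step: responsibility-weighted maximum likelihood estimates.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, min_var))  # hypothetical floor to avoid collapse
    return new_w, new_m, new_v, log_lik
```

Calling `em_step` repeatedly, feeding each result back in, should give a log-likelihood that does not decrease from one iteration to the next as long as the variance floor is not triggered.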
Computational aspects of motor control and motor learning
Handbook of Perception and Action: Motor Skills, 1996
Cited by 39 (2 self)
This chapter provides a basic introduction to various of the computational issues that arise in the study of motor control and motor learning. A broad set of topics is discussed, including feedback control, feedforward control, the problem of delay, observers, learning algorithms, motor learning, and reference models. The goal of the chapter is to provide a unified discussion of these topics, emphasizing the complementary roles that they play in complex control systems. The choice of topics is motivated by their relevance to problems in motor control and motor learning; however, the chapter is not intended to be a review of specific models. Rather we emphasize basic theoretical issues with broad applicability. Many of the ideas described here are developed more fully in standard textbooks in modern systems theory, particularly textbooks on discrete-time systems (Åström & Wittenmark, 1984), adaptive signal processing (Widrow & Stearns, 1985), and adaptive control systems (Goodwin & Sin, 1984; Åström & Wittenmark, 1989). These texts assume a substantial background in control
Eye and Gaze Tracking for Interactive Graphic Display
Machine Vision and Applications, 2002
Cited by 37 (9 self)
This paper describes preliminary results we have obtained in developing a computer vision system based on active IR illumination for real time gaze tracking for interactive graphic display. Unlike most of the existing gaze tracking techniques, which often require assuming a static head to work well and require a cumbersome calibration process for each person, our gaze tracker can perform robust and accurate gaze estimation without calibration and under rather significant head movement. This is made possible by a new gaze calibration procedure that identifies the mapping from pupil parameters to screen coordinates using the Generalized Regression Neural Networks (GRNN). With GRNN, the mapping does not have to be an analytical function and head movement is explicitly accounted for by the gaze mapping function. Furthermore, the mapping function can generalize to other individuals not used in the training. The effectiveness of our gaze tracker is demonstrated by preliminary experiments that involve gaze-contingent interactive graphic display.
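The reason a GRNN mapping "does not have to be an analytical function" is that a GRNN predicts with a kernel-weighted average of stored training targets rather than a fitted parametric form. The sketch below shows that estimate for a scalar target. The name `grnn_predict`, the Gaussian kernel width `sigma`, and the use of generic feature vectors in place of the paper's pupil parameters are assumptions for this example, not details from the paper.

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """Kernel-weighted average of training targets (GRNN-style estimate).

    train_x : list of feature vectors (hypothetical stand-ins for
              whatever input parameters are measured)
    train_y : list of scalar targets, e.g. one screen coordinate
    sigma   : width of the Gaussian kernel (the smoothing parameter)
    """
    weights = []
    for x in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training points for this sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

A two-dimensional screen position could be obtained by calling the function once per coordinate. Training amounts to storing the examples, so the only quantity to tune in this sketch is `sigma`.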
Median Radial Basis Functions Neural Network
IEEE Trans. on Neural Networks, 1996
Cited by 28 (15 self)
Radial Basis Functions (RBF) consists of a two-layer neural network, where each hidden unit implements a kernel function. Each kernel is associated with an activation region from the input space and its output is fed to an output unit. In order to find the parameters of a neural network which embeds this structure we take into consideration two different statistical approaches. The first approach uses classical estimation in the learning stage and it is based on the learning vector quantization algorithm and its second order statistics extension. After the presentation of this approach, we introduce the Median Radial Basis Functions (MRBF) algorithm based on robust estimation of the hidden unit parameters. The proposed algorithm employs the marginal median for kernel location estimation and the median of the absolute deviations for the scale parameter estimation. A histogram-based fast implementation is provided for the MRBF algorithm. The theoretical performance of the two training al...
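The two robust estimators named in this abstract can be illustrated directly: the marginal median gives a kernel location and the median of the absolute deviations (MAD) gives a scale, each computed one dimension at a time. The sketch below assumes the samples have already been assigned to a single hidden unit and returns the raw, unscaled MAD; the name `robust_kernel_params` and both of those choices are assumptions for the example, and the histogram-based fast implementation mentioned in the abstract is not shown.

```python
from statistics import median

def robust_kernel_params(samples):
    """Marginal median (location) and MAD (scale) for each dimension.

    samples : list of equal-length feature vectors, assumed here to be
              the points currently assigned to one hidden unit.
    """
    dims = list(zip(*samples))  # one tuple of values per dimension
    center = [median(col) for col in dims]
    mad = [median([abs(v - c) for v in col]) for col, c in zip(dims, center)]
    return center, mad
```

For example, with the five samples `[1, 10]`, `[2, 11]`, `[3, 12]`, `[100, 13]` and `[2.5, 500]`, the estimated center is `[2.5, 12]` and the scale is `[0.5, 1]`, so the two outlying values 100 and 500 have little effect, whereas a mean and standard deviation would be pulled strongly toward them.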