Results 1  10
of
259,152
Greedy Function Approximation: A Gradient Boosting Machine
 Annals of Statistics
, 2000
"... Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest{descent minimization. A general gradient{descent \boosting" paradigm is developed for additi ..."
Abstract

Cited by 951 (12 self)
 Add to MetaCart
Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest{descent minimization. A general gradient{descent \boosting" paradigm is developed
Evolving Neural Networks through Augmenting Topologies
 Evolutionary Computation
"... An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixedtopology method on a challenging benchmark reinforcement learning task ..."
Abstract

Cited by 524 (113 self)
 Add to MetaCart
An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixedtopology method on a challenging benchmark reinforcement learning
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
, 1989
"... The exact form of a gradientfollowing learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal supervised learning tasks. These algorithms have: (1) the advantage that they do not require a precis ..."
Abstract

Cited by 529 (4 self)
 Add to MetaCart
the retention of information over time periods having either fixed or indefinite length. 1 Introduction A major problem in connectionist theory is to develop learning algorithms that can tap the full computational power of neural networks. Much progress has been made with feedforward networks, and attention
Gaussian processes for machine learning
 in: Adaptive Computation and Machine Learning
, 2006
"... Abstract. We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperpar ..."
Abstract

Cited by 631 (2 self)
 Add to MetaCart
Abstract. We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn
Active Learning with Statistical Models
, 1995
"... For manytypes of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992# Cohn, 1994]. We then showhow the same principles may be used to select data for two alternative, statist ..."
Abstract

Cited by 677 (12 self)
 Add to MetaCart
, statisticallybased learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Learning to rank using gradient descent
 In ICML
, 2005
"... We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data f ..."
Abstract

Cited by 510 (17 self)
 Add to MetaCart
We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data
Parallel Networks that Learn to Pronounce English Text
 COMPLEX SYSTEMS
, 1987
"... This paper describes NETtalk, a class of massivelyparallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed h ..."
Abstract

Cited by 548 (5 self)
 Add to MetaCart
This paper describes NETtalk, a class of massivelyparallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed
Reinforcement Learning I: Introduction
, 1998
"... In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection, search ..."
Abstract

Cited by 5500 (120 self)
 Add to MetaCart
In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection
Complete discrete 2D Gabor transforms by neural networks for image analysis and compression
, 1988
"... AbstractA threelayered neural network is described for transforming twodimensional discrete signals into generalized nonorthogonal 2D “Gabor ” representations for image analysis, segmentation, and compression. These transforms are conjoint spatiahpectral representations [lo], [15], which provide ..."
Abstract

Cited by 475 (8 self)
 Add to MetaCart
AbstractA threelayered neural network is described for transforming twodimensional discrete signals into generalized nonorthogonal 2D “Gabor ” representations for image analysis, segmentation, and compression. These transforms are conjoint spatiahpectral representations [lo], [15], which
A new learning algorithm for blind signal separation

, 1996
"... A new online learning algorithm which minimizes a statistical dependency among outputs is derived for blind separation of mixed signals. The dependency is measured by the average mutual information (MI) of the outputs. The source signals and the mixing matrix are unknown except for the number of ..."
Abstract

Cited by 614 (80 self)
 Add to MetaCart
of the sources. The GramCharlier expansion instead of the Edgeworth expansion is used in evaluating the MI. The natural gradient approach is used to minimize the MI. A novel activation function is proposed for the online learning algorithm which has an equivariant property and is easily implemented on a neural
Results 1  10
of
259,152