Results 1  10
of
62,878
A learning algorithm for Boltzmann machines
 Cognitive Science
, 1985
"... The computotionol power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections con allow a significant fraction of the knowledge of the system to be applied to an instance of a probl ..."
Abstract

Cited by 586 (13 self)
 Add to MetaCart
to a general learning rule for modifying the connection strengths so as to incorporate knowledge obout o task domain in on efficient way. We describe some simple examples in which the learning algorithm creates internal representations thot ore demonstrobly the most efficient way of using
Gaussian processes for machine learning
 in: Adaptive Computation and Machine Learning
, 2006
"... Abstract. We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperpar ..."
Abstract

Cited by 631 (2 self)
 Add to MetaCart
the hyperparameters using the marginal likelihood. We explain the practical advantages of Gaussian Process and end with conclusions and a look at the current trends in GP work. Supervised learning in the form of regression (for continuous outputs) and classification (for discrete outputs) is an important constituent
Greedy Function Approximation: A Gradient Boosting Machine
 Annals of Statistics
, 2000
"... Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest{descent minimization. A general gradient{descent \boosting" paradigm is developed for additi ..."
Abstract

Cited by 951 (12 self)
 Add to MetaCart
Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest{descent minimization. A general gradient{descent \boosting" paradigm is developed
Ensemble Methods in Machine Learning
 MULTIPLE CLASSIFIER SYSTEMS, LBCS1857
, 2000
"... Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include errorcorrecting output coding, Bagging, and boostin ..."
Abstract

Cited by 607 (3 self)
 Add to MetaCart
Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include errorcorrecting output coding, Bagging
Sparse Bayesian Learning and the Relevance Vector Machine
, 2001
"... This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classication tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vec ..."
Abstract

Cited by 958 (5 self)
 Add to MetaCart
vector machine' (RVM), a model of identical functional form to the popular and stateoftheart `support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer
Training restricted Boltzmann machines using approximations to the likelihood gradient
 Proceedings of the 25th international conference on Machine learning
, 2008
"... A new algorithm for training Restricted Boltzmann Machines is introduced. The algorithm, named Persistent Contrastive Divergence, is different from the standard Contrastive Divergence algorithms in that it aims to draw samples from almost exactly the model distribution. It is compared to some standa ..."
Abstract

Cited by 145 (3 self)
 Add to MetaCart
A new algorithm for training Restricted Boltzmann Machines is introduced. The algorithm, named Persistent Contrastive Divergence, is different from the standard Contrastive Divergence algorithms in that it aims to draw samples from almost exactly the model distribution. It is compared to some
SemiSupervised Learning Literature Survey
, 2006
"... We review the literature on semisupervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semisupervised learning. This document is a chapter ..."
Abstract

Cited by 757 (8 self)
 Add to MetaCart
We review the literature on semisupervised learning, which is an area in machine learning and more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semisupervised learning. This document is a
Just Relax: Convex Programming Methods for Identifying Sparse Signals in Noise
, 2006
"... This paper studies a difficult and fundamental problem that arises throughout electrical engineering, applied mathematics, and statistics. Suppose that one forms a short linear combination of elementary signals drawn from a large, fixed collection. Given an observation of the linear combination that ..."
Abstract

Cited by 496 (2 self)
 Add to MetaCart
that has been contaminated with additive noise, the goal is to identify which elementary signals participated and to approximate their coefficients. Although many algorithms have been proposed, there is little theory which guarantees that these algorithms can accurately and efficiently solve the problem
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW
, 1997
"... This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, ass ..."
Abstract

Cited by 594 (53 self)
 Add to MetaCart
This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias
Boltzmann machines
, 2007
"... A Boltzmann Machine is a network of symmetrically connected, neuronlike units that make stochastic decisions about whether to be on or off. Boltzmann machines have a simple learning algorithm that allows them to discover interesting features in datasets composed of binary vectors. The learning algor ..."
Abstract

Cited by 220 (21 self)
 Add to MetaCart
and are used to represent the cost function of an optimization problem. The stochastic dynamics of a Boltzmann machine then allow it to sample binary state vectors that represent good solutions to the optimization problem. For a learning problem, the Boltzmann machine is shown a set of binary data vectors
Results 1  10
of
62,878