Results 1  10
of
43
Gaussian Processes for Regression
 Advances in Neural Information Processing Systems 8
, 1996
"... The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparame ..."
Abstract

Cited by 219 (18 self)
 Add to MetaCart
The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters have been tested on a number of challenging problems and have produced excellent results.
Prediction With Gaussian Processes: From Linear Regression To Linear Prediction And Beyond
 Learning and Inference in Graphical Models
, 1997
"... The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. Th ..."
Abstract

Cited by 195 (4 self)
 Add to MetaCart
The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems. PREDICTION WITH GAUSSIAN PROCESSES: FROM LINEAR REGRESSION TO LINEAR PREDICTION AND BEYOND 2 1 Introduction In the last decade neural networks have been used to tackle regression and classification problems, with some notable successes. It has also been widely recognized that they form a part of a wide variety of nonlinear statistical techniques that can be used for...
Bayesian Classification with Gaussian Processes
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
"... We consider the problem of assigning an input vector x to one of m classes by predicting P (cjx) for c = 1; : : : ; m. For a twoclass problem, the probability of class 1 given x is estimated by oe(y(x)), where oe(y) = 1=(1 + e ). A Gaussian process prior is placed on y(x), and is combined wi ..."
Abstract

Cited by 130 (1 self)
 Add to MetaCart
We consider the problem of assigning an input vector x to one of m classes by predicting P (cjx) for c = 1; : : : ; m. For a twoclass problem, the probability of class 1 given x is estimated by oe(y(x)), where oe(y) = 1=(1 + e ). A Gaussian process prior is placed on y(x), and is combined with the training data to obtain predictions for new x points.
Gaussian processes for ordinal regression
 Journal of Machine Learning Research
, 2004
"... We present a probabilistic kernel approach to ordinal regression based on Gaussian processes. A threshold model that generalizes the probit function is used as the likelihood function for ordinal variables. Two inference techniques, based on the Laplace approximation and the expectation propagation ..."
Abstract

Cited by 68 (1 self)
 Add to MetaCart
We present a probabilistic kernel approach to ordinal regression based on Gaussian processes. A threshold model that generalizes the probit function is used as the likelihood function for ordinal variables. Two inference techniques, based on the Laplace approximation and the expectation propagation algorithm respectively, are derived for hyperparameter learning and model selection. We compare these two Gaussian process approaches with a previous ordinal regression method based on support vector machines on some benchmark and realworld data sets, including applications of ordinal regression to collaborative filtering and gene expression analysis. Experimental results on these data sets verify the usefulness of our approach.
Gaussian Processes  A Replacement for Supervised Neural Networks?
"... These lecture notes are based on the work of Neal (1996), Williams and ..."
Abstract

Cited by 53 (0 self)
 Add to MetaCart
These lecture notes are based on the work of Neal (1996), Williams and
Preference learning with gaussian processes
 In Proc. ICML*2005
, 2005
"... In this paper, we propose a probabilistic kernel approach to preference learning based on Gaussian processes. A new likelihood function is proposed to capture the preference relations in the Bayesian framework. The generalized formulation is also applicable to tackle many multiclass problems. The ov ..."
Abstract

Cited by 44 (3 self)
 Add to MetaCart
In this paper, we propose a probabilistic kernel approach to preference learning based on Gaussian processes. A new likelihood function is proposed to capture the preference relations in the Bayesian framework. The generalized formulation is also applicable to tackle many multiclass problems. The overall approach has the advantages of Bayesian methods for model selection and probabilistic prediction. Experimental results compared against the constraint classification approach on several benchmark datasets verify the usefulness of this algorithm. 1.
Bayesian Neural Networks and Density Networks
 Nuclear Instruments and Methods in Physics Research, A
, 1994
"... This paper reviews the Bayesian approach to learning in neural networks, then introduces a new adaptive model, the density network. This is a neural network for which target outputs are provided, but the inputs are unspecied. When a probability distribution is placed on the unknown inputs, a latent ..."
Abstract

Cited by 39 (8 self)
 Add to MetaCart
This paper reviews the Bayesian approach to learning in neural networks, then introduces a new adaptive model, the density network. This is a neural network for which target outputs are provided, but the inputs are unspecied. When a probability distribution is placed on the unknown inputs, a latent variable model is dened that is capable of discovering the underlying dimensionality of a data set. A Bayesian learning algorithm for these networks is derived and demonstrated. 1 Introduction to the Bayesian view of learning A binary classier is a parameterized mapping from an input x to an output y 2 [0; 1]); when its parameters w are specied, the classier states the probability that an input x belongs to class t = 1, rather than the alternative t = 0. Consider a binary classier which models the probability as a sigmoid function of x: P (t = 1jx; w;H) = y(x; w;H) = 1 1 + e wx (1) This form of model is known to statisticians as a linear logistic model, and in the neural networks ...
Bayesian Automatic Relevance Determination Algorithms For Classifying Gene Expression Data
, 2002
"... Motivation: We investigate two new Bayesian classification algorithms incorporating feature selection. These algorithms are applied to the classification of gene expression data derived from cDNA microarrays. ..."
Abstract

Cited by 38 (3 self)
 Add to MetaCart
Motivation: We investigate two new Bayesian classification algorithms incorporating feature selection. These algorithms are applied to the classification of gene expression data derived from cDNA microarrays.
Bayesian Restoration Using a New Nonstationary EdgePreserving Image Prior
 IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2006
"... In this paper, we propose a class of image restoration algorithms based on the Bayesian approach and a new hierarchical spatially adaptive image prior. The proposed prior has the following two desirable features. First, it models the local image discontinuities in different directions with a model w ..."
Abstract

Cited by 20 (4 self)
 Add to MetaCart
In this paper, we propose a class of image restoration algorithms based on the Bayesian approach and a new hierarchical spatially adaptive image prior. The proposed prior has the following two desirable features. First, it models the local image discontinuities in different directions with a model which is continuous valued. Thus, it preserves edges and generalizes the on/off (binary) line process idea used in previous image priors within the context of Markov random fields (MRFs). Second, it is Gaussian in nature and provides estimates that are easy to compute. Using this new hierarchical prior, two restoration algorithms are derived. The first is based on the maximum a posteriori principle and the second on the Bayesian methodology. Numerical experiments are presented that compare the proposed algorithms among themselves and with previous stationary and non stationary MRFbased with line process algorithms. These experiments demonstrate the advantages of the proposed prior.
Bayesian support vector regression using a unified loss function
 IEEE Transactions on Neural Networks
, 2004
"... In this paper, we use a unified loss function, called the soft insensitive loss function, for Bayesian support vector regression. We follow standard Gaussian processes for regression to set up the Bayesian framework, in which the unified loss function is used in the likelihood evaluation. Under this ..."
Abstract

Cited by 20 (2 self)
 Add to MetaCart
In this paper, we use a unified loss function, called the soft insensitive loss function, for Bayesian support vector regression. We follow standard Gaussian processes for regression to set up the Bayesian framework, in which the unified loss function is used in the likelihood evaluation. Under this framework, the maximum a posteriori estimate of the function values corresponds to the solution of an extended support vector regression problem. The overall approach has the merits of support vector regression such as convex quadratic programming and sparsity in solution representation. It also has the advantages of Bayesian methods for model adaptation and error bars of its predictions. Experimental results on simulated and realworld data sets indicate that the approach works well even on large data sets.