PREDICTION WITH GAUSSIAN PROCESSES: FROM LINEAR REGRESSION TO LINEAR PREDICTION AND BEYOND

The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in Section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.

1 Introduction

In the last decade neural networks have been used to tackle regression and classification problems, with some notable successes. It has also been widely recognized that they form a part of a wide variety of non-linear statistical techniques that can be used for...
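The change of viewpoint described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical quadratic basis `features`, a Gaussian prior on the weights with covariance `Sigma_p`, and Gaussian observation noise with variance `noise_var` (all names are illustrative). It computes the predictive mean and variance once from the posterior over parameters, and once from a Gaussian process whose covariance function is k(x, x') = phi(x)^T Sigma_p phi(x'), and checks that the two agree.

```python
import numpy as np

def features(x):
    # Hypothetical basis functions [1, x, x^2]; any fixed basis would do.
    return np.stack([np.ones_like(x), x, x**2], axis=1)  # shape (n, m)

def weight_space_predict(X, y, X_star, Sigma_p, noise_var):
    """Bayesian linear regression: prior w ~ N(0, Sigma_p) on the weights."""
    Phi, Phi_s = features(X), features(X_star)
    # Posterior precision of the weights.
    A = Phi.T @ Phi / noise_var + np.linalg.inv(Sigma_p)
    mean = Phi_s @ np.linalg.solve(A, Phi.T @ y) / noise_var
    # Variance of the latent function value at each test input.
    var = np.einsum("ij,ji->i", Phi_s, np.linalg.solve(A, Phi_s.T))
    return mean, var

def function_space_predict(X, y, X_star, Sigma_p, noise_var):
    """Gaussian process view with k(x, x') = phi(x)^T Sigma_p phi(x')."""
    Phi, Phi_s = features(X), features(X_star)
    K = Phi @ Sigma_p @ Phi.T            # train/train covariance
    K_s = Phi_s @ Sigma_p @ Phi.T        # test/train covariance
    K_ss = np.einsum("ij,jk,ik->i", Phi_s, Sigma_p, Phi_s)  # prior variances
    C = K + noise_var * np.eye(len(X))
    mean = K_s @ np.linalg.solve(C, y)
    var = K_ss - np.einsum("ij,ji->i", K_s, np.linalg.solve(C, K_s.T))
    return mean, var

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=20)
y = 0.5 - X + 0.3 * X**2 + rng.normal(scale=0.3, size=20)
X_star = np.linspace(-2.0, 2.0, 5)
Sigma_p = np.eye(3)
noise_var = 0.09

mean_w, var_w = weight_space_predict(X, y, X_star, Sigma_p, noise_var)
mean_f, var_f = function_space_predict(X, y, X_star, Sigma_p, noise_var)
print(np.allclose(mean_w, mean_f), np.allclose(var_w, var_f))
```

The weight-space route inverts a matrix whose size is the number of basis functions, while the function-space route inverts one whose size is the number of training points; the second form is the one that carries over to covariance functions with no finite basis.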