Results 1  10
of
38
A unifying view of sparse approximate Gaussian process regression
 Journal of Machine Learning Research
, 2005
"... We provide a new unifying view, including all existing proper probabilistic sparse approximations for Gaussian process regression. Our approach relies on expressing the effective prior which the methods are using. This allows new insights to be gained, and highlights the relationship between existin ..."
Abstract

Cited by 80 (3 self)
 Add to MetaCart
We provide a new unifying view, including all existing proper probabilistic sparse approximations for Gaussian process regression. Our approach relies on expressing the effective prior which the methods are using. This allows new insights to be gained, and highlights the relationship between existing methods. It also allows for a clear theoretically justified ranking of the closeness of the known approximations to the corresponding full GPs. Finally we point directly to designs of new better sparse approximations, combining the best of the existing strategies, within attractive computational constraints.
Implementing approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations: A manual for the inlaprogram
, 2008
"... Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothingspline models, statespace models, semiparametric regression, spatial and spatiotemp ..."
Abstract

Cited by 79 (16 self)
 Add to MetaCart
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothingspline models, statespace models, semiparametric regression, spatial and spatiotemporal models, logGaussian Coxprocesses, geostatistical and geoadditive models. In this paper we consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with nonGaussian response variables. The posterior marginals are not available in closed form due to the nonGaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, both in terms of convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations
Bayesian inference and optimal design in the sparse linear model
 Workshop on Artificial Intelligence and Statistics
"... The linear model with sparsityfavouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of ..."
Abstract

Cited by 62 (12 self)
 Add to MetaCart
The linear model with sparsityfavouring prior on the coefficients has important applications in many different domains. In machine learning, most methods to date search for maximum a posteriori sparse solutions and neglect to represent posterior uncertainties. In this paper, we address problems of Bayesian optimal design (or experiment planning), for which accurate estimates of uncertainty are essential. To this end, we employ expectation propagation approximate inference for the linear model with Laplace prior, giving new insight into numerical stability properties and proposing a robust algorithm. We also show how to estimate model hyperparameters by empirical Bayesian maximisation of the marginal likelihood, and propose ideas in order to scale up the method to very large underdetermined problems. We demonstrate the versatility of our framework on the application of gene regulatory network identification from microarray expression data, where both the Laplace prior and the active experimental design approach are shown to result in significant improvements. We also address the problem of sparse coding of natural images, and show how our framework can be used for compressive sensing tasks. Part of this work appeared in Seeger et al. (2007b). The gene network identification application appears in Steinke et al. (2007).
Variational Bayesian multinomial probit regression with Gaussian process priors
 Neural Computation
, 2005
"... It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression ..."
Abstract

Cited by 33 (10 self)
 Add to MetaCart
It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favour of Gaussian Process (GP) priors over functions, and employing variational approximations to the full posterior we obtain efficient computational methods for Gaussian Process classification in the multiclass setting 1. The model augmentation with additional latent variables ensures full a posteriori class coupling whilst retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multiclass Informative Vector Machines (IVM), emerge in a very natural and straightforward manner. This is the first time that a fully Variational Bayesian treatment for multiclass GP classification has been developed without having to resort to additional explicit approximations to the nonGaussian likelihood term. Empirical comparisons with exact analysis via MCMC and Laplace approximations illustrate the utility of the variational approximation as a computationally economic alternative to full MCMC and it is shown to be more accurate than the Laplace approximation. 1
Approximations for Binary Gaussian Process Classification
"... We provide a comprehensive overview of many recent algorithms for approximate inference in Gaussian process models for probabilistic binary classification. The relationships between several approaches are elucidated theoretically, and the properties of the different algorithms are corroborated by ex ..."
Abstract

Cited by 21 (2 self)
 Add to MetaCart
We provide a comprehensive overview of many recent algorithms for approximate inference in Gaussian process models for probabilistic binary classification. The relationships between several approaches are elucidated theoretically, and the properties of the different algorithms are corroborated by experimental results. We examine both 1) the quality of the predictive distributions and 2) the suitability of the different marginal likelihood approximations for model selection (selecting hyperparameters) and compare to a gold standard based on MCMC. Interestingly, some methods produce good predictive distributions although their marginal likelihood approximations are poor. Strong conclusions are drawn about the methods: The Expectation Propagation algorithm is almost always the method of choice unless the computational budget is very tight. We also extend existing methods in various ways, and provide unifying code implementing all approaches. Keywords: Gaussian process priors, probabilistic classification, Laplaces’s approximation, expectation propagation, variational bounding, mean field methods, marginal likelihood evidence,
Elliptical slice sampling
 JMLR: W&CP
"... Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it h ..."
Abstract

Cited by 18 (2 self)
 Add to MetaCart
Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.
Approximation Methods for Gaussian Process Regression
, 2007
"... A wealth of computationally efficient approximation methods for Gaussian process regression have been recently proposed. We give a unifying overview of sparse approximations, following QuiñoneroCandela and Rasmussen (2005), and a brief review of approximate matrixvector multiplication methods. 1 ..."
Abstract

Cited by 17 (3 self)
 Add to MetaCart
A wealth of computationally efficient approximation methods for Gaussian process regression have been recently proposed. We give a unifying overview of sparse approximations, following QuiñoneroCandela and Rasmussen (2005), and a brief review of approximate matrixvector multiplication methods. 1
Slice sampling covariance hyperparameters of latent Gaussian models
 IN ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 23
, 2010
"... The Gaussian process (GP) is a popular way to specify dependencies between random variables in a probabilistic model. In the Bayesian framework the covariance structure can be specified using unknown hyperparameters. Integrating over these hyperparameters considers different possible explanations fo ..."
Abstract

Cited by 13 (2 self)
 Add to MetaCart
The Gaussian process (GP) is a popular way to specify dependencies between random variables in a probabilistic model. In the Bayesian framework the covariance structure can be specified using unknown hyperparameters. Integrating over these hyperparameters considers different possible explanations for the data when making predictions. This integration is often performed using Markov chain Monte Carlo (MCMC) sampling. However, with nonGaussian observations standard hyperparameter sampling approaches require careful tuning and may converge slowly. In this paper we present a slice sampling approach that requires little tuning while mixing well in both strong and weakdata regimes.
Bayesian Gaussian Process Classification with the EMEP Algorithm
"... Abstract—Gaussian process classifiers (GPCs) are Bayesian probabilistic kernel classifiers. In GPCs, the probability of belonging to a certain class at an input location is monotonically related to the value of some latent function at that location. Starting from a Gaussian process prior over this l ..."
Abstract

Cited by 11 (1 self)
 Add to MetaCart
Abstract—Gaussian process classifiers (GPCs) are Bayesian probabilistic kernel classifiers. In GPCs, the probability of belonging to a certain class at an input location is monotonically related to the value of some latent function at that location. Starting from a Gaussian process prior over this latent function, data are used to infer both the posterior over the latent function and the values of hyperparameters to determine various aspects of the function. Recently, the expectation propagation (EP) approach has been proposed to infer the posterior over the latent function. Based on this work, we present an approximate EM algorithm, the EMEP algorithm, to learn both the latent function and the hyperparameters. This algorithm is found to converge in practice and provides an efficient Bayesian framework for learning hyperparameters of the kernel. A multiclass extension of the EMEP algorithm for GPCs is also derived. In the experimental results, the EMEP algorithms are as good or better than other methods for GPCs or Support Vector Machines (SVMs) with crossvalidation. Index Terms—Gaussian process classification, Bayesian methods, kernel methods, expectation propagation, EMEP algorithm. 1
Assesing approximations for gaussian process classification
 In Advances in Neural Information Processing Systems
, 2005
"... Gaussian processes are attractive models for probabilistic classification but unfortunately exact inference is analytically intractable. We compare Laplace’s method and Expectation Propagation (EP) focusing on marginal likelihood estimates and predictive performance. We explain theoretically and cor ..."
Abstract

Cited by 10 (2 self)
 Add to MetaCart
Gaussian processes are attractive models for probabilistic classification but unfortunately exact inference is analytically intractable. We compare Laplace’s method and Expectation Propagation (EP) focusing on marginal likelihood estimates and predictive performance. We explain theoretically and corroborate empirically that EP is superior to Laplace. We also compare to a sophisticated MCMC scheme and show that EP is surprisingly accurate. In recent years models based on Gaussian process (GP) priors have attracted much attention in the machine learning community. Whereas inference in the GP regression model with Gaussian noise can be done analytically, probabilistic classification using GPs is analytically intractable. Several approaches to approximate Bayesian inference have been suggested, including Laplace’s approximation, Expectation Propagation (EP), variational approximations and Markov chain Monte Carlo (MCMC) sampling, some of these in conjunction with generalisation bounds, online learning schemes and sparse approximations.