Variational heteroscedastic Gaussian process regression
 In 28th International Conference on Machine Learning (ICML 2011)
, 2011
Cited by 32 (5 self)
Standard Gaussian processes (GPs) model observation noise as constant throughout input space. This is often too restrictive an assumption, but one that is needed for GP inference to be tractable. In this work we present a nonstandard variational approximation that allows accurate inference in heteroscedastic GPs (i.e., under input-dependent noise conditions). Computational cost is roughly twice that of the standard GP, and also scales as O(n^3). Accuracy is verified by comparing with the gold standard, MCMC, and its effectiveness is illustrated on several synthetic and real datasets of diverse characteristics. An application to volatility forecasting is also considered.
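To make the distinction between constant and input-dependent noise concrete, here is a minimal sketch of the GP predictive mean when each observation has its own noise variance. It is not the variational method of the paper: the per-point noise variances are assumed to be known, whereas the paper infers them. The RBF kernel, the function names, and the toy data are all illustrative assumptions. The Cholesky factorisation is the step responsible for the O(n^3) cost.

```python
import math


def rbf(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice for this sketch)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def chol_solve(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict_mean(xs, ys, noise_vars, x_star):
    """Predictive mean k_*^T (K + diag(noise_vars))^{-1} y.

    A homoscedastic GP would pass the same variance for every point;
    here noise_vars[i] may differ per input (assumed known).
    """
    n = len(xs)
    k = [[rbf(xs[i], xs[j]) + (noise_vars[i] if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = chol_solve(cholesky(k), ys)
    return sum(rbf(x_star, xs[i]) * alpha[i] for i in range(n))
```

With two well-separated observations of the same value, the prediction near the low-noise point stays close to the observed value, while the prediction near the high-noise point is pulled toward the zero prior mean.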
The Variational Gaussian Approximation Revisited
, 2009
Cited by 27 (0 self)
The variational approximation of posterior distributions by multivariate Gaussians has been much less popular in the Machine Learning community compared to the corresponding approximation by factorising distributions. This is for a good reason: the Gaussian approximation is in general plagued by an O(N^2) number of variational parameters to be optimised, N being the number of random variables. In this work, we discuss the relationship between the Laplace and the variational approximation and we show that for models with Gaussian priors and factorising likelihoods, the number of variational parameters is actually O(N). The approach is applied to Gaussian process regression with non-Gaussian likelihoods.
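The O(N) parameter count can be seen from the stationarity conditions of the variational bound. The following is a sketch under assumed notation (latent vector f with prior N(0, K), likelihood factorising over the f_n, and approximation q(f) = N(m, S)); none of these symbols appear in the abstract itself.

```latex
% Evidence lower bound for q(f) = N(m, S), prior N(0, K):
\mathcal{L}(m, S) = \sum_{n=1}^{N} \mathbb{E}_{q}\!\left[\log p(y_n \mid f_n)\right]
  - \mathrm{KL}\!\left(\mathcal{N}(m, S) \,\|\, \mathcal{N}(0, K)\right)

% Each expectation depends on q only through m_n and S_{nn}, so setting
% the gradients with respect to m and S to zero gives
m = K g, \qquad g_n = \frac{\partial}{\partial m_n}
  \mathbb{E}_{q}\!\left[\log p(y_n \mid f_n)\right]

S^{-1} = K^{-1} + \Lambda, \qquad
  \Lambda = -2\,\operatorname{diag}\!\left(
  \frac{\partial}{\partial S_{nn}} \mathbb{E}_{q}\!\left[\log p(y_n \mid f_n)\right]\right)

% Hence the optimum is described by the N entries of g and the N diagonal
% entries of \Lambda: 2N free parameters instead of N + N(N+1)/2.
```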
Nonconjugate Variational Message Passing for Multinomial and Binary Regression
Cited by 17 (1 self)
Variational Message Passing (VMP) is an algorithmic implementation of the Variational Bayes (VB) method which applies only in the special case of conjugate exponential family models. We propose an extension to VMP, which we refer to as Nonconjugate Variational Message Passing (NCVMP), which aims to alleviate this restriction while maintaining modularity, allowing choice in how expectations are calculated, and integrating into an existing message-passing framework: Infer.NET. We demonstrate NCVMP on logistic binary and multinomial regression. In the multinomial case we introduce a novel variational bound for the softmax factor which is tighter than other commonly used bounds whilst maintaining computational tractability.
Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression. Bayesian Analysis
, 2013
Cited by 15 (4 self)
We propose a general algorithm for approximating nonstandard Bayesian posterior distributions. The algorithm minimizes the Kullback-Leibler divergence of an approximating distribution to the intractable posterior distribution. Our method can be used to approximate any posterior distribution, provided that it is given in closed form up to the proportionality constant. The approximation can be any distribution in the exponential family or any mixture of such distributions, which means that it can be made arbitrarily precise. Several examples illustrate the speed and accuracy of our approximation method in practice.
Concave Gaussian Variational Approximations for Inference in Large-Scale Bayesian Linear Models
Cited by 13 (4 self)
Two popular approaches to forming bounds in approximate Bayesian inference are local variational methods and minimal Kullback-Leibler divergence methods. For a large class of models we explicitly relate the two approaches, showing that the local variational method is equivalent to a weakened form of Kullback-Leibler Gaussian approximation. This gives a strong motivation to develop efficient methods for KL minimisation. An important and previously unproven property of the KL variational Gaussian bound is that it is a concave function in the parameters of the Gaussian for log-concave sites. This observation, along with compact concave parametrisations of the covariance, enables us to develop fast scalable optimisation procedures to obtain lower bounds on the marginal likelihood in large-scale Bayesian linear models. 1 BAYESIAN MODELS For parameter w and data D, a large class of Bayesian models describe posteriors of the form $p(w \mid D) = \frac{1}{Z} \mathcal{N}(w \mid \mu, \Sigma)\, \phi(w)$ (1.1), with $Z = \int \mathcal{N}(w \mid \mu, \Sigma)\, \phi(w)\, dw$, for a Gaussian factor $\mathcal{N}(w \mid \mu, \Sigma)$ and positive potential function $\phi(w)$. This class includes generalised linear models, see e.g. Hardin and Hilbe (2007), and Gaussian noise models in inverse modeling, see e.g. Wipf and Nagarajan (2009). A classic example is Bayesian logistic regression in which $\mathcal{N}(w \mid \mu, \Sigma)$ is the
Collaborative Gaussian processes for preference learning
 In NIPS
, 2012
Cited by 12 (4 self)
We present a new model based on Gaussian processes (GPs) for learning pairwise preferences expressed by multiple users. Inference is simplified by using a preference kernel for GPs which allows us to combine supervised GP learning of user preferences with unsupervised dimensionality reduction for multi-user systems. The model not only exploits collaborative information from the shared structure in user behavior, but may also incorporate user features if they are available. Approximate inference is implemented using a combination of expectation propagation and variational Bayes. Finally, we present an efficient active learning strategy for querying preferences. The proposed technique performs favorably on real-world data against state-of-the-art multi-user preference learning algorithms.
Bayesian Modeling with Gaussian Processes using the GPstuff Toolbox
, 2014
Cited by 12 (1 self)
Gaussian processes (GP) are powerful tools for probabilistic modeling purposes. They can be used to define prior distributions over latent functions in hierarchical Bayesian models. The prior over functions is defined implicitly by the mean and covariance function, which determine the smoothness and variability of the function. The inference can then be conducted directly in the function space by evaluating or approximating the posterior process. Despite their attractive theoretical properties, GPs pose practical challenges in their implementation. GPstuff is a versatile collection of computational tools for GP models compatible with Linux and Windows MATLAB and Octave. It includes, among others, various inference methods, sparse approximations and tools for model assessment. In this work, we review these tools and demonstrate the use of GPstuff in several models.
Robust Gaussian Process Regression with a Student-t Likelihood
Cited by 12 (3 self)
This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model, which has a non-log-concave likelihood. The challenge with the Student-t model is the analytically intractable inference, which is why several approximative methods have been proposed. Expectation propagation (EP) has been found to be a very accurate method in many empirical studies, but the convergence of EP is known to be problematic with models containing non-log-concave site functions. In this paper we illustrate the situations where standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters and show that standard EP may not converge in the MAP values with some difficult data sets. We present a robust implementation which relies primarily on parallel EP updates and uses a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of EP is compared with Laplace, variational Bayes, and Markov chain Monte Carlo approximations. Keywords: Gaussian process, robust regression, Student-t distribution, approximate inference, expectation propagation
Fast Dual Variational Inference for Non-Conjugate Latent Gaussian Models
Cited by 11 (2 self)
Latent Gaussian models (LGMs) are widely used in statistics and machine learning. Bayesian inference in non-conjugate LGMs is difficult due to intractable integrals involving the Gaussian prior and non-conjugate likelihoods. Algorithms based on variational Gaussian (VG) approximations are widely employed since they strike a favorable balance between accuracy, generality, speed, and ease of use. However, the structure of the optimization problems associated with these approximations remains poorly understood, and standard solvers take too long to converge. We derive a novel dual variational inference approach that exploits the convexity property of the VG approximations. We obtain an algorithm that solves a convex optimization problem, reduces the number of variational parameters, and converges much faster than previous methods. Using real-world data, we demonstrate these advantages on a variety of LGMs, including Gaussian
Fast Convergent Algorithms for Expectation Propagation Approximate Bayesian Inference
Cited by 9 (0 self)
We propose a novel algorithm to solve the expectation propagation relaxation of Bayesian inference for continuous-variable graphical models. In contrast to most previous algorithms, our method is provably convergent. By marrying convergent EP ideas from [15] with covariance decoupling techniques [23, 13], it runs at least an order of magnitude faster than the most common EP solver.