## A Variational Bayesian Framework for Graphical Models (2000)

Venue: Advances in Neural Information Processing Systems 12

Citations: 189 (6 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Attias00avariational,
  author    = {Hagai Attias},
  title     = {A Variational Bayesian Framework for Graphical Models},
  booktitle = {Advances in Neural Information Processing Systems 12},
  year      = {2000},
  pages     = {209--215},
  publisher = {MIT Press}
}
```

### Abstract

This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models. Our approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner. These posteriors fall out of a free-form optimization procedure, which naturally incorporates conjugate priors. Unlike in large sample approximations, the posteriors are generally non-Gaussian and no Hessian needs to be computed. Predictive quantities are obtained analytically. The resulting algorithm generalizes the standard Expectation Maximization algorithm, and its convergence is guaranteed. We demonstrate that this approach can be applied to a large class of models in several domains, including mixture models and source separation.

**1 Introduction** A standard method to learn a graphical model¹ from data is maximum likelihood (ML). Given a training dataset, ML estimates a single optimal value f...
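The abstract's claim that VB generalizes EM via free-form optimization with conjugate priors can be illustrated in the simplest conjugate case. The sketch below is not the paper's algorithm but a minimal illustration, assuming NumPy and SciPy are available and with all names and the toy model invented here: it infers a Dirichlet posterior over the mixing weights of a two-component mixture with fixed, known components. The E-step uses E[log π] under the current Dirichlet posterior (via the digamma function) instead of a point estimate, which is what distinguishes the VB iteration from plain EM; the M-step's free-form optimum is again a Dirichlet, by conjugacy.

```python
# Hypothetical minimal sketch of a VB-EM loop: Dirichlet posterior over the
# mixing weights of a two-component Gaussian mixture with fixed components.
# Illustrative only -- not the paper's general algorithm.
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)

# Toy data: two well-separated fixed Gaussian components, 70/30 mixing.
means, std = np.array([-2.0, 2.0]), 1.0
y = np.concatenate([rng.normal(means[0], std, 70),
                    rng.normal(means[1], std, 30)])

alpha0 = np.ones(2)              # conjugate Dirichlet prior on the weights
alpha = alpha0.copy()
for _ in range(50):
    # E-step: responsibilities use E[log pi] under the Dirichlet posterior
    # (digamma terms), not a point estimate of pi.
    log_pi = digamma(alpha) - digamma(alpha.sum())
    log_lik = -0.5 * ((y[:, None] - means[None, :]) / std) ** 2
    log_r = log_pi + log_lik
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: the free-form optimum is again Dirichlet (conjugacy), so the
    # update is an analytical count update, mirroring the abstract's
    # "free-form optimization" with conjugate priors.
    alpha = alpha0 + r.sum(axis=0)

post_mean = alpha / alpha.sum()  # posterior mean of the mixing weights
```

With well-separated components the responsibilities are nearly hard assignments, so the posterior mean of the weights lands close to the true 0.7/0.3 split while retaining Dirichlet uncertainty around it.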

### Citations

476 | Probabilistic principal component analysis
- Tipping, Bishop
- 1999
Citation Context: ...[U]nfortunately, computations in the Bayesian framework are intractable even for very simple cases (e.g., factor analysis; see [2]). [Footnote 1: We use the term `model' to refer collectively to parameters and structure.] Most existing approximation methods fall into two classes [3]: Markov chain Monte Carlo methods and large sample methods (e.g., Laplace approximation). MCMC methods attempt to achieve exact results...

440 | On Bayesian analysis of mixtures with an unknown number of components
- Richardson, Green
- 1997
Citation Context: ...ture components are still open. Whereas in theory the Bayesian approach provides a solution, no satisfactory practical algorithm has emerged from the application of involved sampling techniques (e.g., [7]) and approximation methods [3] to this problem. We now present the solution provided by VB. We consider models of the form

$$p(y_n \mid \theta, m) = \sum_{s=1}^{m} p(y_n \mid s_n = s, \theta)\, p(s_n = s \mid \theta), \tag{7}$$

where $y_n$ deno...

221 | Independent factor analysis
- Attias
- 1999
Citation Context: ...spirit of [8]. (b) Let the parameter posterior q(θ) fall out of free-form optimization, as before. We illustrate this approach in the context of the blind source separation (BSS) problem (see, e.g., [1]). This problem is described by $y_n = H x_n + u_n$, where $x_n$ is an unobserved m-dimensional source vector at instance n, $H$ is an unknown mixing matrix, and the noise $u_n$ is Normally distributed with an unknown pre...
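The BSS generative model quoted in this context, yn = Hxn + un, is easy to make concrete. The sketch below, assuming NumPy and with all dimensions and names chosen for illustration, simulates one draw of the model: non-Gaussian (Laplacian) sources, an unknown mixing matrix, and Gaussian noise with precision ν.

```python
# Hypothetical simulation of the BSS generative model y_n = H x_n + u_n
# from the citation context. Dimensions and names are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
N, m, d = 500, 2, 3           # instances, source dim, observation dim

H = rng.normal(size=(d, m))   # unknown mixing matrix (here: simulated)
x = rng.laplace(size=(N, m))  # unobserved non-Gaussian sources x_n
nu = 100.0                    # noise precision
u = rng.normal(scale=nu ** -0.5, size=(N, d))
y = x @ H.T + u               # observed mixtures, one row per instance n
```

In the paper's VB treatment the posterior over H and ν is inferred from y alone; here the forward model is simulated only to make the notation of the snippet concrete.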

127 | Keeping neural networks simple by minimizing the description length of the weights
- Hinton, Camp
- 1993
Citation Context: ...riational Bayes (VB), a practical framework for Bayesian computations in graphical models. VB draws together variational ideas from intractable latent variable models [8] and from Bayesian inference [4,5,9], which, in turn, draw on the work of [6]. This framework facilitates analytical calculations of posterior distributions over the hidden variables, parameters and structures. The posteriors fall out o...

52 | A view of the EM algorithm that justifies incremental, sparse, and other variants
- Neal, Hinton
- 1998
Citation Context: ...or Bayesian computations in graphical models. VB draws together variational ideas from intractable latent variable models [8] and from Bayesian inference [4,5,9], which, in turn, draw on the work of [6]. This framework facilitates analytical calculations of posterior distributions over the hidden variables, parameters and structures. The posteriors fall out of a free-form optimization procedure whic...

30 | Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables
- Chickering, Heckerman
- 1997
Citation Context: ...table even for very simple cases (e.g., factor analysis; see [2]). [Footnote 1: We use the term `model' to refer collectively to parameters and structure.] Most existing approximation methods fall into two classes [3]: Markov chain Monte Carlo methods and large sample methods (e.g., Laplace approximation). MCMC methods attempt to achieve exact results but typically require vast computational resources, and become ...

15 | Bayesian logistic regression: A variational approach
- Jaakkola, Jordan
- 1997
Citation Context: ...riational Bayes (VB), a practical framework for Bayesian computations in graphical models. VB draws together variational ideas from intractable latent variable models [8] and from Bayesian inference [4,5,9], which, in turn, draw on the work of [6]. This framework facilitates analytical calculations of posterior distributions over the hidden variables, parameters and structures. The posteriors fall out o...

13 | Mean field theory for sigmoid belief networks
- Saul, Jaakkola, et al.
- 1996
Citation Context: ...sive. In this paper I present Variational Bayes (VB), a practical framework for Bayesian computations in graphical models. VB draws together variational ideas from intractable latent variable models [8] and from Bayesian inference [4,5,9], which, in turn, draw on the work of [6]. This framework facilitates analytical calculations of posterior distributions over the hidden variables, parameters and s...

2 | Bayesian Methods for Mixtures of Experts (NIPS 8)
- Waterhouse
- 1995
Citation Context: ...riational Bayes (VB), a practical framework for Bayesian computations in graphical models. VB draws together variational ideas from intractable latent variable models [8] and from Bayesian inference [4,5,9], which, in turn, draw on the work of [6]. This framework facilitates analytical calculations of posterior distributions over the hidden variables, parameters and structures. The posteriors fall out o...