Results 1 - 5 of 5
Loopy Belief Propagation for Approximate Inference: An Empirical Study
In Proceedings of Uncertainty in AI, 1999
Cited by 463 (18 self)
Recently, researchers have demonstrated that "loopy belief propagation" (the use of Pearl's polytree algorithm in a Bayesian network with loops) can perform well in the context of error-correcting codes. The most dramatic instance of this is the near Shannon-limit performance of "Turbo Codes", codes whose decoding algorithm is equivalent to loopy belief propagation in a chain-structured Bayesian network. In this paper we ask: is there something special about the error-correcting code context, or does loopy propagation work as an approximate inference scheme in a more general setting? We compare the marginals computed using loopy propagation to the exact ones in four Bayesian network architectures, including two real-world networks: ALARM and QMR. We find that the loopy beliefs often converge and, when they do, they give a good approximation to the correct marginals. However, on the QMR network, the loopy beliefs oscillated and had no obvious relationship ...
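The comparison this abstract describes (loopy marginals against exact ones) can be illustrated on a toy problem. The sketch below is not the paper's experimental setup: it swaps the Bayesian-network formulation for sum-product message passing on a small pairwise binary Markov random field with a single loop, and every name in it (`loopy_bp`, `exact_marginals`, the potentials) is hypothetical and chosen only for illustration.

```python
import itertools

def loopy_bp(n, edges, unary, pairwise, iters=50):
    """Synchronous sum-product on a pairwise binary MRF (illustrative sketch)."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msg[(i, j)][x_j] is the message from node i to node j, initialised uniform
    msg = {(i, j): [0.5, 0.5] for i in range(n) for j in nbrs[i]}

    def psi(i, j, xi, xj):
        # Look up the edge potential regardless of edge orientation
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    term = unary[i][xi] * psi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:  # exclude the message coming from the target
                            term *= msg[(k, i)][xi]
                    total += term
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        msg = new

    beliefs = []
    for i in range(n):
        b = [unary[i][0], unary[i][1]]
        for k in nbrs[i]:
            b = [b[x] * msg[(k, i)][x] for x in (0, 1)]
        z = sum(b)
        beliefs.append([v / z for v in b])
    return beliefs

def exact_marginals(n, edges, unary, pairwise):
    """Brute-force marginals by enumerating all 2**n joint states."""
    marg = [[0.0, 0.0] for _ in range(n)]
    for x in itertools.product((0, 1), repeat=n):
        w = 1.0
        for i in range(n):
            w *= unary[i][x[i]]
        for (i, j) in edges:
            w *= pairwise[(i, j)][x[i]][x[j]]
        for i in range(n):
            marg[i][x[i]] += w
    z = sum(marg[0])
    return [[v / z for v in m] for m in marg]

# Assumed toy model: a 4-cycle with weak attractive couplings
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
unary = [[0.7, 0.3], [0.4, 0.6], [0.5, 0.5], [0.6, 0.4]]
pairwise = {e: [[1.2, 1.0], [1.0, 1.2]] for e in edges}

approx = loopy_bp(4, edges, unary, pairwise)
exact = exact_marginals(4, edges, unary, pairwise)
for i in range(4):
    print(i, round(approx[i][1], 4), round(exact[i][1], 4))
```

With weak couplings like these the messages settle quickly and the beliefs stay close to the exact marginals; the sketch does not reproduce the oscillation the abstract reports on QMR.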
Mean Field Theory for Sigmoid Belief Networks
Journal of Artificial Intelligence Research, 1996
Cited by 116 (12 self)
We develop a mean field theory for sigmoid belief networks based on ideas from statistical mechanics.
The Neural Autoregressive Distribution Estimator
In AISTATS'2011, 2011
Cited by 8 (1 self)
We describe a new approach for modeling the distribution of high-dimensional vectors of discrete variables. This model is inspired by the restricted Boltzmann machine (RBM), which has been shown to be a powerful model of such distributions. However, an RBM typically does not provide a tractable distribution estimator, since evaluating the probability it assigns to some given observation requires the computation of the so-called partition function, which itself is intractable for RBMs of even moderate size. Our model circumvents this difficulty by decomposing the joint distribution of observations into tractable conditional distributions and modeling each conditional using a nonlinear function similar to a conditional of an RBM. Our model can also be interpreted as an autoencoder wired such that its output can be used to assign valid probabilities to observations. We show that this new model outperforms other multivariate binary distribution estimators on several datasets and performs similarly to a large (but intractable) RBM.
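The decomposition into conditionals mentioned in this abstract can be sketched in a few lines. The code below is a minimal illustration, assuming binary inputs, a fixed left-to-right ordering, and a sigmoid hidden layer whose input weights are shared across conditionals; the parameter names `W`, `V`, `b`, `c` and the random values are assumptions made for the example and are not taken from the abstract.

```python
import itertools
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def nade_log_prob(v, W, V, b, c):
    """log p(v) = sum_i log p(v_i | v_<i) for a binary vector v (sketch)."""
    D, H = len(v), len(c)
    a = list(c)  # hidden pre-activations given v_<i; starts at the hidden bias
    logp = 0.0
    for i in range(D):
        h = [sigmoid(a_k) for a_k in a]
        # Conditional probability that v_i = 1 given the earlier entries
        p_i = sigmoid(b[i] + sum(V[i][k] * h[k] for k in range(H)))
        logp += math.log(p_i if v[i] == 1 else 1.0 - p_i)
        # Fold v_i into the running pre-activation for the next conditional
        for k in range(H):
            a[k] += W[k][i] * v[i]
    return logp

# Hypothetical small model with random (untrained) parameters
rng = random.Random(0)
D, H = 4, 3
W = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
V = [[rng.gauss(0, 1) for _ in range(H)] for _ in range(D)]
b = [rng.gauss(0, 1) for _ in range(D)]
c = [rng.gauss(0, 1) for _ in range(H)]

# Summing p(v) over all 2**D binary vectors checks that the model is normalised
total = sum(math.exp(nade_log_prob(v, W, V, b, c))
            for v in itertools.product((0, 1), repeat=D))
print(total)
```

Because each factor is a proper Bernoulli conditional, the product is normalised by construction, so no partition function has to be computed; the enumeration at the end is only a sanity check that is feasible for tiny `D`.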
An Introduction to Variational Methods for Graphical Models
Machine Learning, 1998
Cited by 8 (0 self)
This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields). We present a number of examples of graphical models, including the QMR-DT database, the sigmoid belief network, the Boltzmann machine, and several variants of hidden Markov models, in which it is infeasible to run exact inference algorithms. We then introduce variational methods, which exploit laws of large numbers to transform the original graphical model into a simplified graphical model in which inference is efficient. Inference in the simplified model provides bounds on probabilities of interest in the original model. We describe a general framework for generating variational transformations based on convex duality. Finally, we return to the examples and demonstrate how variational algorithms can be formulated in each case.
A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models
We introduce a new perspective on spectral dimensionality reduction which views these methods as Gaussian Markov random fields (GRFs). Our unifying perspective is based on the maximum entropy principle, which is in turn inspired by maximum variance unfolding. The resulting model, which we call maximum entropy unfolding (MEU), is a nonlinear generalization of principal component analysis. We relate the model to Laplacian eigenmaps and isomap. We show that parameter fitting in the locally linear embedding (LLE) is approximate maximum likelihood MEU. We introduce a variant of LLE that performs maximum likelihood exactly: Acyclic LLE (ALLE). We show that MEU and ALLE are competitive with the leading spectral approaches on a robot navigation visualization and a human motion capture data set. Finally, the maximum likelihood perspective allows us to introduce a new approach to dimensionality reduction based on L1 regularization of the Gaussian random field via the graphical lasso.