Results 1–10 of 16
Graphical models, exponential families, and variational inference
 Foundations and Trends in Machine Learning
Abstract

Cited by 428 (27 self)
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
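The conjugate duality this abstract refers to has a compact statement. In the standard exponential-family notation — natural parameters \(\theta\), mean parameters \(\mu\), marginal polytope \(\mathcal{M}\), log-partition (cumulant) function \(A\), and its conjugate dual \(A^{*}\), which equals the negative entropy on \(\mathcal{M}\) — the variational representation of the log-partition function is:

```latex
A(\theta) \;=\; \sup_{\mu \in \mathcal{M}} \bigl\{ \langle \theta, \mu \rangle - A^{*}(\mu) \bigr\}
```

Mean-field methods restrict the supremum to a tractable inner approximation of \(\mathcal{M}\), while Bethe/sum-product and LP-based methods combine an outer bound on \(\mathcal{M}\) with an approximation to \(A^{*}\); this is the sense in which the algorithms listed above are exact or approximate forms of one representation.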
Tightening LP Relaxations for MAP using Message Passing
Abstract

Cited by 65 (10 self)
Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP configuration. The standard LP relaxation is not tight enough in many real-world problems, however, and this has led to the use of higher-order cluster-based LP relaxations. The computational cost increases exponentially with the size of the clusters and limits the number and type of clusters we can use. We propose to solve the cluster selection problem monotonically in the dual LP, iteratively selecting clusters with guaranteed improvement, and quickly re-solving with the added clusters by reusing the existing solution. Our dual message-passing algorithm finds the MAP configuration in protein side-chain placement, protein design, and stereo problems, in cases where the standard LP relaxation fails.
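To make the "standard LP relaxation" concrete, here is a minimal sketch (not the authors' solver) of the pairwise relaxation for a toy two-variable binary MRF, solved with `scipy.optimize.linprog`; the potentials are invented numbers. On a single edge the graph is a tree, so the relaxation is tight and the LP value equals the true MAP value.

```python
import numpy as np
from scipy.optimize import linprog

# Toy pairwise MRF: two binary variables x1, x2 with invented potentials.
# Pseudomarginal vector, in order:
#   mu1(0), mu1(1), mu2(0), mu2(1), mu12(00), mu12(01), mu12(10), mu12(11)
theta = np.array([0.0, 1.0,               # unary theta1(x1)
                  0.5, 0.0,               # unary theta2(x2)
                  1.0, 0.0, 0.0, 1.0])    # pairwise theta12 rewards agreement

# Local consistency ("pairwise polytope") constraints.
A_eq = np.array([
    [1, 1, 0, 0,  0,  0,  0,  0],   # mu1 normalizes to 1
    [0, 0, 1, 1,  0,  0,  0,  0],   # mu2 normalizes to 1
    [1, 0, 0, 0, -1, -1,  0,  0],   # sum_x2 mu12(0, x2) = mu1(0)
    [0, 1, 0, 0,  0,  0, -1, -1],   # sum_x2 mu12(1, x2) = mu1(1)
    [0, 0, 1, 0, -1,  0, -1,  0],   # sum_x1 mu12(x1, 0) = mu2(0)
    [0, 0, 0, 1,  0, -1,  0, -1],   # sum_x1 mu12(x1, 1) = mu2(1)
])
b_eq = np.array([1, 1, 0, 0, 0, 0])

# linprog minimizes, so negate to maximize <theta, mu>.
res = linprog(-theta, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
map_value = -res.fun
```

Brute-force enumeration of the four assignments gives a best score of 2.0 at (x1, x2) = (1, 1), which the LP matches here; cluster-based tightening, as in the paper, targets the loopy graphs where the LP value strictly exceeds the true MAP value.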
Divergence Measures and Message Passing
, 2005
Abstract

Cited by 48 (2 self)
This paper presents a unifying view of message-passing algorithms, as methods to approximate a complex Bayesian network by a simpler network with minimum information divergence. In this view, the difference between mean-field methods and belief propagation is not the amount of structure they model, but only the measure of loss they minimize (‘exclusive’ versus ‘inclusive’ Kullback-Leibler divergence). In each case, message passing arises by minimizing a localized version of the divergence, local to each factor. By examining these divergence measures, we can intuit the types of solution they prefer (symmetry-breaking, for example) and their suitability for different tasks. Furthermore, by considering a wider variety of divergence measures (such as alpha-divergences), we can achieve different complexity and performance goals.
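A toy three-state example (illustrative numbers, not from the paper) makes the exclusive/inclusive distinction tangible: the inclusive divergence KL(p‖q) stays finite when q spreads mass beyond p's support, while the exclusive divergence KL(q‖p) blows up wherever q keeps mass that p forbids — the zero-forcing behavior behind symmetry-breaking.

```python
import numpy as np

def kl(p, q):
    """Discrete KL(p || q), with the convention 0 * log(0 / q) = 0."""
    mask = p > 0
    with np.errstate(divide="ignore"):  # allow p/0 -> inf for the example below
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.5, 0.5, 0.0])   # target with a structural zero
q = np.array([0.4, 0.4, 0.2])   # approximation that spreads mass onto that zero

inclusive = kl(p, q)   # finite: q merely underweights p's support
exclusive = kl(q, p)   # infinite: q puts mass where p has none
```

Minimizing the exclusive form therefore drives q's mass off p's zeros (mode-seeking), while minimizing the inclusive form forces q to cover all of p's support.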
An edge deletion semantics for belief propagation and its practical impact on approximation quality
 In AAAI
, 2006
Abstract

Cited by 16 (9 self)
We show in this paper that the influential algorithm of iterative belief propagation can be understood in terms of exact inference on a polytree, which results from deleting enough edges from the original network. We show that deleting edges implies adding new parameters into a network, and that the iterations of belief propagation are searching for values of these new parameters which satisfy intuitive conditions that we characterize. The new semantics lead to the following question: Can one improve the quality of approximations computed by belief propagation by recovering some of the deleted edges, while keeping the network easy enough for exact inference? We show in this paper that the answer is yes, leading to another question: How do we choose which edges to recover? To answer, we propose a specific method based on mutual information which is motivated by the edge deletion semantics. Empirically, we provide experimental results showing that the quality of approximations can be improved without incurring much additional computational cost. We also show that recovering certain edges with low mutual information may not be worthwhile as they increase the computational complexity, without necessarily improving the quality of approximations.
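The mutual-information criterion for edge recovery can be sketched numerically. This is our own toy illustration with made-up joint tables, not the authors' code: score each deleted edge by the mutual information between its endpoints and recover the strongly coupled edges first.

```python
import numpy as np

def mutual_information(joint):
    """I(X; Y) in nats from a joint probability table P(x, y)."""
    px = joint.sum(axis=1, keepdims=True)   # marginal P(x)
    py = joint.sum(axis=0, keepdims=True)   # marginal P(y)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px * py)[mask])))

strong = np.array([[0.45, 0.05],    # strongly coupled endpoints
                   [0.05, 0.45]])
weak = np.array([[0.26, 0.24],      # nearly independent endpoints
                 [0.24, 0.26]])

# Recover edges in decreasing order of mutual information.
ranking = sorted([("strong", mutual_information(strong)),
                  ("weak", mutual_information(weak))],
                 key=lambda t: t[1], reverse=True)
```

The low-scoring edge is exactly the kind the abstract warns about: recovering it raises the cost of exact inference on the polytree without much improving the approximation.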
Join-Graph Propagation Algorithms
Abstract

Cited by 10 (6 self)
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl’s belief propagation algorithm (BP). We start with the bounded-inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), which combines both iteration and bounded inference. The algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that has established connections with approximation algorithms from statistical physics, and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms, on several classes of networks. We also provide insight into the accuracy of IBP and IJGP by relating these algorithms to well-known classes of constraint propagation schemes.
Loop corrections for approximate inference on factor graphs
 Journal of Machine Learning Research
Abstract

Cited by 7 (3 self)
We propose a method to improve approximate inference methods by correcting for the influence of loops in the graphical model. The method is a generalization and alternative implementation of a recent idea from Montanari and Rizzo (2005). It is applicable to arbitrary factor graphs, provided that the size of the Markov blankets is not too large. It consists of two steps: (i) an approximate inference method, for example, belief propagation, is used to approximate cavity distributions for each variable (i.e., probability distributions on the Markov blanket of a variable for a modified graphical model in which the factors involving that variable have been removed); (ii) all cavity distributions are improved by a message-passing algorithm that cancels out approximation errors by imposing certain consistency constraints. This loop correction (LC) method usually gives significantly better results than the original, uncorrected, approximate inference algorithm that is used to estimate the effect of loops. Indeed, we often observe that the loop-corrected error is approximately the square of the error of the uncorrected approximate inference method. In this article, we compare different variants of the loop correction method with other approximate inference methods on a variety of graphical models, including “real-world” networks, and conclude that the LC method generally obtains the most accurate results.
Approximate Inference in Graphical Models using LP Relaxations
, 2010
Abstract

Cited by 7 (0 self)
Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the variables. We present new approaches to approximate inference based on linear programming (LP) relaxations. Our algorithms optimize over the cycle relaxation of the marginal polytope, which we show to be closely related to the first lifting of the Sherali-Adams hierarchy, and which is significantly tighter than the pairwise LP relaxation. We show how to efficiently optimize over the cycle relaxation using a cutting-plane algorithm that iteratively introduces constraints into the relaxation. We provide a criterion to determine which constraints would be most helpful in tightening the relaxation, and give efficient algorithms for solving the search problem of finding the best cycle constraint to add according to this criterion.
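The constraints that separate the cycle relaxation from the pairwise one are cycle inequalities. The sketch below is our own illustration of the standard cycle-inequality check for binary models (not necessarily the thesis's implementation): for one cycle, every odd edge subset F must satisfy sum over e not in F of cut(e) plus sum over e in F of (1 - cut(e)) >= 1, where cut(e) is the pseudomarginal probability that the edge's endpoints disagree.

```python
from itertools import combinations

def cycle_inequality_violations(cut_probs, tol=1e-9):
    """Return the odd edge subsets F of one cycle whose inequality is violated."""
    n = len(cut_probs)
    violations = []
    for k in range(1, n + 1, 2):              # odd-sized subsets only
        for F in combinations(range(n), k):
            fset = set(F)
            lhs = sum((1 - c) if i in fset else c
                      for i, c in enumerate(cut_probs))
            if lhs < 1 - tol:                 # any integral labeling has lhs >= 1
                violations.append((F, lhs))
    return violations

# A pairwise LP can claim all three edges of a triangle are cut, but no
# binary labeling disagrees across every edge of an odd cycle:
bad = cycle_inequality_violations([1.0, 1.0, 1.0])
ok = cycle_inequality_violations([1.0, 1.0, 0.0])   # realizable: x = (0, 1, 0)
```

In a cutting-plane loop one would search the graph's cycles for the most violated such inequality and add it to the LP, which is the separation problem the abstract's criterion addresses.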
Cooled and Relaxed Survey Propagation for MRFs
Abstract

Cited by 3 (3 self)
We describe a new algorithm, Relaxed Survey Propagation (RSP), for finding MAP configurations in Markov random fields. We compare its performance with state-of-the-art algorithms including max-product belief propagation, its sequential tree-reweighted variant, residual (sum-product) belief propagation, and tree-structured expectation propagation. We show that it outperforms all of these approaches on Ising models with mixed couplings, as well as on a web person disambiguation task formulated as a supervised clustering problem.
Focusing Generalizations of Belief Propagation on Targeted Queries
Abstract

Cited by 2 (0 self)
A recent formalization of Iterative Belief Propagation (IBP) has shown that it can be understood as an exact inference algorithm on an approximate model that results from deleting every model edge. This formalization has led to (1) new realizations of Generalized Belief Propagation (GBP) in which edges are recovered incrementally to improve approximation quality, and (2) edge-recovery heuristics that are motivated by improving the approximation quality of all node marginals in a graphical model. In this paper, we propose new edge-recovery heuristics, which are focused on improving the approximations of targeted node marginals. The new heuristics are based on newly identified properties of edge deletion, and in turn IBP, which guarantee the exactness of edge deletion in simple and idealized cases. These properties also suggest new improvements to IBP approximations based on performing edge-by-edge corrections on targeted marginals, which are less costly than improvements based on edge recovery.
A conditional game for comparing approximations
 In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS-11)
, 2011
Abstract

Cited by 1 (0 self)
We present a “conditional game” to be played between two approximate inference algorithms. We prove that exact inference is an optimal strategy and demonstrate how the game can be used to estimate the relative accuracy of two different approximations in the absence of exact marginals.