Results 1 - 10
of
14
Graphical models, exponential families, and variational inference. Foundations Trends
- Ihler (ihler@ics.uci.edu), University of California, Irvine. Michael
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Abstract
-
Cited by 268 (20 self)
- Add to MetaCart
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide varietyof algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models. 1
Tightening LP Relaxations for MAP using Message Passing
"... Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP confi ..."
Abstract
-
Cited by 38 (8 self)
- Add to MetaCart
Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP configuration. The standard LP relaxation is not tight enough in many real-world problems, however, and this has lead to the use of higher order cluster-based LP relaxations. The computational cost increases exponentially with the size of the clusters and limits the number and type of clusters we can use. We propose to solve the cluster selection problem monotonically in the dual LP, iteratively selecting clusters with guaranteed improvement, and quickly re-solving with the added clusters by reusing the existing solution. Our dual message-passing algorithm finds the MAP configuration in protein sidechain placement, protein design, and stereo problems, in cases where the standard LP relaxation fails. 1
An edge deletion semantics for belief propagation and its practical impact on approximation quality
- In AAAI
, 2006
"... We show in this paper that the influential algorithm of iterative belief propagation can be understood in terms of exact inference on a polytree, which results from deleting enough edges from the original network. We show that deleting edges implies adding new parameters into a network, and that the ..."
Abstract
-
Cited by 11 (4 self)
- Add to MetaCart
We show in this paper that the influential algorithm of iterative belief propagation can be understood in terms of exact inference on a polytree, which results from deleting enough edges from the original network. We show that deleting edges implies adding new parameters into a network, and that the iterations of belief propagation are searching for values of these new parameters which satisfy intuitive conditions that we characterize. The new semantics lead to the following question: Can one improve the quality of approximations computed by belief propagation by recovering some of the deleted edges, while keeping the network easy enough for exact inference? We show in this paper that the answer is yes, leading to another question: How do we choose which edges to recover? To answer, we propose a specific method based on mutual information which is motivated by the edge deletion semantics. Empirically, we provide experimental results showing that the quality of approximations can be improved without incurring much additional computational cost. We also show that recovering certain edges with low mutual information may not be worthwhile as they increase the computational complexity, without necessarily improving the quality of approximations.
Join-Graph Propagation Algorithms
"... The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl’s belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph ..."
Abstract
-
Cited by 6 (5 self)
- Add to MetaCart
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearl’s belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), that combines both iteration and bounded inference. The algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that allowed connections with approximate algorithms from statistical physics and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms on several classes of networks. We also provide insight into the accuracy of IBP and IJGP by relating these algorithms to well known classes of constraint propagation schemes.
Loop corrections for approximate inference on factor graphs
- Journal of Machine Learning Research
"... We propose a method to improve approximate inference methods by correcting for the influence of loops in the graphical model. The method is a generalization and alternative implementation of a recent idea from Montanari and Rizzo (2005). It is applicable to arbitrary factor graphs, provided that the ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
We propose a method to improve approximate inference methods by correcting for the influence of loops in the graphical model. The method is a generalization and alternative implementation of a recent idea from Montanari and Rizzo (2005). It is applicable to arbitrary factor graphs, provided that the size of the Markov blankets is not too large. It consists of two steps: (i) an approximate inference method, for example, belief propagation, is used to approximate cavity distributions for each variable (i.e., probability distributions on the Markov blanket of a variable for a modified graphical model in which the factors involving that variable have been removed); (ii) all cavity distributions are improved by a message-passing algorithm that cancels out approximation errors by imposing certain consistency constraints. This loop correction (LC) method usually gives significantly better results than the original, uncorrected, approximate inference algorithm that is used to estimate the effect of loops. Indeed, we often observe that the loop-corrected error is approximately the square of the error of the uncorrected approximate inference method. In this article, we compare different variants of the loop correction method with other approximate inference methods on a variety of graphical models, including “real world ” networks, and conclude that the LC method generally obtains the most accurate results.
Focusing Generalizations of Belief Propagation on Targeted Queries
"... A recent formalization of Iterative Belief Propagation (IBP) has shown that it can be understood as an exact inference algorithm on an approximate model that results from deleting every model edge. This formalization has led to (1) new realizations of Generalized Belief Propagation (GBP) in which ed ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
A recent formalization of Iterative Belief Propagation (IBP) has shown that it can be understood as an exact inference algorithm on an approximate model that results from deleting every model edge. This formalization has led to (1) new realizations of Generalized Belief Propagation (GBP) in which edges are recovered incrementally to improve approximation quality, and (2) edgerecovery heuristics that are motivated by improving the approximation quality of all node marginals in a graphical model. In this paper, we propose new edge-recovery heuristics, which are focused on improving the approximations of targeted node marginals. The new heuristics are based on newly-identified properties of edge deletion, and in turn IBP, which guarantee the exactness of edge deletion in simple and idealized cases. These properties also suggest new improvements to IBP approximations which are based on performing edge-by-edge corrections on targeted marginals, which are less costly than improvements based on edge recovery.
Approximate Inference in Graphical Models using LP Relaxations
, 2010
"... Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing, to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the vari ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Graphical models such as Markov random fields have been successfully applied to a wide variety of fields, from computer vision and natural language processing, to computational biology. Exact probabilistic inference is generally intractable in complex models having many dependencies between the variables. We present new approaches to approximate inference based on linear programming (LP) relaxations. Our algorithms optimize over the cycle relaxation of the marginal polytope, which we show to be closely related to the first lifting of the Sherali-Adams hierarchy, and is significantly tighter than the pairwise LP relaxation. We show how to efficiently optimize over the cycle relaxation using a cutting-plane algorithm that iteratively introduces constraints into the relaxation. We provide a criterion to determine which constraints would be most helpful in tightening the relaxation, and give efficient algorithms for solving the search problem of finding the best cycle constraint to add according to this criterion.
Cooled and Relaxed Survey Propagation for MRFs
"... We describe a new algorithm, Relaxed Survey Propagation (RSP), for finding MAP configurations in Markov random fields. We compare its performance with state-of-the-art algorithms including the max-product belief propagation, its sequential tree-reweighted variant, residual (sum-product) belief propa ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
We describe a new algorithm, Relaxed Survey Propagation (RSP), for finding MAP configurations in Markov random fields. We compare its performance with state-of-the-art algorithms including the max-product belief propagation, its sequential tree-reweighted variant, residual (sum-product) belief propagation, and tree-structured expectation propagation. We show that it outperforms all approaches for Ising models with mixed couplings, as well as on a web person disambiguation task formulated as a supervised clustering problem. 1
A conditional game for comparing approximations
- In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS-11
, 2011
"... We present a “conditional game ” to be played between two approximate inference algorithms. We prove that exact inference is an optimal strategy and demonstrate how the game can be used to estimate the relative accuracy of two different approximations in the absence of exact marginals. 1 ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
We present a “conditional game ” to be played between two approximate inference algorithms. We prove that exact inference is an optimal strategy and demonstrate how the game can be used to estimate the relative accuracy of two different approximations in the absence of exact marginals. 1
Truncating the Loop Series Expansion for Belief Propagation
"... Recently, Chertkov and Chernyak (2006b) derived an exact expression for the partition sum (normalization constant) corresponding to a graphical model, which is an expansion around the belief propagation (BP) solution. By adding correction terms to the BP free energy, one for each “generalized loop ” ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Recently, Chertkov and Chernyak (2006b) derived an exact expression for the partition sum (normalization constant) corresponding to a graphical model, which is an expansion around the belief propagation (BP) solution. By adding correction terms to the BP free energy, one for each “generalized loop ” in the factor graph, the exact partition sum is obtained. However, the usually enormous number of generalized loops generally prohibits summation over all correction terms. In this article we introduce truncated loop series BP (TLSBP), a particular way of truncating the loop series of Chertkov & Chernyak by considering generalized loops as compositions of simple loops. We analyze the performance of TLSBP in different scenarios, including the Ising model on square grids and regular random graphs, and on PROMEDAS, a large probabilistic medical diagnostic system. We show that TLSBP often improves upon the accuracy of the BP solution, at the expense of increased computation time. We also show that the performance of TLSBP strongly depends on the degree of interaction between the variables. For weak interactions, truncating the series leads to significant improvements, whereas for strong interactions it can be ineffective, even if a high number of terms is considered.

