Results 1 - 10
of
32
Graphical models, exponential families, and variational inference. Foundations Trends
- Ihler (ihler@ics.uci.edu), University of California, Irvine. Michael
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Abstract
-
Cited by 268 (20 self)
- Add to MetaCart
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide varietyof algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models. 1
Fixing Max-Product: Convergent Message Passing Algorithms for MAP LP-Relaxations
"... We present a novel message passing algorithm for approximating the MAP problem in graphical models. The algorithm is similar in structure to max-product but unlike max-product it always converges, and can be proven to find the exact MAP solution in various settings. The algorithm is derived via bloc ..."
Abstract
-
Cited by 45 (9 self)
- Add to MetaCart
We present a novel message passing algorithm for approximating the MAP problem in graphical models. The algorithm is similar in structure to max-product but unlike max-product it always converges, and can be proven to find the exact MAP solution in various settings. The algorithm is derived via block coordinate descent in a dual of the LP relaxation of MAP, but does not require any tunable parameters such as step size or tree weights. We also describe a generalization of the method to cluster based potentials. The new method is tested on synthetic and real-world problems, and compares favorably with previous approaches. Graphical models are an effective approach for modeling complex objects via local interactions. In such models, a distribution over a set of variables is assumed to factor according to cliques of a graph with potentials assigned to each clique. Finding the assignment with highest probability in these models is key to using them in practice, and is often referred to as the MAP (maximum aposteriori) assignment problem. In the general case the problem is NP hard, with complexity exponential in the tree-width of the underlying graph.
Tightening LP Relaxations for MAP using Message Passing
"... Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP confi ..."
Abstract
-
Cited by 38 (8 self)
- Add to MetaCart
Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP configuration. The standard LP relaxation is not tight enough in many real-world problems, however, and this has lead to the use of higher order cluster-based LP relaxations. The computational cost increases exponentially with the size of the clusters and limits the number and type of clusters we can use. We propose to solve the cluster selection problem monotonically in the dual LP, iteratively selecting clusters with guaranteed improvement, and quickly re-solving with the added clusters by reusing the existing solution. Our dual message-passing algorithm finds the MAP configuration in protein sidechain placement, protein design, and stereo problems, in cases where the standard LP relaxation fails. 1
Message-passing for graph-structured linear programs: Proximal methods and rounding schemes
, 2008
"... The problem of computing a maximum a posteriori (MAP) configuration is a central computational challenge associated with Markov random fields. A line of work has focused on “tree-based ” linear programming (LP) relaxations for the MAP problem. This paper develops a family of super-linearly convergen ..."
Abstract
-
Cited by 14 (1 self)
- Add to MetaCart
The problem of computing a maximum a posteriori (MAP) configuration is a central computational challenge associated with Markov random fields. A line of work has focused on “tree-based ” linear programming (LP) relaxations for the MAP problem. This paper develops a family of super-linearly convergent algorithms for solving these LPs, based on proximal minimization schemes using Bregman divergences. As with standard messagepassing on graphs, the algorithms are distributed and exploit the underlying graphical structure, and so scale well to large problems. Our algorithms have a double-loop character, with the outer loop corresponding to the proximal sequence, and an inner loop of cyclic Bregman divergences used to compute each proximal update. Different choices of the Bregman divergence lead to conceptually related but distinct LP-solving algorithms. We establish convergence guarantees for our algorithms, and illustrate their performance via some simulations. We also develop two classes of graph-structured rounding schemes, randomized and deterministic, for obtaining integral configurations from the LP solutions. Our deterministic rounding schemes use a “re-parameterization ” property of our algorithms so that when the LP solution is integral, the MAP solution can be obtained even before the LP-solver converges to the optimum. We also propose a graph-structured randomized rounding scheme that applies to iterative LP solving algorithms in general. We analyze the performance of our rounding schemes, giving bounds on the number of iterations required, when the LP is integral, for the rounding schemes to obtain the MAP solution. These bounds are expressed in terms of the strength of the potential functions, and the energy gap, which measures how well the integral MAP solution is separated from other integral configurations. We also report simulations comparing these rounding schemes. 1
Norm-Product Belief Propagation: Primal-Dual Message-Passing for Approximate Inference
"... Abstract—Inference problems in graphical models can be represented as a constrained optimization of a free energy function. In this paper we treat both forms of probabilistic inference, estimating marginal probabilities of the joint distribution and finding the most probable assignment, through a un ..."
Abstract
-
Cited by 7 (3 self)
- Add to MetaCart
Abstract—Inference problems in graphical models can be represented as a constrained optimization of a free energy function. In this paper we treat both forms of probabilistic inference, estimating marginal probabilities of the joint distribution and finding the most probable assignment, through a unified messagepassing algorithm architecture. In particular we generalize the Belief Propagation (BP) algorithms of sum-product and maxproduct and tree-rewaighted (TRW) sum and max product algorithms (TRBP) and introduce a new set of convergent algorithms based on ”convex-free-energy ” and Linear-Programming (LP) relaxation as a zero-temprature of a convex-free-energy. The main idea of this work arises from taking a general perspective on the existing BP and TRBP algorithms while observing that they all are reductions from the basic optimization formula of f + ∑ i hi
MAP Estimation of Semi-Metric MRFs via Hierarchical Graph Cuts
"... We consider the task of obtaining the maximum a posteriori estimate of discrete pairwise random fields with arbitrary unary potentials and semimetric pairwise potentials. For this problem, we propose an accurate hierarchical move making strategy where each move is computed efficiently by solving an ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
We consider the task of obtaining the maximum a posteriori estimate of discrete pairwise random fields with arbitrary unary potentials and semimetric pairwise potentials. For this problem, we propose an accurate hierarchical move making strategy where each move is computed efficiently by solving an st-MINCUT problem. Unlike previous move making approaches, e.g. the widely used α-expansion algorithm, our method obtains the guarantees of the standard linear programming (LP) relaxation for the important special case of metric labeling. Unlike the existing LP relaxation solvers, e.g. interior-point algorithms or tree-reweighted message passing, our method is significantly faster as it uses only the efficient st-MINCUT algorithm in its design. Using both synthetic and real data experiments, we show that our technique outperforms several commonly used algorithms. 1
Constrained Approximate Maximum Entropy Learning of Markov Random Fields
"... Parameter estimation in Markov random fields (MRFs) is a difficult task, in which inference over the network is run in the inner loop of a gradient descent procedure. Replacing exact inference with approximate methods such as loopy belief propagation (LBP) can suffer from poor convergence. In this p ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Parameter estimation in Markov random fields (MRFs) is a difficult task, in which inference over the network is run in the inner loop of a gradient descent procedure. Replacing exact inference with approximate methods such as loopy belief propagation (LBP) can suffer from poor convergence. In this paper, we provide a different approach for combining MRF learning and Bethe approximation. We consider the dual of maximum likelihood Markov network learning — maximizing entropy with moment matching constraints — and then approximate both the objective and the constraints in the resulting optimization problem. Unlike previous work along these lines (Teh & Welling, 2003), our formulation allows parameter sharing between features in a general log-linear model, parameter regularization and conditional training. We show that piecewise training (Sutton & McCallum, 2005) is a very restricted special case of this formulation. We study two optimization strategies: one based on a single convex approximation and one that uses repeated convex approximations. We show results on several real-world networks that demonstrate that these algorithms can significantly outperform learning with loopy and piecewise. Our results also provide a framework for analyzing the trade-offs of different relaxations of the entropy objective and of the constraints. 1
Energy Minimization for Linear Envelope MRFs
"... Markov random fields with higher order potentials have emerged as a powerful model for several problems in computer vision. In order to facilitate their use, we propose a new representation for higher order potentials as upper and lower envelopes of linear functions. Our representation concisely mod ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
Markov random fields with higher order potentials have emerged as a powerful model for several problems in computer vision. In order to facilitate their use, we propose a new representation for higher order potentials as upper and lower envelopes of linear functions. Our representation concisely models several commonly used higher order potentials, thereby providing a unified framework for minimizing the corresponding Gibbs energy functions. We exploit this framework by converting lower envelope potentials to standard pairwise functions with the addition of a small number of auxiliary variables. This allows us to minimize energy functions with lower envelope potentials using conventional algorithms such as BP, TRW and α-expansion. Furthermore, we show how the minimization of energy functions with upper envelope potentials leads to a difficult minmax problem. We address this difficulty by proposing a new message passing algorithm that solves a linear programming relaxation of the problem. Although this is primarily a theoretical paper, we demonstrate the efficacy of our approach on the binary (fg/bg) segmentation problem. 1.
Polynomial linear programming with gaussian belief propagation
- in the 46th Allerton Conf. on Communications, Control and Computing
, 2008
"... Abstract—Interior-point methods are state-of-the-art algorithms for solving linear programming (LP) problems with polynomial complexity. Specifically, the Karmarkar algorithm typically solves LP problems in time O(n 3.5), where n is the number of unknown variables. Karmarkar’s celebrated algorithm i ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
Abstract—Interior-point methods are state-of-the-art algorithms for solving linear programming (LP) problems with polynomial complexity. Specifically, the Karmarkar algorithm typically solves LP problems in time O(n 3.5), where n is the number of unknown variables. Karmarkar’s celebrated algorithm is known to be an instance of the log-barrier method using the Newton iteration. The main computational overhead of this method is in inverting the Hessian matrix of the Newton iteration. In this contribution, we propose the application of the Gaussian belief propagation (GaBP) algorithm as part of an efficient and distributed LP solver that exploits the sparse and symmetric structure of the Hessian matrix and avoids the need for direct matrix inversion. This approach shifts the computation from realm of linear algebra to that of probabilistic inference on graphical models, thus applying GaBP as an efficient inference engine. Our construction is general and can be used for any interior-point algorithm which uses the Newton method, including non-linear program solvers. I.

