Results 1 - 10 of 585
Graphical models, exponential families, and variational inference (2008)
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Cited by 819 (28 self)
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances, including the key problems of computing marginals and modes of probability distributions, are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms (among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations) can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
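For orientation, the conjugate duality the abstract refers to yields the following standard variational representation of the log partition (cumulant) function A for an exponential family with natural parameter \theta; the notation here is a conventional sketch, not quoted from the paper:

    A(\theta) = \sup_{\mu \in \mathcal{M}} \left\{ \langle \theta, \mu \rangle - A^*(\mu) \right\}

where \mathcal{M} is the set of realizable mean parameters and A^* is the conjugate dual of A (the negative entropy on the interior of \mathcal{M}). Exact and approximate inference algorithms then arise from relaxing \mathcal{M} and approximating A^*.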
A New Class of Upper Bounds on the Log Partition Function (Uncertainty in Artificial Intelligence, 2002)
"... Bounds on the log partition function are important in a variety of contexts, including approximate inference, model fitting, decision theory, and large deviations analysis [11, 5, 4]. We introduce a new class of upper bounds on the log partition function, based on convex combinations of distribution ..."
Cited by 225 (32 self)
Bounds on the log partition function are important in a variety of contexts, including approximate inference, model fitting, decision theory, and large deviations analysis [11, 5, 4]. We introduce a new class of upper bounds on the log partition function, based on convex combinations of distributions in the exponential domain, that is applicable to an arbitrary undirected graphical model. In the special case of convex combinations of tree-structured distributions, we obtain a family of variational problems, similar to the Bethe free energy, but distinguished by the following desirable properties: (i) they are convex, and have a unique global minimum; and (ii) the global minimum gives an upper bound on the log partition function. The global minimum is defined by stationary conditions very similar to those defining fixed points of belief propagation (BP) or tree-based reparameterization [see 13, 14]. As with BP fixed points, the elements of the minimizing argument can be used as approximations to the marginals of the original model. The analysis described here can be extended to structures of higher treewidth (e.g., hypertrees), thereby making connections with more advanced approximations (e.g., Kikuchi and variants [15, 10]).
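The basic bound follows from convexity of the log partition function A. If tree-structured parameter vectors \theta(T) and weights \rho(T) \ge 0 with \sum_T \rho(T) = 1 satisfy \sum_T \rho(T)\,\theta(T) = \theta, then Jensen's inequality gives (a standard derivation, sketched here for context):

    A(\theta) = A\Big( \sum_T \rho(T)\, \theta(T) \Big) \le \sum_T \rho(T)\, A(\theta(T)),

and each A(\theta(T)) is tractable because \theta(T) is supported on a tree. Optimizing how \theta is split across trees tightens the bound.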
Collective classification in network data (2008)
"... Numerous real-world applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification te ..."
Cited by 178 (32 self)
Numerous real-world applications produce networked data such as web data (hypertext documents connected via hyperlinks) and communication networks (people connected via communication links). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such data. In this report, we attempt to provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.
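As a concrete illustration of one such inference method, here is a minimal sketch of the iterative classification algorithm (ICA), one of the approaches commonly covered in this literature. The function names, the neighbor-label histogram features, and the assumption of a pre-trained classifier clf with a scikit-learn-style predict method are illustrative choices, not the authors' code:

    import numpy as np

    def ica(graph, feats, clf, labeled, n_classes, n_iters=10):
        # graph: {node: [neighbor nodes]}; feats: {node: 1-D attribute array};
        # labeled: {node: int label} for the known nodes; clf is a fitted
        # classifier whose input is [attributes | neighbor-label histogram].
        labels = dict(labeled)                     # known labels stay fixed
        unlabeled = [v for v in graph if v not in labeled]
        for v in unlabeled:                        # bootstrap: empty histogram
            x = np.concatenate([feats[v], np.zeros(n_classes)])
            labels[v] = int(clf.predict(x[None, :])[0])
        for _ in range(n_iters):                   # re-classify each node using
            for v in unlabeled:                    # its neighbors' current labels
                hist = np.bincount([labels[u] for u in graph[v]],
                                   minlength=n_classes).astype(float)
                x = np.concatenate([feats[v], hist])
                labels[v] = int(clf.predict(x[None, :])[0])
        return labels

The loop re-estimates each unlabeled node from its neighbors' current predictions until the labeling stabilizes, which is the core idea shared by the algorithms the report compares.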
Message passing algorithms for compressed sensing: I. motivation and construction (Proc. ITW, 2010)
"... Abstract—In a recent paper, the authors proposed a new class of low-complexity iterative thresholding algorithms for reconstructing sparse signals from a small set of linear measurements [1]. The new algorithms are broadly referred to as AMP, for approximate message passing. This is the second of tw ..."
Cited by 163 (19 self)
In a recent paper, the authors proposed a new class of low-complexity iterative thresholding algorithms for reconstructing sparse signals from a small set of linear measurements [1]. The new algorithms are broadly referred to as AMP, for approximate message passing. This is the second of two conference papers describing the derivation of these algorithms, their connection with related literature, extensions of the original framework, and new empirical evidence. This paper describes the state evolution formalism for analyzing these algorithms, and some of the conclusions that can be drawn from this formalism. We carried out extensive numerical simulations to confirm these predictions, and we present here a few representative results.
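For context, the AMP iteration for sparse reconstruction with soft thresholding takes the following form; this is a minimal numpy sketch under simplifying assumptions (a fixed scalar threshold theta rather than the tuned threshold sequence the papers analyze):

    import numpy as np

    def amp(A, y, theta, n_iters=30):
        # Approximate message passing for y = A x + noise, A of shape (n, N).
        # eta is the soft-thresholding nonlinearity, applied componentwise.
        n, N = A.shape
        eta = lambda u, t: np.sign(u) * np.maximum(np.abs(u) - t, 0.0)
        x = np.zeros(N)
        z = y.copy()
        for _ in range(n_iters):
            x_new = eta(A.T @ z + x, theta)          # threshold the pseudo-data
            # Onsager correction: (1/delta) * z * mean of eta'(.), with
            # delta = n / N; eta' equals 1 exactly on surviving coordinates.
            onsager = z * (np.count_nonzero(x_new) / n)
            z = y - A @ x_new + onsager              # corrected residual
            x = x_new
        return x

The Onsager term is what distinguishes AMP from plain iterative thresholding and is what makes the state evolution analysis of the iterates exact in the large-system limit.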
Fixing Max-Product: Convergent Message Passing Algorithms for MAP LP-Relaxations
"... We present a novel message passing algorithm for approximating the MAP problem in graphical models. The algorithm is similar in structure to max-product but unlike max-product it always converges, and can be proven to find the exact MAP solution in various settings. The algorithm is derived via bloc ..."
Cited by 160 (14 self)
We present a novel message passing algorithm for approximating the MAP problem in graphical models. The algorithm is similar in structure to max-product, but unlike max-product it always converges, and it can be proven to find the exact MAP solution in various settings. The algorithm is derived via block coordinate descent in a dual of the LP relaxation of MAP, but does not require any tunable parameters such as step size or tree weights. We also describe a generalization of the method to cluster-based potentials. The new method is tested on synthetic and real-world problems, and compares favorably with previous approaches.

Graphical models are an effective approach for modeling complex objects via local interactions. In such models, a distribution over a set of variables is assumed to factor according to cliques of a graph, with potentials assigned to each clique. Finding the assignment with highest probability in these models is key to using them in practice, and is often referred to as the MAP (maximum a posteriori) assignment problem. In the general case the problem is NP-hard, with complexity exponential in the tree-width of the underlying graph.
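For reference, the LP relaxation in question is the standard pairwise relaxation of MAP over the local polytope; in conventional notation (a sketch, not quoted from the paper), for a graph with edge set E:

    \max_{\mu \ge 0} \; \sum_i \sum_{x_i} \theta_i(x_i)\, \mu_i(x_i) \; + \; \sum_{(i,j) \in E} \sum_{x_i, x_j} \theta_{ij}(x_i, x_j)\, \mu_{ij}(x_i, x_j)

    \text{s.t.} \quad \sum_{x_i} \mu_i(x_i) = 1, \qquad \sum_{x_j} \mu_{ij}(x_i, x_j) = \mu_i(x_i) \;\; \forall (i,j) \in E,

where the \theta's are the log potentials. Block coordinate descent on a dual of this LP is what yields the monotone, convergent message updates the abstract describes.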
Learning to combine bottom-up and top-down segmentation (European Conference on Computer Vision)
"... Abstract. Bottom-up segmentation based only on low-level cues is a notoriously difficult problem. This difficulty has lead to recent top-down segmentation algorithms that are based on class-specific image information. Despite the success of top-down algorithms, they often give coarse segmentations t ..."
Cited by 132 (0 self)
Bottom-up segmentation based only on low-level cues is a notoriously difficult problem. This difficulty has led to recent top-down segmentation algorithms that are based on class-specific image information. Despite the success of top-down algorithms, they often give coarse segmentations that can be significantly refined using low-level cues. This raises the question of how to combine both top-down and bottom-up cues in a principled manner. In this paper we approach this problem using supervised learning. Given a training set of ground truth segmentations, we train a fragment-based segmentation algorithm which takes into account both bottom-up and top-down cues simultaneously, in contrast to most existing algorithms which train top-down and bottom-up modules separately. We formulate the problem in the framework of Conditional Random Fields (CRFs) and derive a feature induction algorithm for CRFs, which allows us to efficiently search over thousands of candidate fragments. Whereas pure top-down algorithms often require hundreds of fragments, our simultaneous learning procedure yields algorithms with a handful of fragments that are combined with low-level cues to efficiently compute high quality segmentations.
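In generic CRF notation (a sketch of the kind of model described, not the paper's exact parameterization), the segmentation labeling \mathbf{x} of image \mathbf{I} is modeled as

    P(\mathbf{x} \mid \mathbf{I}) \;\propto\; \exp\Big( \sum_k w_k \, f_k(\mathbf{x}, \mathbf{I}) \Big),

where some features f_k encode top-down fragment matches and others encode bottom-up low-level cues. Feature induction greedily adds whichever candidate fragment's feature most improves the training objective, which is how thousands of candidates can be searched while only a handful end up in the final model.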
MRF optimization via dual decomposition: Message-passing revisited (ICCV, 2007)
"... A new message-passing scheme for MRF optimization is proposed in this paper. This scheme inherits better theoretical properties than all other state-of-the-art message passing methods and in practice performs equally well/outperforms them. It is based on the very powerful technique of Dual Decomposi ..."
Cited by 117 (11 self)
A new message-passing scheme for MRF optimization is proposed in this paper. This scheme has better theoretical properties than other state-of-the-art message-passing methods and in practice performs as well as or better than them. It is based on the very powerful technique of Dual Decomposition [1] and leads to an elegant and general framework for understanding and designing message-passing algorithms that can provide new insights into existing techniques. Promising experimental results and comparisons with the state of the art demonstrate the theoretical and practical potential of our approach.
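The underlying idea, in standard notation (sketched here for context): split the MRF energy \theta into parts \theta^t, one per tractable subproblem (e.g., per tree), with \sum_t \theta^t = \theta. Then

    \sum_t \min_{x^t} \theta^t(x^t) \;\le\; \min_x \sum_t \theta^t(x) \;=\; \min_x \theta(x),

so the sum of the slave minima is a lower bound on the optimal energy. The master problem maximizes this bound over the decomposition (e.g., by projected subgradient steps), and the message-passing updates of the paper arise from this scheme.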
Graph-cover decoding and finite-length analysis of message-passing iterative decoding of LDPC codes (IEEE Trans. Inform. Theory, 2005)
"... The goal of the present paper is the derivation of a framework for the finite-length analysis of message-passing iterative decoding of low-density parity-check codes. To this end we introduce the concept of graph-cover decoding. Whereas in maximum-likelihood decoding all codewords in a code are comp ..."
Cited by 116 (17 self)
The goal of the present paper is the derivation of a framework for the finite-length analysis of message-passing iterative decoding of low-density parity-check codes. To this end we introduce the concept of graph-cover decoding. Whereas in maximum-likelihood decoding all codewords in a code are competing to be the best explanation of the received vector, under graph-cover decoding all codewords in all finite covers of a Tanner graph representation of the code are competing to be the best explanation. We are interested in graph-cover decoding because it is a theoretical tool that can be used to show connections between linear programming decoding and message-passing iterative decoding. Namely, on the one hand it turns out that graph-cover decoding is essentially equivalent to linear programming decoding. On the other hand, because iterative, locally operating decoding algorithms like message-passing iterative decoding cannot distinguish the underlying Tanner graph from any covering graph, graph-cover decoding can serve as a model to explain the behavior of message-passing iterative decoding. Understanding the behavior of graph-cover decoding is tantamount to understanding the behavior of message-passing iterative decoding.
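The linear programming decoder that graph-cover decoding is being related to can be stated, in standard notation (a sketch for context), as

    \hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x} \in \mathcal{P}(H)} \; \sum_i \lambda_i x_i, \qquad \mathcal{P}(H) \;=\; \bigcap_j \operatorname{conv}(C_j),

where \lambda_i are the channel log-likelihood ratios, C_j is the set of binary vectors satisfying the j-th parity check of the matrix H, and \mathcal{P}(H) is the fundamental polytope. The vertices of \mathcal{P}(H) correspond to the pseudocodewords arising from codewords in finite graph covers, which is the sense in which the two decoders are essentially equivalent.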