Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms
IEEE Transactions on Information Theory, 2005
Cited by 414 (12 self)
Important inference problems in statistical physics, computer vision, error-correcting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems that is exact when the factor graph is a tree, but only approximate when the factor graph has cycles. We show that BP fixed points correspond to the stationary points of the Bethe approximation of the free energy for a factor graph. We explain how to obtain region-based free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms. We emphasize the conditions a free energy approximation must satisfy in order to be a “valid” or “maxent-normal” approximation. We describe the relationship between four different methods that can be used to generate valid approximations: the “Bethe method,” the “junction graph method,” the “cluster variation method,” and the “region graph method.” Finally, we explain how to tell whether a region-based approximation, and its corresponding GBP algorithm, is likely to be accurate, and describe empirical results showing that GBP can significantly outperform BP.
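The claim that BP is exact on a tree can be checked numerically on a toy model. The sketch below is only an illustration, not code from the paper: the 3-node chain, the potential values, and the function names are all assumptions chosen for the example. It computes the marginal of the middle variable once by sum-product message passing and once by brute-force enumeration.

```python
import itertools
import numpy as np

# Hypothetical potentials on a 3-node chain x0 - x1 - x2 with binary states.
psi01 = np.array([[2.0, 1.0], [1.0, 3.0]])   # psi01[x0, x1]
psi12 = np.array([[1.0, 4.0], [2.0, 1.0]])   # psi12[x1, x2]
phi = [np.array([1.0, 2.0]), np.array([1.0, 1.0]), np.array([3.0, 1.0])]

def bp_marginal_x1():
    """Sum-product marginal of x1: multiply the local potential by both incoming messages."""
    m0_to_1 = psi01.T @ phi[0]   # sum over x0 of phi0(x0) * psi01(x0, x1)
    m2_to_1 = psi12 @ phi[2]     # sum over x2 of psi12(x1, x2) * phi2(x2)
    belief = phi[1] * m0_to_1 * m2_to_1
    return belief / belief.sum()

def brute_force_marginal_x1():
    """Marginal of x1 by summing the unnormalized joint over all 8 configurations."""
    p = np.zeros(2)
    for x0, x1, x2 in itertools.product(range(2), repeat=3):
        p[x1] += (phi[0][x0] * phi[1][x1] * phi[2][x2]
                  * psi01[x0, x1] * psi12[x1, x2])
    return p / p.sum()

print(bp_marginal_x1())
print(brute_force_marginal_x1())
```

On this chain the two results agree up to floating-point error. On a graph with cycles the message-passing beliefs would in general only approximate the true marginals.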
Convergent Tree-Reweighted Message Passing for Energy Minimization
Proc. Int’l Workshop Artificial Intelligence and Statistics, 2005
Cited by 299 (10 self)
Algorithms for discrete energy minimization are of fundamental importance in computer vision. In this paper, we focus on the recent technique proposed by Wainwright et al. [33]: tree-reweighted max-product message passing (TRW). It was inspired by the problem of maximizing a lower bound on the energy. However, the algorithm is not guaranteed to increase this bound; it may actually go down. In addition, TRW does not always converge. We develop a modification of this algorithm which we call sequential tree-reweighted message passing. Its main property is that the bound is guaranteed not to decrease. We also give a weak tree agreement condition which characterizes local maxima of the bound with respect to TRW algorithms. We prove that our algorithm has a limit point that achieves weak tree agreement. Finally, we show that our algorithm requires half as much memory as traditional message passing approaches. Experimental results demonstrate that on certain synthetic and real problems, our algorithm outperforms both the ordinary belief propagation and the tree-reweighted algorithm in [33]. In addition, on stereo problems with Potts interactions, we obtain a lower energy than graph cuts. Index Terms: Energy minimization, graph algorithms, message passing, belief propagation, early vision, Markov random fields, stereo.
MAP estimation via agreement on trees: Message-passing and linear programming
2002
Cited by 132 (8 self)
We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a tree-reweighted max-product message-passing algorithm that is related to but distinct from the max-product algorithm. In this way, we establish a connection between a certain LP relaxation of the mode-finding problem, and a reweighted form of the max-product (min-sum) message-passing algorithm.
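The upper bound from a convex combination of trees can be illustrated on a single 3-cycle. The sketch below is not the paper's algorithm; it only checks the bound by brute force. The random parameters, the uniform tree weights, and the helper names are assumptions made for the example, and the score is an unnormalized log probability.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (0, 2)]                 # one 3-cycle on binary variables
theta_node = rng.normal(size=(3, 2))             # theta_node[s, x_s]
theta_edge = {e: rng.normal(size=(2, 2)) for e in edges}

def score(x, node, edge):
    """Unnormalized log probability of configuration x under the given parameters."""
    total = sum(node[s, x[s]] for s in range(3))
    return total + sum(t[x[a], x[b]] for (a, b), t in edge.items())

def max_score(node, edge):
    """Exact maximization by enumerating all 8 configurations."""
    return max(score(x, node, edge)
               for x in itertools.product(range(2), repeat=3))

# Three spanning trees, each dropping one edge, with uniform weight 1/3 (an assumption).
# Every edge appears in 2 of the 3 trees, so its appearance probability is 2/3.
trees = [[e for e in edges if e != dropped] for dropped in edges]
rho = 1.0 / 3.0
edge_prob = 2.0 / 3.0

# Rescale edge parameters so the weighted tree parameters sum back to the original ones.
tree_params = [{e: theta_edge[e] / edge_prob for e in tree} for tree in trees]

exact = max_score(theta_node, theta_edge)
upper = sum(rho * max_score(theta_node, tp) for tp in tree_params)
print(exact, upper)
```

Because the original parameters are the weighted sum of the tree parameters, the maximum of the sum can never exceed the weighted sum of the per-tree maxima, so `exact <= upper` always holds here. Equality would require all three trees to share a maximizing configuration.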
MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches
IEEE Transactions on Information Theory, 2002
Cited by 107 (11 self)
We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of tree-structured problems, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is met with equality if and only if the tree problems share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original problem. Next we present and analyze two methods for attempting to obtain tight upper bounds: (a) a tree-reweighted message-passing algorithm that is related to but distinct from the max-product (min-sum) algorithm; and (b) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds. Finally, we discuss the conditions that govern when the relaxation is tight, in which case the MAP configuration can be obtained. The analysis described here generalizes naturally to convex combinations of hypertree-structured distributions.
Walk-Sums and Belief Propagation in Gaussian Graphical Models
Journal of Machine Learning Research, 2006
Cited by 66 (14 self)
We present a new framework based on walks in a graph for analysis and inference in Gaussian graphical models. The key idea is to decompose the correlation between each pair of variables as a sum over all walks between those variables in the graph. The weight of each walk is given by a product of edgewise partial correlation coefficients. This representation holds for a large class of Gaussian graphical models which we call walk-summable. We give a precise characterization of this class of models, and relate it to other classes including diagonally dominant, attractive, non-frustrated, and pairwise-normalizable. We provide a walk-sum interpretation of Gaussian belief propagation in trees and of the approximate method of loopy belief propagation in graphs with cycles. The walk-sum perspective leads to a better understanding of Gaussian belief propagation and to stronger results for its convergence in loopy graphs.
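The walk-sum decomposition can be sketched numerically under one common formulation, assumed here rather than taken from the abstract: the information matrix is normalized to J = I - R, where R holds the edgewise partial correlations, and the model is treated as walk-summable when the spectral radius of the elementwise absolute value of R is below 1. The 4-cycle, the edge weight 0.3, and the function names are illustrative assumptions.

```python
import numpy as np

r = 0.3
# Hypothetical partial-correlation matrix on a 4-cycle (zero diagonal, one negative edge).
R = np.array([[0.0,   r, 0.0,   r],
              [  r, 0.0,   r, 0.0],
              [0.0,   r, 0.0,  -r],
              [  r, 0.0,  -r, 0.0]])
J = np.eye(4) - R                      # normalized information matrix

def spectral_radius(M):
    """Largest eigenvalue magnitude of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

def walk_sum_covariance(R, num_terms=200):
    """Truncated sum of R^k; entry (i, j) of R^k totals the weights of length-k walks from i to j."""
    total = np.zeros_like(R)
    term = np.eye(R.shape[0])
    for _ in range(num_terms):
        total += term
        term = term @ R
    return total

print(spectral_radius(np.abs(R)))      # about 0.6 here, so the series converges
print(np.allclose(walk_sum_covariance(R), np.linalg.inv(J)))
```

With these values the truncated series matches the inverse of J, which is the covariance of the normalized model. If the spectral radius of |R| were 1 or larger, the sum over walks would not converge absolutely.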
Consensus propagation
IEEE Transactions on Information Theory
Cited by 61 (6 self)
We propose consensus propagation, an asynchronous distributed protocol for averaging numbers across a network. We establish convergence, characterize the convergence rate for regular graphs, and demonstrate that the protocol exhibits better scaling properties than pairwise averaging, an alternative that has received much recent attention. Consensus propagation can be viewed as a special case of belief propagation, and our results contribute to the belief propagation literature. In particular, beyond singly-connected graphs, there are very few classes of relevant problems for which belief propagation is known to converge. Index Terms: belief propagation, distributed averaging, distributed consensus, distributed signal processing, Gaussian Markov random fields, message-passing algorithms, max-product algorithm, min-sum algorithm, sum-product algorithm.
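For context, the pairwise averaging baseline mentioned in the abstract can be sketched in a few lines. This is not consensus propagation itself, whose update rules are not given in the abstract; it is a minimal version of randomized pairwise (gossip) averaging. The ring topology, starting values, step count, and function name are assumptions for illustration.

```python
import numpy as np

def pairwise_averaging(values, edges, num_steps, seed=0):
    """Repeatedly pick a random edge and replace both endpoint values with their average."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float).copy()
    for _ in range(num_steps):
        i, j = edges[rng.integers(len(edges))]
        x[i] = x[j] = 0.5 * (x[i] + x[j])   # this step preserves the global sum
    return x

# Hypothetical 6-node ring network with arbitrary starting values (their mean is 18).
ring = [(k, (k + 1) % 6) for k in range(6)]
start = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]
estimates = pairwise_averaging(start, ring, num_steps=2000)
print(estimates)
```

Each update keeps the network-wide sum fixed, so every node's estimate drifts toward the global mean as long as the graph is connected. How fast it gets there depends on the graph.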
Tree Consistency and Bounds on the Performance of the Max-Product Algorithm and Its Generalizations
2002
Cited by 55 (5 self)
Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles.
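As a point of reference for the cyclic case, min-sum message passing is exact on a chain, where it reduces to dynamic programming. The sketch below assumes a hypothetical 5-variable chain with random costs; the names `unary`, `pairwise`, `energy`, and `min_sum_chain` are illustrative and not taken from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 3                                   # 5 variables, 3 states each (assumed sizes)
unary = rng.uniform(size=(n, k))              # unary[i, x_i]
pairwise = rng.uniform(size=(n - 1, k, k))    # pairwise[i, x_i, x_{i+1}]

def energy(x):
    """Total cost of a full assignment x."""
    return (sum(unary[i, x[i]] for i in range(n))
            + sum(pairwise[i, x[i], x[i + 1]] for i in range(n - 1)))

def min_sum_chain():
    """Forward min-sum pass with back-pointers, then backtracking to recover the minimizer."""
    cost = unary[0].copy()                    # cost[x_0]
    back = []
    for i in range(n - 1):
        # cand[x_i, x_{i+1}]: best prefix cost ending in x_i, extended by x_{i+1}
        cand = cost[:, None] + pairwise[i] + unary[i + 1][None, :]
        back.append(cand.argmin(axis=0))
        cost = cand.min(axis=0)
    x = [int(cost.argmin())]
    for b in reversed(back):
        x.append(int(b[x[-1]]))
    return x[::-1], float(cost.min())

x_star, best_cost = min_sum_chain()
brute = min(energy(x) for x in itertools.product(range(k), repeat=n))
print(x_star, best_cost, brute)
```

On the chain the message-passing result equals the brute-force minimum. On graphs with cycles the same local updates are only an approximate method, as the abstract notes.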
Kernel-Based Learning of Hierarchical Multilabel Classification Models
Journal of Machine Learning Research, 2006
Cited by 53 (6 self)
We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic-programming-based algorithm that computes best update directions in the feasible set. Experiments show
Learning conditional random fields for stereo
In CVPR, 2007
Cited by 50 (2 self)
State-of-the-art stereo vision algorithms utilize color changes as important cues for object boundaries. Most methods impose heuristic restrictions or priors on disparities, for example by modulating local smoothness costs with intensity gradients. In this paper we seek to replace such heuristics with explicit probabilistic models of disparities and intensities learned from real images. We have constructed a large number of stereo datasets with ground-truth disparities, and we use a subset of these datasets to learn the parameters of Conditional Random Fields (CRFs). We present experimental results illustrating the potential of our approach for automatically learning the parameters of models with richer structure than standard hand-tuned MRF models.