Results 1-10 of 14
MAP Estimation, Linear Programming and Belief Propagation with Convex Free Energies
, 2007
Cited by 45 (4 self)
Finding the most probable assignment (MAP) in a general graphical model is known to be NP-hard, but good approximations have been attained with max-product belief propagation (BP) and its variants. In particular, it is known that using BP on a single-cycle graph or tree-reweighted BP on an arbitrary graph will give the MAP solution if the beliefs have no ties. In this paper we extend the setting under which BP can be used to provably extract the MAP. We define Convex BP as BP algorithms based on a convex free energy approximation and show that this class includes ordinary BP on single-cycle graphs, tree-reweighted BP and many other BP variants. We show that when there are no ties, fixed points of convex max-product BP will provably give the MAP solution. We also show that convex sum-product BP at sufficiently small temperatures can be used to solve linear programs that arise from relaxing the MAP problem. Finally, we derive a novel condition that allows us to derive the MAP solution even if some of the convex BP beliefs have ties. In experiments, we show that our theorems allow us to find the MAP in many real-world instances of graphical models where exact inference using the junction tree algorithm is impossible.
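For intuition about what it means to read a MAP assignment off max-product messages, here is a minimal sketch on a chain, the simplest tree, where max-product is exact. It is not the convex BP of the paper: the function names, the log-domain (max-sum) formulation, and the toy potentials are all assumptions made for illustration. A brute-force search is included only to check the result.

```python
import itertools


def map_chain_max_product(unary, pairwise):
    """MAP on a chain via max-product in the log domain (max-sum).

    unary[i][a]       : log-potential of variable i in state a (hypothetical layout)
    pairwise[i][a][b] : log-potential of (x_i = a, x_{i+1} = b)
    Returns (assignment, score).
    """
    n, k = len(unary), len(unary[0])
    score = list(unary[0])  # best score of any prefix ending in each state
    back = []               # back-pointers, one list per forward step
    for i in range(1, n):
        new_score, ptr = [], []
        for b in range(k):
            cands = [score[a] + pairwise[i - 1][a][b] for a in range(k)]
            best_a = max(range(k), key=cands.__getitem__)
            new_score.append(cands[best_a] + unary[i][b])
            ptr.append(best_a)
        score = new_score
        back.append(ptr)
    # Backtrack from the best final state to recover the full assignment.
    assignment = [max(range(k), key=score.__getitem__)]
    for ptr in reversed(back):
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return assignment, max(score)


def map_brute_force(unary, pairwise):
    """Exhaustive MAP search; only feasible for tiny models."""
    n, k = len(unary), len(unary[0])

    def total(x):
        return (sum(unary[i][x[i]] for i in range(n))
                + sum(pairwise[i][x[i]][x[i + 1]] for i in range(n - 1)))

    best = max(itertools.product(range(k), repeat=n), key=total)
    return list(best), total(best)


# Toy three-variable binary chain with no ties among the top assignments.
unary = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.8]]
pairwise = [[[1.0, 0.0], [0.0, 1.0]],
            [[0.0, 0.7], [0.6, 0.0]]]
print(map_chain_max_product(unary, pairwise))
print(map_brute_force(unary, pairwise))
```

On graphs with cycles, max-product carries no such guarantee in general, which is the gap the abstract's no-ties results address.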
Convergent message passing algorithms - a unifying view
 In Proc. Twenty-eighth Conference on Uncertainty in Artificial Intelligence (UAI ’09)
, 2009
Cited by 20 (0 self)
Message-passing algorithms have emerged as powerful techniques for approximate inference in graphical models. When these algorithms converge, they can be shown to find local (or sometimes even global) optima of variational formulations of the inference problem. But many of the most popular algorithms are not guaranteed to converge. This has led to recent interest in convergent message-passing algorithms. In this paper, we present a unified view of convergent message-passing algorithms. We present an algorithm, tree-consistency bound optimization (TCBO), that is provably convergent in both its sum-product and max-product forms. We then show that many of the existing convergent algorithms are instances of our TCBO algorithm, and obtain novel convergent algorithms “for free” by exchanging maximizations and summations in existing algorithms. In particular, we show that Wainwright’s nonconvergent sum-product algorithm for tree-based variational bounds is actually convergent with the right update order for the case where trees are monotonic chains.
Extending expectation propagation for graphical models
, 2005
Cited by 8 (5 self)
Graphical models have been widely used in many applications, ranging from human behavior recognition to wireless signal detection. However, efficient inference and learning techniques for graphical models are needed to handle complex models, such as hybrid Bayesian networks. This thesis proposes extensions of expectation propagation, a powerful generalization of loopy belief propagation, to develop efficient inference and learning algorithms for graphical models. The first two chapters of the thesis present inference algorithms for generative graphical models, and the next two propose learning algorithms for conditional graphical models. First, the thesis proposes a window-based EP smoothing algorithm, as an alternative to batch EP, for hybrid dynamic Bayesian networks. For an application to digital wireless communications, window-based EP smoothing achieves estimation accuracy comparable to sequential Monte Carlo methods, but with more than 10 times less computational cost. Second, it combines tree-structured EP approximation with the junction tree algorithm
The DLR Hierarchy of Approximate Inference
Cited by 8 (1 self)
We propose a hierarchy for approximate inference based on the Dobrushin, Lanford, Ruelle (DLR) equations. This hierarchy includes existing algorithms, such as belief propagation, and also motivates novel algorithms such as factorized neighbors (FN) algorithms and variants of mean field (MF) algorithms.
Approximate inference techniques with expectation constraints
 Journal of Statistical Mechanics: Theory and Experiment
, 2005
Turbo Decoding as Iterative Constrained Maximum Likelihood Sequence Detection
Cited by 6 (2 self)
The turbo decoder was not originally introduced as a solution to an optimization problem, which has impeded attempts to explain its excellent performance. Here it is shown, nonetheless, that the turbo decoder is an iterative method seeking a solution to an intuitively pleasing constrained optimization problem. In particular, the turbo decoder seeks the maximum likelihood sequence under the false assumption that the inputs to the encoders are chosen independently of each other in the parallel case, or that the output of the outer encoder is chosen independently of the input to the inner encoder in the serial case. To control the error introduced by the false assumption, the optimizations are performed subject to a constraint on the probability that the independent messages happen to coincide. When the constraining probability equals one, the global maximum of the constrained optimization problem is the maximum likelihood sequence detection, allowing for a theoretical connection between turbo decoding and maximum likelihood sequence detection. It is then shown that the turbo decoder is a nonlinear block Gauss-Seidel iteration that aims to solve the optimization problem by zeroing the gradient of the Lagrangian with a Lagrange multiplier of −1. Some conditions for the convergence of the turbo decoder are then given by adapting the existing literature on Gauss-Seidel iterations. Index Terms: constrained optimization, maximum likelihood decoding, turbo decoder convergence analysis.
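To illustrate what a nonlinear Gauss-Seidel iteration looks like in general, the sketch below sweeps through the components of a small nonlinear system, updating each one with the most recent values of the others. The toy system, the function name, and the stopping rule are assumptions chosen for illustration; these are not the turbo decoder's equations or its Lagrangian conditions.

```python
import math


def nonlinear_gauss_seidel(updates, x0, tol=1e-12, max_iter=1000):
    """Solve x[i] = updates[i](x) for all i by sequential sweeps.

    Each update sees the components already refreshed in the current
    sweep, which is what distinguishes Gauss-Seidel from Jacobi iteration.
    Returns (solution, number_of_sweeps).
    """
    x = list(x0)
    for sweep in range(1, max_iter + 1):
        delta = 0.0
        for i, update in enumerate(updates):
            new_xi = update(x)          # uses the freshest values in x
            delta = max(delta, abs(new_xi - x[i]))
            x[i] = new_xi
        if delta < tol:
            return x, sweep
    return x, max_iter


# Hypothetical toy system: x = 0.5*cos(y), y = 0.5*sin(x).
updates = [
    lambda v: 0.5 * math.cos(v[1]),
    lambda v: 0.5 * math.sin(v[0]),
]
solution, sweeps = nonlinear_gauss_seidel(updates, [0.0, 0.0])
print(solution, sweeps)
```

The toy system is a contraction, so the sweeps converge. In general, convergence of such iterations requires conditions of the kind the abstract says are adapted from the Gauss-Seidel literature.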
Turbo decoding as constrained optimization
 in 43rd Allerton Conference on Communication, Control, and Computing
, 2005
Cited by 2 (2 self)
The turbo decoder was not originally introduced as a solution to an optimization problem. This has made explaining just why the turbo decoder performs as well as it does very difficult. Many authors have attempted to explain both the performance and convergence of the decoder, with varied success. In this document we show that the turbo decoder admits an exact interpretation as an iterative method (nonlinear block Gauss-Seidel iteration) attempting to find a solution to a particular intuitively pleasing constrained optimization problem. In particular, the turbo decoder is trying to find the maximum likelihood solution under the false assumption that the inputs to the encoders were chosen independently of one another, subject to a constraint on the probability that the messages so chosen happened to be the same. We provide an exact analytical objective function, along with an exact analytical form of the constraint, and then show that the turbo decoder is an iterative method originally suggested by Gauss, which is trying to solve the optimization problem by solving a system of equations which are the necessary conditions of Lagrange with a Lagrange multiplier of −1.
Uniqueness of Belief Propagation on Signed Graphs
Cited by 1 (0 self)
While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. In particular, if the interactions of random variables in a graphical model are strong, the behavior of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the type (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in situations where the strength of the interactions is “sufficiently” small in certain senses. In contrast, our condition covers arbitrarily strong interactions on the specified class of signed graphs. The result of this paper is based on a recent theoretical advance concerning the LBP algorithm: its connection with the graph zeta function.
Optimality and Duality of the Turbo Decoder
, 2007
Cited by 1 (0 self)
The near-optimal performance of the turbo decoder has been a source of intrigue among communications engineers and information theorists, given its ad hoc origins that were seemingly disconnected from optimization theory. Naturally one would inquire whether the favorable performance might be explained by characterizing the turbo decoder via some optimization criterion or performance index. Recently, two such characterizations have surfaced. One draws from statistical mechanics and aims to minimize the Bethe approximation to a free energy measure. The other characterization involves constrained likelihood estimation, a setting perhaps more familiar to communications engineers. The intent of this paper is to assemble a tutorial overview of these recent developments, and more importantly to identify the formal mathematical duality between the two viewpoints. The paper includes tutorial background material on the information geometry tools used in analyzing the turbo decoder, and the analysis accommodates both the parallel concatenation and serial concatenation schemes in a common framework.
THERMODYNAMIC SEMIRINGS
, 1108
Thermodynamic semirings are deformed additive structures on characteristic one semirings, defined using a binary information measure. The algebraic properties of the semiring encode thermodynamical and information theoretic properties of the entropy function. Besides the case of the Shannon entropy, which arises in the context of geometry over the field with one element and the Witt construction in characteristic one, there are other interesting thermodynamic semirings associated to the Rényi and Tsallis entropies, and to the Kullback–Leibler divergence, with connections to information geometry, multifractal analysis, and statistical mechanics. A more general theory of thermodynamic semirings is then formulated in categorical terms, by encoding all partial associativity and commutativity constraints into an entropy operad and a corresponding information algebra.
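As a rough numerical illustration of a "deformed addition" driven by an entropy, the sketch below assumes the min-plus (tropical) convention and a deformation of the form min over p of p·x + (1−p)·y − T·S(p), with S the binary Shannon entropy. The abstract does not state this formula, so the convention, the function names, and the temperature parameter T are assumptions. Under them, elementary calculus gives the closed form −T·log(e^(−x/T) + e^(−y/T)), which tends to min(x, y) as T goes to 0.

```python
import math


def shannon_entropy(p):
    """Binary Shannon entropy in nats, with the convention 0*log(0) = 0."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


def deformed_sum(x, y, T, grid=10001):
    """Assumed deformation: minimize p*x + (1-p)*y - T*S(p) over a grid of p."""
    candidates = (i / (grid - 1) for i in range(grid))
    return min(p * x + (1.0 - p) * y - T * shannon_entropy(p)
               for p in candidates)


def closed_form(x, y, T):
    """Closed form of the minimization above for the Shannon entropy."""
    return -T * math.log(math.exp(-x / T) + math.exp(-y / T))


# The grid minimum and the closed form agree up to discretization error,
# and the deformed sum approaches the tropical sum min(x, y) at small T.
print(deformed_sum(1.0, 2.0, 0.5), closed_form(1.0, 2.0, 0.5))
print(closed_form(1.0, 2.0, 0.01), min(1.0, 2.0))
```

Swapping in a different binary information measure for `shannon_entropy` changes the deformed operation, which is the sense in which the abstract associates different semirings with the Rényi and Tsallis entropies.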