Results 1-10 of 356
Graphical models, exponential families, and variational inference
2008
Cited by 800 (26 self)
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances, including the key problems of computing marginals and modes of probability distributions, are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms (among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations) can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
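For reference, the conjugate duality mentioned in this abstract can be written in its standard textbook form. This is only a sketch in assumed notation, none of which appears in the abstract itself: $\theta$ for the canonical parameters, $\phi$ for the sufficient statistics, $A$ for the cumulant function, $\mu$ for the mean parameters, $\mathcal{M}$ for the set of realizable mean parameters, and $\nu$ for the base measure.

```latex
\begin{align}
p_\theta(x) &= \exp\{ \langle \theta, \phi(x) \rangle - A(\theta) \},
&
A(\theta) &= \log \int \exp\{ \langle \theta, \phi(x) \rangle \} \, \nu(dx), \\
A(\theta) &= \sup_{\mu \in \mathcal{M}} \{ \langle \theta, \mu \rangle - A^*(\mu) \},
&
A^*(\mu) &= -H\big(p_{\theta(\mu)}\big) \quad \text{for } \mu \text{ in the interior of } \mathcal{M}.
\end{align}
```

The supremum is attained at $\mu = \mathbb{E}_\theta[\phi(X)]$, so computing marginals becomes an optimization problem. Approximate methods can then be read as replacing $\mathcal{M}$, $A^*$, or both with tractable surrogates.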
Good Error-Correcting Codes based on Very Sparse Matrices
1999
Cited by 741 (23 self)
We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay-Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical sum-product algorithm. We prove that these codes are "very good," in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary-symmetric channel but also for any channel with symmetric stationary ergodic noise. We give experimental results for binary-symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed, the performance of Gallager codes is almost as close to the Shannon limit as that of turbo codes.
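To make the sum-product decoding step concrete, here is a minimal sketch of a flooding-schedule decoder in the log-likelihood-ratio domain. It is not the authors' implementation: the function name `sum_product_decode`, the toy parity-check matrix (a 3-bit repetition code, far smaller and denser than the sparse codes studied here) and the crossover probability 0.1 are all assumptions chosen so the result can be checked by hand.

```python
import math

def sum_product_decode(H, llr_channel, iterations=10):
    """Flooding sum-product decoding on the Tanner graph of parity-check matrix H.

    llr_channel[i] = log P(y_i | x_i = 0) / P(y_i | x_i = 1).
    Returns hard decisions and posterior log-likelihood ratios.
    """
    m, n = len(H), len(H[0])
    edges = [(j, i) for j in range(m) for i in range(n) if H[j][i]]
    v2c = {(j, i): llr_channel[i] for (j, i) in edges}  # variable-to-check
    c2v = {e: 0.0 for e in edges}                        # check-to-variable
    for _ in range(iterations):
        # Check update: tanh rule over the other variables of the same check.
        for (j, i) in edges:
            prod = 1.0
            for (j2, i2) in edges:
                if j2 == j and i2 != i:
                    prod *= math.tanh(v2c[(j2, i2)] / 2.0)
            prod = max(min(prod, 1 - 1e-12), -(1 - 1e-12))  # keep atanh finite
            c2v[(j, i)] = 2.0 * math.atanh(prod)
        # Variable update: channel evidence plus messages from the other checks.
        for (j, i) in edges:
            v2c[(j, i)] = llr_channel[i] + sum(
                c2v[(j2, i2)] for (j2, i2) in edges if i2 == i and j2 != j
            )
    posterior = [
        llr_channel[i] + sum(c2v[(j, i2)] for (j, i2) in edges if i2 == i)
        for i in range(n)
    ]
    bits = [0 if llr >= 0 else 1 for llr in posterior]
    return bits, posterior

# Toy example (assumed): 3-bit repetition code over a binary-symmetric channel.
H = [[1, 1, 0],
     [0, 1, 1]]
p = 0.1
received = [0, 1, 0]
a = math.log((1 - p) / p)
llr_channel = [a if y == 0 else -a for y in received]
decoded, posterior = sum_product_decode(H, llr_channel)
```

Because this toy Tanner graph is a tree, the messages settle on the exact bitwise posteriors. On the sparse graphs with cycles used by practical codes, the same updates give only an approximation.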
On the Optimality of Solutions of the Max-Product Belief Propagation Algorithm in Arbitrary Graphs
2001
Cited by 242 (15 self)
Graphical models, such as Bayesian networks and Markov random fields, represent statistical dependencies of variables by a graph. The max-product "belief propagation" algorithm is a local message-passing algorithm on this graph that is known to converge to a unique fixed point when the graph is a tree. Furthermore, when the graph is a tree, the assignment based on the fixed point yields the most probable a posteriori (MAP) values of the unobserved variables given the observed ones. Recently, good
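The tree case described above can be illustrated on the simplest tree, a chain. The sketch below is an assumed toy setup (the function names, the three binary variables and the potential values are all hypothetical). It runs max-product forward along the chain, backtracks, and compares the result with exhaustive search.

```python
import itertools

def max_product_chain(unary, pairwise):
    """MAP assignment on a chain x_0 - x_1 - ... - x_{n-1} via max-product."""
    n, k = len(unary), len(pairwise)
    # msg[t][x]: best score of any prefix x_0..x_{t-1}, given x_t = x.
    msg = [[1.0] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    for t in range(1, n):
        for x in range(k):
            scores = [
                msg[t - 1][xp] * unary[t - 1][xp] * pairwise[xp][x]
                for xp in range(k)
            ]
            back[t][x] = max(range(k), key=scores.__getitem__)
            msg[t][x] = scores[back[t][x]]
    # Max-marginal at the last node, then backtrack through the pointers.
    last = [msg[n - 1][x] * unary[n - 1][x] for x in range(k)]
    assignment = [max(range(k), key=last.__getitem__)]
    for t in range(n - 1, 0, -1):
        assignment.append(back[t][assignment[-1]])
    return tuple(reversed(assignment))

def brute_force_map(unary, pairwise):
    """Exhaustive search over all configurations, for comparison only."""
    n, k = len(unary), len(pairwise)

    def score(x):
        s = 1.0
        for t in range(n):
            s *= unary[t][x[t]]
        for t in range(n - 1):
            s *= pairwise[x[t]][x[t + 1]]
        return s

    return max(itertools.product(range(k), repeat=n), key=score)

# Hypothetical potentials: three binary variables with an attractive coupling.
unary = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
pairwise = [[0.9, 0.1], [0.1, 0.9]]
map_mp = max_product_chain(unary, pairwise)
map_bf = brute_force_map(unary, pairwise)
```

On a chain the two answers agree. The question this paper addresses is what the max-product fixed point means when the graph has cycles and this guarantee no longer holds.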
Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation
IEEE TRANS. INFORM. THEORY, 2001
Cited by 242 (2 self)
Density evolution is an algorithm for computing the capacity of low-density parity-check (LDPC) codes under message-passing decoding. For memoryless binary-input continuous-output additive white Gaussian noise (AWGN) channels and sum-product decoders, we use a Gaussian approximation for message densities under density evolution to simplify the analysis of the decoding algorithm. We convert the infinite-dimensional problem of iteratively calculating message densities, which is needed to find the exact threshold, to a one-dimensional problem of updating means of Gaussian densities. This simplification not only allows us to calculate the threshold quickly and to understand the behavior of the decoder better, but also makes it easier to design good irregular LDPC codes for AWGN channels. For various regular LDPC codes we have examined, thresholds can be estimated within 0.1 dB of the exact value. For rates between 0.5 and 0.9, codes designed using the Gaussian approximation perform within 0.02 dB of the best performing codes found so far by using density evolution when the maximum variable degree is 10. We show that by using the Gaussian approximation, we can visualize the sum-product decoding algorithm. We also show that the optimization of degree distributions can be understood and done graphically using the visualization.
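The one-dimensional update of means can be sketched as follows for a regular code, in the form the recursion is commonly stated. The symbols are assumptions, not taken from the abstract: $d_v$ and $d_c$ are the variable and check degrees, $\sigma^2$ is the noise variance, $m_{u_0}$ is the mean of the channel message, and $m_v^{(\ell)}$, $m_u^{(\ell)}$ are the means of the variable-to-check and check-to-variable messages at iteration $\ell$, with $m_u^{(0)} = 0$.

```latex
\begin{align}
m_{u_0} &= \frac{2}{\sigma^2}, \\
m_v^{(\ell)} &= m_{u_0} + (d_v - 1)\, m_u^{(\ell - 1)}, \\
m_u^{(\ell)} &= \phi^{-1}\!\left( 1 - \left[ 1 - \phi\!\left( m_v^{(\ell)} \right) \right]^{d_c - 1} \right), \\
\phi(x) &= 1 - \frac{1}{\sqrt{4 \pi x}} \int_{-\infty}^{\infty} \tanh\!\left(\frac{u}{2}\right) e^{-(u - x)^2 / (4x)} \, du \quad (x > 0), \qquad \phi(0) = 1.
\end{align}
```

Under this approximation the threshold is the largest $\sigma$ for which $m_u^{(\ell)}$ grows without bound as $\ell \to \infty$.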
Improved low-density parity-check codes using irregular graphs
IEEE Trans. Inform. Theory, 2001
Cited by 224 (15 self)
We construct new families of error-correcting codes based on Gallager’s low-density parity-check codes. We improve on Gallager’s results by introducing irregular parity-check matrices and a new rigorous analysis of hard-decision decoding of these codes. We also provide efficient methods for finding good irregular structures for such decoding algorithms. Our rigorous analysis based on martingales, our methodology for constructing good irregular codes, and the demonstration that irregular structure improves performance constitute key points of our contribution. We also consider irregular codes under belief propagation. We report the results of experiments testing the efficacy of irregular codes on both binary-symmetric and Gaussian channels. For example, using belief propagation, for rate 1/4 codes on 16 000 bits over a binary-symmetric channel, previous low-density parity-check codes can correct up to approximately 16% errors, while our codes correct over 17%. In some cases our results come very close to reported results for turbo codes, suggesting that variations of irregular low-density parity-check codes may be able to match or beat turbo code performance. Index Terms: Belief propagation, concentration theorem, Gallager codes, irregular codes, low-density parity-check codes.
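Error fractions like those quoted above can be compared against the Shannon limit of the binary-symmetric channel, which is the largest crossover probability p with 1 - H2(p) at least equal to the code rate. The sketch below finds it by bisection. The function names are hypothetical, and the rate 0.25 is an illustrative choice rather than a value asserted by this sketch about the experiments.

```python
import math

def binary_entropy(p):
    """Binary entropy H2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_shannon_limit(rate, tol=1e-12):
    """Largest crossover probability p in [0, 1/2] with 1 - H2(p) >= rate."""
    lo, hi = 0.0, 0.5
    # Capacity 1 - H2(p) decreases monotonically on [0, 1/2], so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1 - binary_entropy(mid) >= rate:
            lo = mid
        else:
            hi = mid
    return lo

# Illustrative rate (an assumption for this example).
limit_quarter = bsc_shannon_limit(0.25)
```

For a rate of 0.25 the limit comes out a little above 21% errors, which gives a sense of how much room remains above correction rates in the 16% to 17% range.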
An Introduction to Factor Graphs
IEEE SIGNAL PROCESSING MAG., JAN. 2004
Cited by 197 (36 self)
A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summary-product algorithm (or belief/probability
MAP estimation via agreement on trees: Message-passing and linear programming
2002
Cited by 196 (10 self)
We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a tree-reweighted max-product message-passing algorithm that is related to but distinct from the max-product algorithm. In this way, we establish a connection between a certain LP relaxation of the mode-finding problem, and a reweighted form of the max-product (min-sum) message-passing algorithm.
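The upper bound and its tightness condition can be checked by brute force on the smallest graph with a cycle. The sketch below is an assumed toy setup, not the authors' algorithm: a triangle of three binary variables, its three spanning trees with equal weights 1/3, and hypothetical log-domain potentials. It does not implement the LP or the reweighted message passing, only the bound itself.

```python
import itertools

EDGES = [(0, 1), (1, 2), (0, 2)]
# The three spanning trees of the triangle, each given weight 1/3.
TREES = [[(0, 1), (1, 2)], [(1, 2), (0, 2)], [(0, 1), (0, 2)]]
CONFIGS = list(itertools.product((0, 1), repeat=3))

def score(x, node, edge, edges, edge_scale=1.0):
    """Log-domain score: node terms plus (scaled) edge terms on the given edges."""
    total = sum(node[s][x[s]] for s in range(3))
    total += edge_scale * sum(edge[(s, t)][x[s]][x[t]] for (s, t) in edges)
    return total

def map_value(node, edge):
    """Exact optimum on the full graph by enumeration."""
    return max(score(x, node, edge, EDGES) for x in CONFIGS)

def tree_upper_bound(node, edge):
    # Each edge lies in 2 of the 3 trees (appearance probability 2/3), so
    # scaling tree edge terms by 3/2 makes the convex combination of tree
    # parameters reproduce the original parameters.
    return sum(
        max(score(x, node, edge, tree, 1.5) for x in CONFIGS) for tree in TREES
    ) / 3.0

agree = [[1.0, 0.0], [0.0, 1.0]]   # rewards equal neighbors
differ = [[0.0, 1.0], [1.0, 0.0]]  # rewards unequal neighbors
bias = [[0.0, 0.5]] * 3
no_bias = [[0.0, 0.0]] * 3

attractive = (bias, {e: agree for e in EDGES})
frustrated = (no_bias, {e: differ for e in EDGES})

gap_attractive = tree_upper_bound(*attractive) - map_value(*attractive)
gap_frustrated = tree_upper_bound(*frustrated) - map_value(*frustrated)
```

In the attractive case every tree is optimized by the all-ones configuration, so the bound is met with equality. In the frustrated case the trees have no optimal configuration in common and the bound is strictly loose.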
Using linear programming to decode binary linear codes
IEEE TRANS. INFORM. THEORY, 2005
Cited by 184 (10 self)
A new method is given for performing approximate maximum-likelihood (ML) decoding of an arbitrary binary linear code based on observations received from any discrete memoryless symmetric channel. The decoding algorithm is based on a linear programming (LP) relaxation that is defined by a factor graph or parity-check representation of the code. The resulting “LP decoder” generalizes our previous work on turbo-like codes. A precise combinatorial characterization of when the LP decoder succeeds is provided, based on pseudocodewords associated with the factor graph. Our definition of a pseudocodeword unifies other such notions known for iterative algorithms, including “stopping sets,” “irreducible closed walks,” “trellis cycles,” “deviation sets,” and “graph covers.” The fractional distance $d_{\mathrm{frac}}$ of a code is introduced, which is a lower bound on the classical distance. It is shown that the efficient LP decoder will correct up to $\lceil d_{\mathrm{frac}}/2 \rceil - 1$ errors and that there are codes with $d_{\mathrm{frac}} = \Omega(n^{1-\epsilon})$. An efficient algorithm to compute the fractional distance is presented. Experimental evidence shows a similar performance on low-density parity-check (LDPC) codes between LP decoding and the min-sum and sum-product algorithms. Methods for tightening the LP relaxation to improve performance are also provided.
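For orientation, one common way to write an LP relaxation over a parity-check representation is sketched below. The notation is assumed rather than taken from the abstract: $f_i \in [0, 1]$ is the relaxed value of bit $i$, $\gamma_i$ is its log-likelihood-ratio cost given the received symbol $y_i$, and $N(j)$ is the set of bits that take part in check $j$.

```latex
\begin{align}
\text{minimize} \quad & \sum_{i=1}^{n} \gamma_i f_i,
\qquad \gamma_i = \log \frac{\Pr[y_i \mid x_i = 0]}{\Pr[y_i \mid x_i = 1]}, \\
\text{subject to} \quad & \sum_{i \in S} f_i - \sum_{i \in N(j) \setminus S} f_i \le |S| - 1
\quad \text{for every check } j \text{ and every } S \subseteq N(j) \text{ with } |S| \text{ odd}, \\
& 0 \le f_i \le 1 \quad \text{for } i = 1, \dots, n.
\end{align}
```

Every codeword satisfies these constraints, so an integral optimum is an ML codeword. Fractional optima correspond to the pseudocodewords that characterize when the decoder fails.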
Low-density parity-check codes based on finite geometries: A rediscovery and new results
IEEE Trans. Inform. Theory, 2001
Cited by 182 (7 self)
This paper presents a geometric approach to the construction of low-density parity-check (LDPC) codes. Four classes of LDPC codes are constructed based on the lines and points of Euclidean and projective geometries over finite fields. Codes of these four classes have good minimum distances and their Tanner graphs have girth 6. Finite-geometry LDPC codes can be decoded in various ways, ranging from low to high decoding complexity and from reasonably good to very good performance. They perform very well with iterative decoding. Furthermore, they can be put in either cyclic or quasi-cyclic form. Consequently, their encoding can be achieved in linear time and implemented with simple feedback shift registers. This advantage is not shared by other LDPC codes in general and is important in practice. Finite-geometry LDPC codes can be extended and shortened in various ways to obtain other good LDPC codes. Several techniques of extension and shortening are presented. Long extended finite-geometry LDPC codes have been constructed and they achieve a performance only a few tenths of a decibel away from the Shannon theoretical limit with iterative decoding.
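The smallest projective-geometry example gives a feel for the construction. The sketch below is an illustration under stated assumptions, not the paper's general procedure: it builds the point-line incidence matrix of the projective plane over GF(2) (7 points, 7 lines) from the cyclic shifts of the base line {0, 1, 3} modulo 7, then checks the structural properties that matter for iterative decoding. The function names are hypothetical.

```python
import itertools

def cyclic_incidence_matrix(n, base_line):
    """Rows are the n cyclic shifts of base_line (a set of point indices mod n)."""
    rows = []
    for shift in range(n):
        line = {(p + shift) % n for p in base_line}
        rows.append([1 if i in line else 0 for i in range(n)])
    return rows

def has_four_cycle(H):
    """A Tanner graph has a 4-cycle iff two rows share ones in two or more columns."""
    for r1, r2 in itertools.combinations(H, 2):
        if sum(a & b for a, b in zip(r1, r2)) >= 2:
            return True
    return False

# Assumed example: {0, 1, 3} is a perfect difference set modulo 7.
H = cyclic_incidence_matrix(7, {0, 1, 3})
row_weights = [sum(r) for r in H]
col_weights = [sum(c) for c in zip(*H)]
overlaps = [
    sum(a & b for a, b in zip(r1, r2)) for r1, r2 in itertools.combinations(H, 2)
]
```

Every row and column has weight 3, and any two lines meet in exactly one point, so the Tanner graph has no cycles of length 4. The cyclic structure of the rows is what allows encoding with simple shift registers.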
MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches
IEEE Transactions on Information Theory, 2002
Cited by 144 (10 self)
We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of tree-structured problems, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is met with equality if and only if the tree problems share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original problem. Next we present and analyze two methods for attempting to obtain tight upper bounds: (a) a tree-reweighted message-passing algorithm that is related to but distinct from the max-product (min-sum) algorithm; and (b) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds. Finally, we discuss the conditions that govern when the relaxation is tight, in which case the MAP configuration can be obtained. The analysis described here generalizes naturally to convex combinations of hypertree-structured distributions.