## The Generalized Distributive Law

### BibTeX

@MISC{_thegeneralized,
    author = {Srinivas M. Aji and Robert J. McEliece},
    title = {The Generalized Distributive Law},
    year = {}
}

### Abstract

In this semitutorial paper we discuss a general message passing algorithm, which we call the generalized distributive law (GDL). The GDL is a synthesis of the work of many authors in the information theory, digital communications, signal processing, statistics, and artificial intelligence communities. It includes as special cases the Baum–Welch algorithm, the fast Fourier transform (FFT) on any finite Abelian group, the Gallager–Tanner–Wiberg (GTW) decoding algorithm, Viterbi’s algorithm, the BCJR algorithm, Pearl’s “belief propagation” algorithm, the Shafer–Shenoy probability propagation algorithm, and the turbo decoding algorithm. Although this algorithm is guaranteed to give exact answers only in certain cases (the “junction tree” condition), unfortunately not including the cases of GTW with cycles or turbo decoding, there is much experimental evidence, and a few theorems, suggesting that it often works approximately even when it is not supposed to.

Index Terms: Belief propagation, distributive law, graphical models, junction trees, turbo codes.
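As a rough illustration of the idea behind the name, the Python sketch below marginalizes a product of two local kernels in two ways: by brute force over all variable assignments, and by pushing each "sum" inside the product using the distributive law. The kernels `f` and `g`, the binary variable range, and all function names are hypothetical and chosen only for illustration; they are not taken from the paper. The same code runs in the sum-product semiring and in the min-sum semiring simply by swapping the two operations.

```python
from itertools import product

VALUES = (0, 1)  # assumed binary variables, for illustration only


def f(x, y):
    """Hypothetical local kernel on the variables (x, y)."""
    return 1.0 + x + 2 * y


def g(x, z):
    """Hypothetical local kernel on the variables (x, z)."""
    return 2.0 + 3 * x * z + z


def marginal_brute_force(x, add, mul, zero):
    """Combine f*g over every (y, z) pair: |Y|*|Z| 'multiplications'."""
    total = zero
    for y, z in product(VALUES, repeat=2):
        total = add(total, mul(f(x, y), g(x, z)))
    return total


def marginal_factored(x, add, mul, zero):
    """Use the distributive law: marginalize f over y and g over z
    separately, then combine the two results with one 'multiplication'."""
    sum_f = zero
    for y in VALUES:
        sum_f = add(sum_f, f(x, y))
    sum_g = zero
    for z in VALUES:
        sum_g = add(sum_g, g(x, z))
    return mul(sum_f, sum_g)


# Two semirings: (add, mul, additive identity)
SUM_PRODUCT = (lambda a, b: a + b, lambda a, b: a * b, 0.0)
MIN_SUM = (min, lambda a, b: a + b, float("inf"))

for name, semiring in (("sum-product", SUM_PRODUCT), ("min-sum", MIN_SUM)):
    for x in VALUES:
        brute = marginal_brute_force(x, *semiring)
        fast = marginal_factored(x, *semiring)
        print(name, "x =", x, "brute force:", brute, "factored:", fast)
```

Both routes give the same value for every `x` because each semiring satisfies the distributive law; the factored route just needs fewer operations, which is the saving that message passing on a junction tree exploits at larger scale.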

### Citations

8778 |
Introduction to Algorithms
- Cormen, Leiserson, et al.
- 2009
Citation Context ...fers depends on the relative size of the matrices. For example, if , , , and , the left junction tree requires 15 000 operations and the right junction tree takes 150 000. (This example is taken from [9].) As we discussed in Example 2.6, the matrix multiplication problem is equivalent to a trellis path problem. In particular, if the computations are in the min-sum semiring, the problem is that of fin... |

1321 |
Local computations with probabilities on graphical structures and their application to expert systems
- Lauritzen, Spiegelhalter
- 1988
Citation Context ...cedure more of an art than a science. 7 The whimsical term “moral graph” originally referred to the graph obtained from a DAG by drawing edges between (“marrying”) each of the parents of a given vertex [23]. [AJI AND MCELIECE: THE GENERALIZED DISTRIBUTIVE LAW, p. 335] Fig. 11. The LD graph for the local domains and kernels in Example 2.2. (All edges have weight 1.) There is no junction tree. Fig. 12. The mora... |

1251 |
Optimal decoding of linear codes for minimizing symbol error rate
- Bahl, Cocke, et al.
- 1974
Citation Context ...n this case the local domains can be organized into a junction tree, as illustrated in Fig. 16 for the case . The GDL algorithm, applied to the junction tree of Fig. 16, gives us essentially the BCJR [5] and Viterbi [37], [11] algorithms, respectively. (For Viterbi’s algorithm, we take the negative logarithm of the objective function in (2.5), and use the min-sum semiring, with a single target vertex, ... |

954 | Low-Density Parity-Check Codes
- Gallager
- 1963
Citation Context ...domains can be organized as a junction tree. One such tree is shown in Fig. 14. It can be shown that the GDL, when applied to the junction tree of Fig. 14, yields the Gallager–Tanner–Wiberg algorithm [15], [34], [39] for decoding linear codes defined by cycle-free graphs. Indeed, Fig. 14 is identical to the “Tanner graph” cited by Wiberg [39] for decoding this particular code. Example 4.4: Here we con... |

949 |
An Introduction to Bayesian Networks
- Jensen
- 1995
Citation Context ...ple 2.5. or, using streamlined notation (2.4) A DAG, together with associated random variables whose joint density function factors according to the structure of the DAG, is called a Bayesian network [18]. Let us assume that the two random variables and are observed to have the values and , respectively. The probabilistic inference problem, in this case, is to compute the conditional probabilities of ... |

708 |
An Algorithm for the Machine Calculation of Complex Fourier Series
- Cooley, Tukey
- 1965
Citation Context ...ample 4.3. yields the usual “fast” Hadamard transform. More generally, by extending the method in this example, it is possible to show that the FFT on any finite Abelian group, as described, e.g., in [8] or [31], can be derived from an application of the GDL. 8 Example 4.3: Here we continue Example 2.3. In this case, the local domains can be organized as a junction tree. One such tree is shown in Fig... |

378 |
Statistical inference for probabilistic functions of finite state Markov chains
- Baum, Petrie
- 1966
Citation Context ...gorithm (also known as the -step in the Baum–Welch algorithm) was invented in 1962 by Lloyd Welch, and seems to have first appeared in the unclassified literature in two independent 1966 publications [6], [7]. It appeared explicitly as an algorithm for tracking the states of a Markov chain in the early 1970’s [5], [26] (see also the survey articles [30] and [32]). A similar algorithm (in min-sum form... |

292 | Bucket elimination: A unifying framework for probabilistic inference
- Dechter
- 1996
Citation Context ...troduced by Shafer and Shenoy [33]. A recent book by Jensen [18] is a good introduction to most of this material. A recent unification of many of these concepts called “bucket elimination” appears in [10], and a recent paper by Lauritzen and Jensen [22] abstracts the MPF problem still further, so that the marginalization is done axiomatically, rather than by summation. In any case, by early 1996, the ... |

132 |
The Viterbi Algorithm
- Forney
- 1973
Citation Context ...l domains can be organized into a junction tree, as illustrated in Fig. 16 for the case . The GDL algorithm, applied to the junction tree of Fig. 16, gives us essentially the BCJR [5] and Viterbi [37], [11] algorithms, respectively. (For Viterbi’s algorithm, we take the negative logarithm of the objective function in (2.5), and use the min-sum semiring, with a single target vertex, preferably the “last”... |

115 | Iterative decoding of compound codes by probability propagation in graphical models
- Kschischang
- 1998
Citation Context ...arginalization is done axiomatically, rather than by summation. In any case, by early 1996, the relevance of these AI algorithms had become apparent to researchers in the information theory community [21], [28]. Conversely, the AI community has become excited by the developments in the information theory community [14], [38], which demonstrate that these algorithms can be successful on graphs with cycle... |

114 | A revolution: Belief propagation in graphs with cycles
- Frey, MacKay
- 1998
Citation Context ...AI algorithms had become apparent to researchers in the information theory community [21], [28]. Conversely, the AI community has become excited by the developments in the information theory community [14], [38], which demonstrate that these algorithms can be successful on graphs with cycles. We discuss this in the next section. VII. ITERATIVE AND APPROXIMATE VERSIONS OF THE GDL Although the GDL can ... |

83 |
A Computational Model for Causal and Diagnostic Reasoning in Inference Engines
- Kim, Pearl
- 1983
Citation Context ...ial Intelligence The relevant research in the artificial intelligence (AI) community began relatively late, but it has evolved quickly. The activity began in the 1980’s with the work of Kim and Pearl [20] and Pearl [29]. Pearl’s “belief propagation” algorithm, as it has come to be known, is a message-passing algorithm for solving the probabilistic inference problem on a Bayesian network whose DAG cont... |

82 | Optimal junction trees
- Jensen, Jensen
- 1994
Citation Context ...there are several choices with the same weight, choose one whose complexity, as defined by (5.5), is as small as possible. The tree that results is guaranteed to be a minimum-complexity junction tree [19]. In fact, we used this technique to find minimum-complexity junction trees in Examples 4.1, 4.3, 4.4, and 4.5. We conclude this section with two examples which illustrate the difficulty of finding th... |

81 | Good codes based on very sparse matrices
- MacKay, Neal
Citation Context ...w known that an application of the GDL, or one of its close relatives, to an appropriate junction graph with cycles, gives both the Gallager–Tanner–Wiberg algorithm for low-density parity-check codes [24], [25], [28], [39], and the turbo decoding algorithm [21], [28], [39]. Both of these decoding algorithms have proved to be extraordinarily effective experimentally, despite the fact that there are as yet ... |

44 | Iterative decoding on graphs with a single cycle - Aji, Horn, et al. - 1998 |

42 |
On receiver structures for channels having memory
- Chang, Hancock
- 1966
Citation Context ...hm (also known as the -step in the Baum–Welch algorithm) was invented in 1962 by Lloyd Welch, and seems to have first appeared in the unclassified literature in two independent 1966 publications [6], [7]. It appeared explicitly as an algorithm for tracking the states of a Markov chain in the early 1970’s [5], [26] (see also the survey articles [30] and [32]). A similar algorithm (in min-sum form) app... |

28 | Local computation with valuations from commutative semigroups
- Lauritzen, Jensen
- 1997
Citation Context ... by Jensen [18] is a good introduction to most of this material. A recent unification of many of these concepts called “bucket elimination” appears in [10], and a recent paper by Lauritzen and Jensen [22] abstracts the MPF problem still further, so that the marginalization is done axiomatically, rather than by summation. In any case, by early 1996, the relevance of these AI algorithms had become appar... |

16 | Bayesian networks for pattern classification, data compression, and channel coding - Frey - 1997 |

13 | Iterative min-sum decoding of tail-biting codes
- Aji, Horn, et al.
- 1998
Citation Context ...ily effective experimentally, despite the fact that there are as yet no general theorems that explain their behavior. • Emerging Theory: Single-Cycle Junction Graphs Recently, a number of authors [1]–[3], [12], [38], [39] have studied the behavior of the iterative GDL on junction graphs which have exactly one cycle. It seems fair to say that, at least for the sum-product and the min-sum semirings, th... |

10 |
Graphical Models and Iterative Decoding
- Aji
- 1999
Citation Context ... in (2.5) as it stands, and use the sum–product semiring, and evaluate the objective function at each of the vertices , for . In both cases, the appropriate schedule is fully serial.) 8 For this, see [1], where it is observed that the moral graph for the DFT over a finite Abelian group q is triangulated if and only if q is a cyclic group of prime-power order. In all other cases, it is necessary to tr... |

7 |
Iterative decoding of tail-biting trellises
- Forney, Kschischang, et al.
- 1998
Citation Context ...ffective experimentally, despite the fact that there are as yet no general theorems that explain their behavior. • Emerging Theory: Single-Cycle Junction Graphs Recently, a number of authors [1]–[3], [12], [38], [39] have studied the behavior of the iterative GDL on junction graphs which have exactly one cycle. It seems fair to say that, at least for the sum-product and the min-sum semirings, the iter... |

2 |
Viterbi’s algorithm and matrix multiplication
- Aji, McEliece, et al.
- 1999
Citation Context ...tree corresponds to the parenthesization $(M_1 M_2) M_3$, and the one on the right corresponds to $M_1 (M_2 M_3)$. ...plying a chain of matrices in the min-sum semiring. (This connection is explored in more detail in [4].) V. COMPLEXITY OF THE GDL In this section we will provide complexity estimates for the serial versions of the GDL discussed in Section III. Here by complexity we mean the arithmetic complexity, i.e.... |
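The connection mentioned in the citation contexts above, between multiplying a chain of matrices in the min-sum semiring and finding a shortest trellis path, can be sketched in a few lines of Python. The three 2×2 cost matrices `M1`, `M2`, `M3` and the helper names are hypothetical values chosen for illustration, not data from the paper. The sketch checks that both parenthesizations of the chain give the same result, and that each entry equals the cheapest path cost found by exhaustive search.

```python
from itertools import product


def min_sum_matmul(a, b):
    """Matrix product in the min-sum semiring:
    ordinary addition is replaced by min, ordinary multiplication by +."""
    inner = len(b)
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical edge-cost matrices for a three-stage trellis with two states
# per stage: Mt[i][j] is the cost of moving from state i to state j at stage t.
M1 = [[1, 4], [2, 0]]
M2 = [[3, 1], [0, 5]]
M3 = [[2, 2], [1, 6]]

# The two parenthesizations of the chain, (M1 M2) M3 and M1 (M2 M3).
left = min_sum_matmul(min_sum_matmul(M1, M2), M3)
right = min_sum_matmul(M1, min_sum_matmul(M2, M3))


def brute_force_shortest(i, j):
    """Cheapest path from start state i to end state j, trying every
    pair of intermediate states explicitly."""
    return min(
        M1[i][a] + M2[a][b] + M3[b][j]
        for a, b in product(range(2), repeat=2)
    )


shortest = [[brute_force_shortest(i, j) for j in range(2)] for i in range(2)]
print(left, right, shortest)
```

The two parenthesizations agree because the min-sum matrix product is associative; when the matrices have different shapes, the operation counts of the two orderings can differ, which is the point of the junction tree comparison in the quoted context.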