Results 1–10 of 1,037
Factor Graphs and the Sum-Product Algorithm
IEEE Transactions on Information Theory, 1998
"... A factor graph is a bipartite graph that expresses how a "global" function of many variables factors into a product of "local" functions. Factor graphs subsume many other graphical models including Bayesian networks, Markov random fields, and Tanner graphs. Following one simple computational rule, t ..."
Abstract

Cited by 1163 (67 self)
 Add to MetaCart
Abstract: A factor graph is a bipartite graph that expresses how a "global" function of many variables factors into a product of "local" functions. Factor graphs subsume many other graphical models, including Bayesian networks, Markov random fields, and Tanner graphs. Following one simple computational rule, the sum-product algorithm operates in factor graphs to compute, either exactly or approximately, various marginal functions by distributed message-passing in the graph. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform algorithms.
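To make the message-passing rule concrete, here is a minimal sum-product sketch in Python on a hypothetical three-variable chain. The factors f12 and f23 and all numbers are invented for illustration, and the result is checked against brute-force marginalization:

```python
import numpy as np

# Minimal sum-product sketch on a three-variable chain (hypothetical factors):
#   p(x1, x2, x3) ∝ f12(x1, x2) * f23(x2, x3),  all variables binary.
# Messages flow leaf-to-root and back; each marginal is the product of the
# messages arriving at that variable node.

f12 = np.array([[1.0, 0.5],
                [0.5, 2.0]])   # f12[x1, x2]
f23 = np.array([[2.0, 1.0],
                [1.0, 3.0]])   # f23[x2, x3]

# Forward messages (from x1 toward x3).
m_f12_to_x2 = f12.sum(axis=0)            # sum over x1
m_f23_to_x3 = f23.T @ m_f12_to_x2        # sum over x2, weighted by incoming msg

# Backward messages (from x3 toward x1).
m_f23_to_x2 = f23.sum(axis=1)            # sum over x3
m_f12_to_x1 = f12 @ m_f23_to_x2          # sum over x2, weighted by incoming msg

def normalize(v):
    return v / v.sum()

# Marginals = product of incoming messages at each variable node.
p_x1 = normalize(m_f12_to_x1)
p_x2 = normalize(m_f12_to_x2 * m_f23_to_x2)
p_x3 = normalize(m_f23_to_x3)

# Check against brute-force marginalization of the joint.
joint = np.einsum('ij,jk->ijk', f12, f23)
joint /= joint.sum()
assert np.allclose(p_x2, joint.sum(axis=(0, 2)))
print(p_x1, p_x2, p_x3)
```

On this loop-free graph the messages are exact; the turbo-decoding entry below concerns what happens when the same rule is run on graphs with loops.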
A Bayesian method for the induction of probabilistic networks from data
Machine Learning, 1992
"... Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of ..."
Abstract

Cited by 1081 (27 self)
 Add to MetaCart
Abstract: This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.
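As a rough sketch of the closed-form score this kind of method uses, the snippet below computes a Cooper-Herskovits style log marginal likelihood for one node given a candidate parent set, assuming complete discrete data and uniform Dirichlet priors. The toy dataset and function name are ours, not the paper's:

```python
from math import lgamma
from itertools import product

# Sketch of the closed-form Bayesian score in the Cooper-Herskovits style:
# the log marginal likelihood of one node's conditional distribution given
# its parents, under uniform Dirichlet priors and complete discrete data.

def k2_node_score(data, child, parents, arity):
    """log P(data for `child` | parent set):
       sum_j [ log (r-1)! - log (N_j + r - 1)! + sum_k log N_jk! ]"""
    r = arity[child]
    score = 0.0
    parent_states = list(product(*(range(arity[p]) for p in parents)))
    for pstate in parent_states:
        # Count child outcomes within this parent configuration.
        rows = [row for row in data
                if all(row[p] == s for p, s in zip(parents, pstate))]
        n_jk = [sum(1 for row in rows if row[child] == k) for k in range(r)]
        n_j = sum(n_jk)
        score += lgamma(r) - lgamma(n_j + r)       # (r-1)! / (N_j + r - 1)!
        score += sum(lgamma(c + 1) for c in n_jk)  # product of N_jk!
    return score

# Toy data: columns are binary variables X0, X1.
data = [(0, 0), (0, 0), (1, 1), (1, 1), (1, 0)]
arity = [2, 2]

# Compare "X1 has parent X0" against "X1 has no parents": higher score wins.
print(k2_node_score(data, child=1, parents=[0], arity=arity))
print(k2_node_score(data, child=1, parents=[], arity=arity))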
Good Error-Correcting Codes based on Very Sparse Matrices
1999
"... We study two families of errorcorrecting codes defined in terms of very sparse matrices. "MN" (MacKayNeal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both cod ..."
Abstract

Cited by 513 (25 self)
 Add to MetaCart
Abstract: We study two families of error-correcting codes defined in terms of very sparse matrices. "MN" (MacKay-Neal) codes are recently invented, and "Gallager codes" were first investigated in 1962, but appear to have been largely forgotten, in spite of their excellent properties. The decoding of both codes can be tackled with a practical sum-product algorithm. We prove that these codes are "very good," in that sequences of codes exist which, when optimally decoded, achieve information rates up to the Shannon limit. This result holds not only for the binary-symmetric channel but also for any channel with symmetric stationary ergodic noise. We give experimental results for binary-symmetric channels and Gaussian channels demonstrating that practical performance substantially better than that of standard convolutional and concatenated codes can be achieved; indeed, the performance of Gallager codes is almost as close to the Shannon limit as that of turbo codes.
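The full sum-product decoder is beyond a short sketch, but the toy below shows the iterative flavor of decoding on a sparse parity-check matrix, using a simple hard-decision bit-flipping rule in the spirit of Gallager's early decoders rather than the paper's own algorithm. The small matrix H and the single-error channel are invented for illustration:

```python
import numpy as np

# Toy illustration of iterative decoding on a sparse parity-check matrix.
# This is a hard-decision bit-flipping rule, not the full sum-product
# decoder the paper analyses; the (7,4) Hamming-style matrix is ours.

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=int)

def bit_flip_decode(y, H, max_iters=20):
    x = y.copy()
    for _ in range(max_iters):
        syndrome = H @ x % 2
        if not syndrome.any():
            return x                      # all parity checks satisfied
        # For each bit, count how many unsatisfied checks it touches.
        unsat = syndrome @ H
        worst = unsat.max()
        if worst == 0:
            break
        x = (x + (unsat == worst)) % 2    # flip the most-suspect bits
    return x

codeword = np.zeros(7, dtype=int)          # the all-zero word is a codeword
received = codeword.copy()
received[2] ^= 1                           # single bit error on the channel
print(bit_flip_decode(received, H))        # recovers the all-zero codeword
```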
The Infinite Hidden Markov Model
Machine Learning, 2002
"... We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. Th ..."
Abstract

Cited by 488 (33 self)
 Add to MetaCart
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite: consider, for example, symbols being possible words appearing in English text.
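Here is a hedged sketch of the generative mechanism viewed as a two-level Polya urn: transitions mostly reuse existing counts, occasionally defer to a shared oracle, and the oracle occasionally creates a brand-new state. This simplifies the paper's scheme by omitting the self-transition hyperparameter, and all names and defaults are ours:

```python
import random

# Two-level Polya-urn sketch of infinite-HMM transitions: from state i we
# mostly reuse counted transitions n[i][j]; with weight `beta` we defer to
# a shared oracle, which mostly reuses globally popular states m[j] but
# with weight `gamma` creates a new state. (Self-transition bias omitted.)

def sample_next_state(i, n, m, beta=1.0, gamma=1.0, rng=random):
    row = n.setdefault(i, {})
    total = sum(row.values())
    u = rng.random() * (total + beta)
    if u < total:
        # Reuse an existing transition proportionally to its count.
        for j, c in row.items():
            u -= c
            if u < 0:
                break
    else:
        # Oracle draw: globally popular states vs. a brand-new state.
        m_total = sum(m.values())
        v = rng.random() * (m_total + gamma)
        if v < m_total:
            for j, c in m.items():
                v -= c
                if v < 0:
                    break
        else:
            j = max(m, default=-1) + 1     # allocate a fresh state label
        m[j] = m.get(j, 0) + 1
    row[j] = row.get(j, 0) + 1
    return j

# Generate a short trajectory; the state space grows as needed.
n, m, state = {}, {0: 1}, 0
path = [state]
for _ in range(30):
    state = sample_next_state(state, n, m)
    path.append(state)
print(path)
```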
On Bayesian analysis of mixtures with an unknown number of components
Journal of the Royal Statistical Society, Series B, 1997
Turbo decoding as an instance of Pearl’s belief propagation algorithm
IEEE Journal on Selected Areas in Communications, 1998
"... Abstract—In this paper, we will describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pear ..."
Abstract

Cited by 309 (15 self)
 Add to MetaCart
Abstract: In this paper, we will describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pearl’s belief propagation algorithm. We shall see that if Pearl’s algorithm is applied to the “belief network” of a parallel concatenation of two or more codes, the turbo decoding algorithm immediately results. Unfortunately, however, this belief diagram has loops, and Pearl only proved that his algorithm works when there are no loops, so an explanation of the excellent experimental performance of turbo decoding is still lacking. However, we shall also show that Pearl’s algorithm can be used to routinely derive previously known iterative, but suboptimal, decoding algorithms for a number of other error-control systems, including Gallager’s low-density parity-check codes.
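A full turbo decoder is too large for a sketch, but the paper's central observation, that Pearl's algorithm can simply be run on a graph with loops, is easy to illustrate. Below, loopy belief propagation on a toy three-variable cycle (potentials invented for the example) is compared with exact marginals computed by brute force:

```python
import numpy as np

# Loopy belief propagation on a deliberately loopy graph: a single cycle
# x0 - x1 - x2 - x0 of binary variables with pairwise factors (ours, toy).
# Pearl's algorithm has no general guarantee here, yet its beliefs land
# close to the exact marginals, echoing turbo decoding's empirical success.

f = {  # f[(i, j)][xi, xj], one factor per edge of the cycle
    (0, 1): np.array([[2.0, 1.0], [1.0, 2.0]]),
    (1, 2): np.array([[1.0, 3.0], [3.0, 1.0]]),
    (2, 0): np.array([[2.0, 1.0], [1.0, 1.0]]),
}
edges = list(f) + [(j, i) for (i, j) in f]          # directed messages
pot = {(i, j): f[(i, j)] if (i, j) in f else f[(j, i)].T for (i, j) in edges}

msg = {e: np.ones(2) for e in edges}
for _ in range(50):                                  # parallel updates
    new = {}
    for (i, j) in edges:
        # Product of messages into i from neighbors other than j.
        incoming = np.ones(2)
        for (k, i2) in edges:
            if i2 == i and k != j:
                incoming = incoming * msg[(k, i)]
        m = pot[(i, j)].T @ incoming                 # sum over x_i
        new[(i, j)] = m / m.sum()
    msg = new

def belief(i):
    b = np.ones(2)
    for (k, i2) in edges:
        if i2 == i:
            b = b * msg[(k, i)]
    return b / b.sum()

# Exact marginals by brute force for comparison.
joint = np.einsum('ab,bc,ca->abc', f[(0, 1)], f[(1, 2)], f[(2, 0)])
joint /= joint.sum()
exact = [joint.sum(axis=tuple(a for a in range(3) if a != i)) for i in range(3)]
for i in range(3):
    print(belief(i), exact[i])
```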
Probabilistic Horn abduction and Bayesian networks
Artificial Intelligence, 1993
"... This paper presents a simple framework for Hornclause abduction, with probabilities associated with hypotheses. The framework incorporates assumptions about the rule base and independence assumptions amongst hypotheses. It is shown how any probabilistic knowledge representable in a discrete Bayesia ..."
Abstract

Cited by 298 (37 self)
 Add to MetaCart
Abstract: This paper presents a simple framework for Horn-clause abduction, with probabilities associated with hypotheses. The framework incorporates assumptions about the rule base and independence assumptions amongst hypotheses. It is shown how any probabilistic knowledge representable in a discrete Bayesian belief network can be represented in this framework. The main contribution is in finding a relationship between logical and probabilistic notions of evidential reasoning. This provides a useful representation language in its own right, providing a compromise between heuristic and epistemic adequacy. It also shows how Bayesian networks can be extended beyond a propositional language. This paper also shows how a language with only (unconditionally) independent hypotheses can represent any probabilistic knowledge, and argues that it is better to invent new hypotheses to explain dependence rather than having to worry about dependence in the language.
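As a toy illustration of the mapping the abstract describes, the sketch below encodes a two-node belief network as Horn clauses guarded by independent hypotheses and recovers a marginal by summing over hypothesis worlds. The fire/alarm network and all numbers are ours, not the paper's:

```python
from itertools import product

# Sketch: a tiny Bayesian network rendered as Horn clauses with independent
# hypotheses. Each CPT row becomes a hypothesis with a prior; a query's
# probability is the prior mass of the hypothesis worlds that derive it.
#
#   fire ~ Bernoulli(0.1)
#   alarm | fire     ~ Bernoulli(0.9)
#   alarm | not fire ~ Bernoulli(0.05)

hypotheses = {          # independent hypotheses and their priors
    'fire': 0.1,
    'alarm_if_fire': 0.9,
    'alarm_if_no_fire': 0.05,
}

def derives_alarm(assign):
    # Horn clauses: alarm :- fire, alarm_if_fire.
    #               alarm :- not fire, alarm_if_no_fire.
    if assign['fire'] and assign['alarm_if_fire']:
        return True
    if not assign['fire'] and assign['alarm_if_no_fire']:
        return True
    return False

# P(alarm) = total weight of hypothesis worlds where the clauses derive it.
names = list(hypotheses)
p_alarm = 0.0
for bits in product([True, False], repeat=len(names)):
    assign = dict(zip(names, bits))
    w = 1.0
    for h, on in assign.items():
        w *= hypotheses[h] if on else 1.0 - hypotheses[h]
    if derives_alarm(assign):
        p_alarm += w
print(p_alarm)   # 0.1*0.9 + 0.9*0.05 = 0.135
```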
Context-Specific Independence in Bayesian Networks
1996
"... Bayesiannetworks provide a languagefor qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution, eases knowledge acquisition, and supports effective inference algorithms. ..."
Abstract

Cited by 296 (30 self)
 Add to MetaCart
Abstract: Bayesian networks provide a language for qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution, eases knowledge acquisition, and supports effective inference algorithms.
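A small sketch of the idea, with names and numbers invented for the example: when a child's distribution ignores some parents in certain contexts, a tree-structured conditional distribution is far more compact than the full table:

```python
# Context-specific independence (CSI): P(alarm=1 | burglary, earthquake,
# season) would need 2*2*4 = 16 rows as a table, but in the context
# burglary=1 the alarm ignores the other parents, so a small decision
# tree captures the same distribution.

def p_alarm(burglary, earthquake, season):
    # Tree-structured CPT exploiting CSI.
    if burglary:                 # context burglary=1: other parents irrelevant
        return 0.95
    if earthquake:               # context burglary=0, earthquake=1
        return 0.30
    return 0.01                  # default leaf

# Full-table representation for contrast: 16 entries, mostly duplicates.
table = {(b, e, s): p_alarm(b, e, s)
         for b in (0, 1) for e in (0, 1) for s in range(4)}
print(len(table), len(set(table.values())))   # 16 entries, only 3 distinct
```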
Bucket Elimination: A Unifying Framework for Reasoning
"... Bucket elimination is an algorithmic framework that generalizes dynamic programming to accommodate many problemsolving and reasoning tasks. Algorithms such as directionalresolution for propositional satisfiability, adaptiveconsistency for constraint satisfaction, Fourier and Gaussian elimination ..."
Abstract

Cited by 278 (62 self)
 Add to MetaCart
Abstract: Bucket elimination is an algorithmic framework that generalizes dynamic programming to accommodate many problem-solving and reasoning tasks. Algorithms such as directional resolution for propositional satisfiability, adaptive consistency for constraint satisfaction, Fourier and Gaussian elimination for solving linear equalities and inequalities, and dynamic programming for combinatorial optimization can all be accommodated within the bucket-elimination framework. Many probabilistic inference tasks can likewise be expressed as bucket-elimination algorithms. These include belief updating, finding the most probable explanation, and expected utility maximization. These algorithms share the same performance guarantees; all are time and space exponential in the induced width of the problem's interaction graph. While elimination strategies make extensive demands on memory, a contrasting class of algorithms called "conditioning search" requires only linear space. Algorithms in this class split a problem into subproblems by instantiating a subset of variables, called a conditioning set or cutset. Typical examples of conditioning search algorithms are backtracking (in constraint satisfaction) and branch and bound (for combinatorial optimization). The paper presents the bucket-elimination framework as a unifying theme across probabilistic and deterministic reasoning tasks and shows how conditioning search can be augmented to systematically trade space for time.
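As a minimal illustration of the framework applied to belief updating, the sketch below runs bucket elimination on a toy three-variable chain, processing one bucket per variable and summing the eliminated variable out of the bucket's product. The factors and the ordering are invented for the example:

```python
import numpy as np

# Bucket elimination on a toy chain: p(x1, x2, x3) ∝ f1(x1) f12(x1,x2)
# f23(x2,x3). Each factor goes into the bucket of its highest variable
# under the ordering x1 < x2 < x3; buckets are processed in turn, and the
# eliminated variable is summed out of the bucket's product.

f1  = np.array([0.3, 0.7])                     # f1[x1]
f12 = np.array([[0.9, 0.1], [0.2, 0.8]])       # f12[x1, x2]
f23 = np.array([[0.6, 0.4], [0.5, 0.5]])       # f23[x2, x3]

# Bucket of x1: {f1, f12}. Multiply and sum out x1 -> message over x2.
lam_x2 = np.einsum('a,ab->b', f1, f12)

# Bucket of x2: {f23, lam_x2}. Multiply and sum out x2 -> message over x3.
lam_x3 = np.einsum('b,bc->c', lam_x2, f23)

p_x3 = lam_x3 / lam_x3.sum()
print(p_x3)

# Sanity check against brute-force marginalization of the joint.
joint = np.einsum('a,ab,bc->abc', f1, f12, f23)
assert np.allclose(p_x3, joint.sum(axis=(0, 1)) / joint.sum())
```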