Results 1  10
of
359
Learning in graphical models
 STATISTICAL SCIENCE
, 2004
"... Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for ..."
Abstract

Cited by 806 (10 self)
 Add to MetaCart
(Show Context)
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in largescale data analysis problems. We also present examples of graphical models in bioinformatics, errorcontrol coding and language processing.
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 770 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms
 IEEE Transactions on Information Theory
, 2005
"... Important inference problems in statistical physics, computer vision, errorcorrecting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems t ..."
Abstract

Cited by 585 (13 self)
 Add to MetaCart
(Show Context)
Important inference problems in statistical physics, computer vision, errorcorrecting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems that is exact when the factor graph is a tree, but only approximate when the factor graph has cycles. We show that BP fixed points correspond to the stationary points of the Bethe approximation of the free energy for a factor graph. We explain how to obtain regionbased free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms. We emphasize the conditions a free energy approximation must satisfy in order to be a “valid ” or “maxentnormal ” approximation. We describe the relationship between four different methods that can be used to generate valid approximations: the “Bethe method, ” the “junction graph method, ” the “cluster variation method, ” and the “region graph method.” Finally, we explain how to tell whether a regionbased approximation, and its corresponding GBP algorithm, is likely to be accurate, and describe empirical results showing that GBP can significantly outperform BP.
Correctness of belief propagation in Gaussian graphical models of arbitrary topology
 NEURAL COMPUTATION
, 1999
"... Local "belief propagation" rules of the sort proposed byPearl [12] are guaranteed to converge to the correct posterior probabilities in singly connected graphical models. Recently, a number of researchers have empirically demonstrated good performance of "loopy belief propagation&q ..."
Abstract

Cited by 296 (7 self)
 Add to MetaCart
Local "belief propagation" rules of the sort proposed byPearl [12] are guaranteed to converge to the correct posterior probabilities in singly connected graphical models. Recently, a number of researchers have empirically demonstrated good performance of "loopy belief propagation"  using these same rules on graphs with loops. Perhaps the most dramatic instance is the near Shannonlimit performance of "Turbo codes", whose decoding algorithm is equivalentto loopy belief propagation. Except for the
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
"... Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finite ..."
Abstract

Cited by 264 (5 self)
 Add to MetaCart
(Show Context)
Abstract—An overview of statistical and informationtheoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discretetime finitestate homogeneous Markov chain observed through a discretetime memoryless invariant channel. In recent years, the work of Baum and Petrie on finitestate finitealphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximumlikelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes. These processes generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms—Baum–Petrie algorithm, entropy ergodic theorems, finitestate channels, hidden Markov models, identifiability, Kalman filter, maximumlikelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality. I.
The Bayes Net Toolbox for MATLAB
 Computing Science and Statistics
, 2001
"... The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the ..."
Abstract

Cited by 250 (1 self)
 Add to MetaCart
The Bayes Net Toolbox (BNT) is an opensource Matlab package for directed graphical models. BNT supports many kinds of nodes (probability distributions), exact and approximate inference, parameter and structure learning, and static and dynamic models. BNT is widely used in teaching and research: the web page has received over 28,000 hits since May 2000. In this paper, we discuss a broad spectrum of issues related to graphical models (directed and undirected), and describe, at a highlevel, how BNT was designed to cope with them all. We also compare BNT to other software packages for graphical models, and to the nascent OpenBayes effort.
On the Optimality of Solutions of the MaxProduct Belief Propagation Algorithm in Arbitrary Graphs
, 2001
"... Graphical models, suchasBayesian networks and Markov random fields, represent statistical dependencies of variables by a graph. The maxproduct "belief propagation" algorithm is a localmessage passing algorithm on this graph that is known to converge to a unique fixed point when the gra ..."
Abstract

Cited by 241 (13 self)
 Add to MetaCart
Graphical models, suchasBayesian networks and Markov random fields, represent statistical dependencies of variables by a graph. The maxproduct "belief propagation" algorithm is a localmessage passing algorithm on this graph that is known to converge to a unique fixed point when the graph is a tree. Furthermore, when the graph is a tree, the assignment based on the fixedpoint yields the most probable a posteriori (MAP) values of the unobserved variables given the observed ones. Recently, good
An Introduction to Factor Graphs
 IEEE SIGNAL PROCESSING MAG., JAN. 2004
, 2004
"... A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summaryproduct algorithm (or belief/probability ..."
Abstract

Cited by 197 (34 self)
 Add to MetaCart
A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summaryproduct algorithm (or belief/probability
Regular and Irregular Progressive EdgeGrowth Tanner Graphs
 IEEE TRANS. INFORM. THEORY
, 2003
"... We propose a general method for constructing Tanner graphs having a large girth by progressively establishing edges or connections between symbol and check nodes in an edgebyedge manner, called progressive edgegrowth (PEG) construction. Lower bounds on the girth of PEG Tanner graphs and on the mi ..."
Abstract

Cited by 193 (0 self)
 Add to MetaCart
We propose a general method for constructing Tanner graphs having a large girth by progressively establishing edges or connections between symbol and check nodes in an edgebyedge manner, called progressive edgegrowth (PEG) construction. Lower bounds on the girth of PEG Tanner graphs and on the minimum distance of the resulting lowdensity paritycheck (LDPC) codes are derived in terms of parameters of the graphs. The PEG construction attains essentially the same girth as Gallager's explicit construction for regular graphs, both of which meet or exceed the ErdosSachs bound. Asymptotic analysis of a relaxed version of the PEG construction is presented. We describe an empirical approach using a variant of the "downhill simplex" search algorithm to design irregular PEG graphs for short codes with fewer than a thousand of bits, complementing the design approach of "density evolution" for larger codes. Encoding of LDPC codes based on the PEG construction is also investigated. We show how to exploit the PEG principle to obtain LDPC codes that allow linear time encoding. We also investigate regular and irregular LDPC codes using PEG Tanner graphs but allowing the symbol nodes to take values over GF(q), q > 2. Analysis and simulation demonstrate that one can obtain better performance with increasing field size, which contrasts with previous observations.
MAP estimation via agreement on trees: Messagepassing and linear programming
, 2002
"... We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of treestructured distributions, we obtain an upper bound ..."
Abstract

Cited by 191 (9 self)
 Add to MetaCart
(Show Context)
We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of treestructured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a treerelaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a treereweighted maxproduct messagepassing algorithm that is related to but distinct from the maxproduct algorithm. In this way, we establish a connection between a certain LP relaxation of the modefinding problem, and a reweighted form of the maxproduct (minsum) messagepassing algorithm.