Results 1  10
of
78
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Turbo decoding as an instance of Pearl’s belief propagation algorithm
 IEEE Journal on Selected Areas in Communications
, 1998
"... Abstract—In this paper, we will describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pear ..."
Abstract

Cited by 309 (15 self)
 Add to MetaCart
Abstract—In this paper, we will describe the close connection between the now celebrated iterative turbo decoding algorithm of Berrou et al. and an algorithm that has been well known in the artificial intelligence community for a decade, but which is relatively unknown to information theorists: Pearl’s belief propagation algorithm. We shall see that if Pearl’s algorithm is applied to the “belief network ” of a parallel concatenation of two or more codes, the turbo decoding algorithm immediately results. Unfortunately, however, this belief diagram has loops, and Pearl only proved that his algorithm works when there are no loops, so an explanation of the excellent experimental performance of turbo decoding is still lacking. However, we shall also show that Pearl’s algorithm can be used to routinely derive previously known iterative, but suboptimal, decoding algorithms for a number of other errorcontrol systems, including Gallager’s
Exploiting Causal Independence in Bayesian Network Inference
 Journal of Artificial Intelligence Research
, 1996
"... A new method is proposed for exploiting causal independencies in exact Bayesian network inference. ..."
Abstract

Cited by 157 (9 self)
 Add to MetaCart
A new method is proposed for exploiting causal independencies in exact Bayesian network inference.
Multiresolution markov models for signal and image processing
 Proceedings of the IEEE
, 2002
"... This paper reviews a significant component of the rich field of statistical multiresolution (MR) modeling and processing. These MR methods have found application and permeated the literature of a widely scattered set of disciplines, and one of our principal objectives is to present a single, coheren ..."
Abstract

Cited by 122 (18 self)
 Add to MetaCart
This paper reviews a significant component of the rich field of statistical multiresolution (MR) modeling and processing. These MR methods have found application and permeated the literature of a widely scattered set of disciplines, and one of our principal objectives is to present a single, coherent picture of this framework. A second goal is to describe how this topic fits into the even larger field of MR methods and concepts–in particular making ties to topics such as wavelets and multigrid methods. A third is to provide several alternate viewpoints for this body of work, as the methods and concepts we describe intersect with a number of other fields. The principle focus of our presentation is the class of MR Markov processes defined on pyramidally organized trees. The attractiveness of these models stems from both the very efficient algorithms they admit and their expressive power and broad applicability. We show how a variety of methods and models relate to this framework including models for selfsimilar and 1/f processes. We also illustrate how these methods have been used in practice. We discuss the construction of MR models on trees and show how questions that arise in this context make contact with wavelets, state space modeling of time series, system and parameter identification, and hidden
An Introduction to Factor Graphs
 IEEE SIGNAL PROCESSING MAG., JAN. 2004
, 2004
"... A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summaryproduct algorithm (or belief/probability ..."
Abstract

Cited by 121 (34 self)
 Add to MetaCart
A large variety of algorithms in coding, signal processing, and artificial intelligence may be viewed as instances of the summaryproduct algorithm (or belief/probability
Learning with mixtures of trees
 Journal of Machine Learning Research
, 2000
"... This paper describes the mixturesoftrees model, a probabilistic model for discrete multidimensional domains. Mixturesoftrees generalize the probabilistic trees of Chow and Liu [6] in a different and complementary direction to that of Bayesian networks. We present efficient algorithms for learnin ..."
Abstract

Cited by 109 (2 self)
 Add to MetaCart
This paper describes the mixturesoftrees model, a probabilistic model for discrete multidimensional domains. Mixturesoftrees generalize the probabilistic trees of Chow and Liu [6] in a different and complementary direction to that of Bayesian networks. We present efficient algorithms for learning mixturesoftrees models in maximum likelihood and Bayesian frameworks. We also discuss additional efficiencies that can be obtained when data are “sparse, ” and we present data structures and algorithms that exploit such sparseness. Experimental results demonstrate the performance of the model for both density estimation and classification. We also discuss the sense in which treebased classifiers perform an implicit form of feature selection, and demonstrate a resulting insensitivity to irrelevant attributes.
Perspectives on the Theory and Practice of Belief Functions
 International Journal of Approximate Reasoning
, 1990
"... The theory of belief functions provides one way to use mathematical probability in subjective judgment. It is a generalization of the Bayesian theory of subjective probability. When we use the Bayesian theory to quantify judgments about a question, we must assign probabilities to the possible answer ..."
Abstract

Cited by 86 (7 self)
 Add to MetaCart
The theory of belief functions provides one way to use mathematical probability in subjective judgment. It is a generalization of the Bayesian theory of subjective probability. When we use the Bayesian theory to quantify judgments about a question, we must assign probabilities to the possible answers to that question. The theory of belief functions is more flexible; it allows us to derive degrees of belief for a question from probabilities for a related question. These degrees of belief may or may not have the mathematical properties of probabilities; how much they differ from probabilities will depend on how closely the two questions are related. Examples of what we would now call belieffunction reasoning can be found in the late seventeenth and early eighteenth centuries, well before Bayesian ideas were developed. In 1689, George Hooper gave rules for combining testimony that can be recognized as special cases of Dempster's rule for combining belief functions (Shafer 1986a). Similar rules were formulated by Jakob Bernoulli in his Ars Conjectandi, published posthumously in 1713, and by JohannHeinrich Lambert in his Neues Organon, published in 1764 (Shafer 1978). Examples of belieffunction reasoning can also be found in more recent work, by authors
Topological Parameters for timespace tradeoff
 Artificial Intelligence
, 1996
"... In this paper we propose a family of algorithms combining treeclustering with conditioning that trade space for time. Such algorithms are useful for reasoning in probabilistic and deterministic networks as well as for accomplishing optimization tasks. By analyzing the problem structure it will be p ..."
Abstract

Cited by 58 (12 self)
 Add to MetaCart
In this paper we propose a family of algorithms combining treeclustering with conditioning that trade space for time. Such algorithms are useful for reasoning in probabilistic and deterministic networks as well as for accomplishing optimization tasks. By analyzing the problem structure it will be possible to select from a spectrum the algorithm that best meets a given timespace specification. 1 INTRODUCTION Topologybased algorithms for constraint satisfaction and probabilistic reasoning fall into two distinct classes. One class is centered on treeclustering, the other on cyclecutset decomposition. Treeclustering involves transforming the original problem into a treelike problem that can then be solved by a specialized treesolving algorithm [ Mackworth and Freuder, 1985; Pearl, 1986 ] . The treeclustering algorithm is time and space exponential in the induced width (also called tree width) of the problem's graph. The transforming algorithm identifies subproblems that together ...
On the Convergence of Iterative Decoding on Graphs with a Single Cycle
 In Proc. 1998 ISIT
, 1998
"... It is now understood [7, 8] that the turbo decoding algorithm is an instance of a probability propagation algorithm (PPA) on a graph with many cycles. However, PPAtype algorithms are known to give exact results only when the underlying graph is cyclefree. Thus it is important to study the "approxi ..."
Abstract

Cited by 44 (1 self)
 Add to MetaCart
It is now understood [7, 8] that the turbo decoding algorithm is an instance of a probability propagation algorithm (PPA) on a graph with many cycles. However, PPAtype algorithms are known to give exact results only when the underlying graph is cyclefree. Thus it is important to study the "approximate correctness" of PPA on graphs with cycles. In this paper we make a first step by discussing the behavior of an PPA in graphs with a single cycle. This work is directly relevant to the study of iterative decoding of tailbiting codes, whose underlying graph has just one cycle [3], [12]. First, we shall show that for strictly positive local kernels, the iterations of the PPA will always converge to the same fixed point regardless of the scheduling order used. Moreover, the length of the cycle does not play a role in this convergence. Secondly, we shall generalize a result of McEliece and Rodemich [9], by showing that if the hidden variables in the cycle are binaryvalued, a decision based...
Lazy Propagation in Junction Trees
 In Proc. 14th Conf. on Uncertainty in Artificial Intelligence
, 1998
"... The efficiency of algorithms using secondary structures for probabilistic inference in Bayesian networks can be improved by exploiting independence relations induced by evidence and the direction of the links in the original network. In this paper we present an algorithm that exploits the independen ..."
Abstract

Cited by 36 (8 self)
 Add to MetaCart
The efficiency of algorithms using secondary structures for probabilistic inference in Bayesian networks can be improved by exploiting independence relations induced by evidence and the direction of the links in the original network. In this paper we present an algorithm that exploits the independences relations induced by evidence and the direction of the links in the original network to reduce both time and space costs. Instead of multiplying the conditional probability distributions for the various cliques, we store a script specifying which potentials to multiply when a message is to be produced. The performance improvement of the algorithm is emphasized through empirical evaluations involving large real world Bayesian networks, and we compare the method with the Hugin and ShaferShenoy inference algorithms. 1 Introduction It has for a long time been a puzzle why "standard" inference algorithms for Bayesian networks did not really use the direction of the links in the network. By ...