Results 1  10
of
62
Learning in graphical models
, 2004
"... Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for ..."
Abstract

Cited by 612 (11 self)
 Add to MetaCart
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve largescale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in largescale data analysis problems. We also present examples of graphical models in bioinformatics, errorcontrol coding and language processing. Key words and phrases: Probabilistic graphical models, junction tree algorithm, sumproduct algorithm, Markov chain Monte Carlo, variational inference, bioinformatics, errorcontrol coding.
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
Bursty and Hierarchical Structure in Streams
, 2002
"... A fundamental problem in text data mining is to extract meaningful structure from document streams that arrive continuously over time. Email and news articles are two natural examples of such streams, each characterized by topics that appear, grow in intensity for a period of time, and then fade aw ..."
Abstract

Cited by 260 (2 self)
 Add to MetaCart
A fundamental problem in text data mining is to extract meaningful structure from document streams that arrive continuously over time. Email and news articles are two natural examples of such streams, each characterized by topics that appear, grow in intensity for a period of time, and then fade away. The published literature in a particular research field can be seen to exhibit similar phenomena over a much longer time scale. Underlying much of the text mining work in this area is the following intuitive premise  that the appearance of a topic in a document stream is signaled by a "burst of activity," with certain features rising sharply in frequency as the topic emerges.
Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data
 IN ICML
, 2004
"... In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when longrange dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linearchain cond ..."
Abstract

Cited by 122 (11 self)
 Add to MetaCart
In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when longrange dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linearchain conditional random fields (CRFs) in which each time slice contains a set of state variables and edgesa distributed state representation as in dynamic Bayesian networks (DBNs)and parameters are tied across slices. Since exact
Layered representations for learning and inferring office activity from multiple sensory channels
, 2004
"... ..."
A General Model for Online Probabilistic Plan Recognition
 In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI
, 2003
"... We present a new general framework for online probabilistic plan recognition called the Abstract Hidden Markov Memory Model (AHMEM). The new model is an extension of the existing Abstract Hidden Markov Model to allow the policy to have internal memory which can be updated in a Markov fashion. ..."
Abstract

Cited by 81 (1 self)
 Add to MetaCart
We present a new general framework for online probabilistic plan recognition called the Abstract Hidden Markov Memory Model (AHMEM). The new model is an extension of the existing Abstract Hidden Markov Model to allow the policy to have internal memory which can be updated in a Markov fashion. We show that the AHMEM can represent a richer class of probabilistic plans, and at the same time derive an efficient algorithm for plan recognition in the AHMEM based on the RaoBlackwellised Particle Filter approximate inference method.
Representing hierarchical POMDPs as DBNs for multiscale robot localization
, 2004
"... We explore the advantages of representing hierarchical partially observable Markov decision processes (HPOMDPs) as dynamic Bayesian networks (DBNs). In particular, we focus on the special case of using HPOMDPs to represent multiresolution spatial maps for indoor robot navigation. Our results show ..."
Abstract

Cited by 29 (2 self)
 Add to MetaCart
We explore the advantages of representing hierarchical partially observable Markov decision processes (HPOMDPs) as dynamic Bayesian networks (DBNs). In particular, we focus on the special case of using HPOMDPs to represent multiresolution spatial maps for indoor robot navigation. Our results show that a DBN representation of HPOMDPs can train significantly faster than the original learning algorithm for HPOMDPs or the equivalent flat POMDP, and requires much less data. In addition, the DBN formulation can easily be extended to parameter tying and factoring of variables, which further reduces the time and sample complexity. This enables us to apply HPOMDP methods to much larger problems than previously possible. 1.
Unsupervised Discovery Of Multilevel Statistical Video Structures Using Hierarchical Hidden Markov Models
 IN PROC. ICME
, 2003
"... Structure elements in a time sequence (e.g. video) are repetitive segments with consistent deterministic or stochastic characteristics. While most existing work in detecting structures follow a supervised paradigm, we propose a fully unsupervised statistical solution in this paper. We present a unif ..."
Abstract

Cited by 28 (4 self)
 Add to MetaCart
Structure elements in a time sequence (e.g. video) are repetitive segments with consistent deterministic or stochastic characteristics. While most existing work in detecting structures follow a supervised paradigm, we propose a fully unsupervised statistical solution in this paper. We present a unified approach to structure discovery from long video sequences as simultaneously finding the statistical descriptions of structure and locating segments that matches the descriptions. We model the multilevel statistical structure as hierarchical hidden Markov models, and present efficient algorithms for learning both the parameters and the model structure. When tested on a specific domain, soccer video, the unsupervised learning scheme achieves very promising results: it automatically discovers the statistical descriptions of highlevel structures, and at the same time achieves even slightly better accuracy in detecting discovered structures in unlabelled videos than a supervised approach designed with domain knowledge and trained with comparable hidden Markov models.
Bayesian information extraction network
 In Proc.18th Int. Joint Conf. Artifical Intelligence
, 2003
"... Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, ..."
Abstract

Cited by 27 (0 self)
 Add to MetaCart
Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, we employ a DBN in an information extraction task. We show how to assemble wealth of emerging linguistic instruments for shallow parsing, syntactic and semantic tagging, morphological decomposition, named entity recognition etc. in order to incrementally build a robust information extraction system. Our method outperforms previously published results on an established benchmark domain.
Hierarchical POMDP Controller Optimization by Likelihood Maximization
"... Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. (2006) recently showed that the hierarchy discovery problem can be framed as a nonconvex optimization problem. However, the inherent computational difficulty of solving such an optimi ..."
Abstract

Cited by 27 (3 self)
 Add to MetaCart
Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. (2006) recently showed that the hierarchy discovery problem can be framed as a nonconvex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to realworld problems. In another line of research, Toussaint et al. (2006) developed a method to solve planning problems by maximumlikelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on nonconvex optimization. 1