Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Cited by 705 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
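As a point of reference for the abstract above, an HMM is the simplest DBN: a single discrete state variable per time slice. The following is a minimal sketch of forward filtering for such a model, which runs in time linear in the sequence length T. It is not taken from the thesis; the function name `forward_filter` and the table layout (`init`, `trans`, `emit`) are assumptions made for illustration.

```python
import math


def forward_filter(init, trans, emit, observations):
    """Forward filtering for a discrete HMM (hypothetical helper).

    init[i]     : P(X_1 = i)
    trans[i][j] : P(X_t = j | X_{t-1} = i)
    emit[i][y]  : P(Y_t = y | X_t = i)
    Returns the filtered distributions P(X_t | y_1..y_t) and log P(y_1..y_T).
    """
    n = len(init)
    belief = list(init)
    log_lik = 0.0
    filtered = []
    for t, y in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight each state by the observation likelihood, then normalize.
        unnorm = [belief[j] * emit[j][y] for j in range(n)]
        norm = sum(unnorm)
        log_lik += math.log(norm)
        belief = [u / norm for u in unnorm]
        filtered.append(belief)
    return filtered, log_lik
```

Each time step costs O(n^2) for n states, so the whole pass is O(T) in the sequence length. A factored DBN state would replace the single index `i` with a tuple of variables, which is where the inference algorithms named in the abstract become relevant.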
Has the U.S. Economy Become More Stable? A Bayesian Approach Based on a Markov-Switching Model of Business Cycle
, 1999
Cited by 384 (15 self)
We hope to be able to provide answers to the following questions: 1) Has there been a structural break in postwar U.S. real GDP growth toward more stabilization? 2) If so, when would it have been? 3) What's the nature of the structural break? For this purpose, we employ a Bayesian approach to dealing with structural break at an unknown changepoint in a Markov-switching model of business cycle. Empirical results suggest that there has been a structural break in U.S. real GDP growth toward more stabilization, with the posterior mode of the break date around 1984:1. Furthermore, we find that a narrowing gap between growth rates during recessions and booms is at least as important as a decline in the volatility of shocks. Key Words: Bayes Factor, Gibbs sampling, Marginal Likelihood, Markov-Switching, Stabilization, Structural Break. JEL Classifications: C11, C12, C22, E32. 1. Introduction In the literature, the issue of postwar stabilization of the U.S. economy relative to the prewar period has...
Using simulation methods for Bayesian econometric models: Inference, development and communication
 Econometric Review
, 1999
Cited by 313 (18 self)
This paper surveys the fundamental principles of subjective Bayesian inference in econometrics and the implementation of those principles using posterior simulation methods. The emphasis is on the combination of models and the development of predictive distributions. Moving beyond conditioning on a fixed number of completely specified models, the paper introduces subjective Bayesian tools for formal comparison of these models with as yet incompletely specified models. The paper then shows how posterior simulators can facilitate communication between investigators (for example, econometricians) on the one hand and remote clients (for example, decision makers) on the other, enabling clients to vary the prior distributions and functions of interest employed by investigators. A theme of the paper is the practicality of subjective Bayesian methods. To this end, the paper describes publicly available software for Bayesian inference, model development, and communication and provides illustrations using two simple econometric models. *This paper was originally prepared for the Australasian meetings of the Econometric Society in Melbourne, Australia,
Likelihood inference for discretely observed nonlinear diffusions. Econometrica 69 959–993. MR1839375
 Ann. Statist
, 2001
Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables
 Machine Learning
, 1997
Cited by 188 (12 self)
We discuss Bayesian methods for learning Bayesian networks when data sets are incomplete. In particular, we examine asymptotic approximations for the marginal likelihood of incomplete data given a Bayesian network. We consider the Laplace approximation and the less accurate but more efficient BIC/MDL approximation. We also consider approximations proposed by Draper (1993) and Cheeseman and Stutz (1995). These approximations are as efficient as BIC/MDL, but their accuracy has not been studied in any depth. We compare the accuracy of these approximations under the assumption that the Laplace approximation is the most accurate. In experiments using synthetic data generated from discrete naive-Bayes models having a hidden root node, we find that (1) the BIC/MDL measure is the least accurate, having a bias in favor of simple models, and (2) the Draper and CS measures are the most accurate.
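For readers unfamiliar with the BIC/MDL approximation mentioned in this abstract, the standard form is log p(D | M) ≈ log p(D | θ̂, M) − (d/2) log N, where θ̂ is the maximum-likelihood estimate, d the number of free parameters, and N the number of cases. The sketch below simply evaluates that formula; the function name and the two candidate models are hypothetical and not drawn from the paper.

```python
import math


def bic_log_marginal(max_log_lik, num_params, num_cases):
    """BIC/MDL approximation to the log marginal likelihood (hypothetical helper).

    log p(D | M) ~= log p(D | theta_hat, M) - (d / 2) * log N
    """
    return max_log_lik - 0.5 * num_params * math.log(num_cases)


# Illustrative, made-up numbers: a small model and a larger one that fits
# slightly better. The penalty term grows with the parameter count.
simple_score = bic_log_marginal(max_log_lik=-520.0, num_params=5, num_cases=1000)
complex_score = bic_log_marginal(max_log_lik=-515.0, num_params=20, num_cases=1000)
preferred = "simple" if simple_score > complex_score else "complex"
```

With these made-up inputs the larger model's better fit does not offset its penalty, which illustrates the bias toward simple models that the abstract reports for BIC/MDL.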
Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling
, 1998
Cited by 183 (4 self)
Computing (ratios of) normalizing constants of probability models is a fundamental computational problem for many statistical and scientific studies. Monte Carlo simulation is an effective technique, especially with complex and high-dimensional models. This paper aims to bring to the attention of general statistical audiences some effective methods originating from theoretical physics and at the same time to explore these methods from a more statistical perspective, through establishing theoretical connections and illustrating their uses with statistical problems. We show that the acceptance ratio method and thermodynamic integration are natural generalizations of importance sampling, which is most familiar to statistical audiences. The former generalizes importance sampling through the use of a single “bridge” density and is thus a case of bridge sampling in the sense of Meng and Wong. Thermodynamic integration, which is also known in the numerical analysis literature as Ogata’s method for high-dimensional integration, corresponds to the use of infinitely many and continuously connected bridges (and thus a “path”). Our path sampling formulation offers more flexibility and thus potential efficiency to thermodynamic integration, and the search of optimal paths turns out to have close connections with the Jeffreys prior density and the Rao and Hellinger distances between two densities. We provide an informative theoretical example as well as two empirical examples (involving 17- to 70-dimensional integrations) to illustrate the potential and implementation of path sampling. We also discuss some open problems.
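The importance sampling baseline that this abstract generalizes can be sketched in a few lines. If q1 and q2 are unnormalized densities with normalizing constants c1 and c2, and x is drawn from p2 = q2 / c2, then the expectation of q1(x) / q2(x) equals c1 / c2. The toy densities, sample size, and function name below are assumptions chosen for illustration; bridge and path sampling themselves are not implemented here.

```python
import math
import random


def importance_ratio(q1, q2, sample_p2, n, rng):
    """Estimate c1 / c2 by averaging q1(x) / q2(x) over draws x ~ p2 = q2 / c2."""
    total = 0.0
    for _ in range(n):
        x = sample_p2(rng)
        total += q1(x) / q2(x)
    return total / n


# Toy problem: two unnormalized Gaussian kernels with standard deviations
# 1 and 2, so the true ratio is sqrt(2*pi) / (2*sqrt(2*pi)) = 0.5.
q1 = lambda x: math.exp(-0.5 * x * x)
q2 = lambda x: math.exp(-x * x / 8.0)

rng = random.Random(0)
estimate = importance_ratio(q1, q2, lambda r: r.gauss(0.0, 2.0), 20000, rng)
```

The sampling density here has heavier tails than the target, which keeps the weights bounded. When the two densities overlap poorly the plain estimator degrades, which is the situation the bridge and path constructions in the abstract are meant to address.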
Marginal Likelihood From the Metropolis-Hastings Output
 Journal of the American Statistical Association
, 2001
Cited by 183 (16 self)
This article provides a framework for estimating the marginal likelihood for the purpose of Bayesian model comparisons. The approach extends and completes the method presented in Chib (1995) by overcoming the problems associated with the presence of intractable full conditional densities. The proposed method is developed in the context of MCMC chains produced by the Metropolis–Hastings algorithm, whose building blocks are used both for sampling and marginal likelihood estimation, thus economizing on pre-run tuning effort and programming. Experiments involving the logit model for binary data, hierarchical random effects model for clustered Gaussian data, Poisson regression model for clustered count data, and the multivariate probit model for correlated binary data, are used to illustrate the performance and implementation of the method. These examples demonstrate that the method is practical and widely applicable.
Analysis of multivariate probit models
 BIOMETRIKA
, 1998
Cited by 157 (13 self)
This paper provides a practical simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm. A practical approach for the computation of Bayes factors from the simulation output is also developed. The methods are applied to a dataset with a bivariate binary response, to a four-year longitudinal dataset from the Six Cities study of the health effects of air pollution and to a seven-variate binary response dataset on the labour supply of married women from the Panel Survey of Income Dynamics.
Pachinko allocation: DAG-structured mixture models of topic correlations
 In Proceedings of the 23rd International Conference on Machine Learning
, 2006
Cited by 153 (8 self)
Latent Dirichlet allocation (LDA) and other related topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, LDA does not capture correlations between topics. In this paper, we introduce the pachinko allocation model (PAM), which captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). The leaves of the DAG represent individual words in the vocabulary, while each interior node represents a correlation among its children, which may be words or other interior nodes (topics). PAM provides a flexible alternative to recent work by Blei and Lafferty (2006), which captures correlations only between pairs of topics. Using text data from newsgroups, historic NIPS proceedings and other research paper corpora, we show improved performance of PAM in document classification, likelihood of held-out data, the ability to support finer-grained topics, and