Results 1 - 10
of
413
The Infinite Hidden Markov Model
- Machine Learning
, 2002
"... We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. Th ..."
Abstract
-
Cited by 375 (28 self)
- Add to MetaCart
We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite---consider, for example, symbols being possible words appearing in English text.
Hierarchical Dirichlet processes
- Journal of the American Statistical Association
, 2004
"... program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. 1 We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture comp ..."
Abstract
-
Cited by 328 (44 self)
- Add to MetaCart
program. The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. 1 We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessar-ily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of
Bayesian Density Estimation and Inference Using Mixtures
- Journal of the American Statistical Association
, 1994
"... We describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes. These models provide natural settings for density estimation, and are exemplified by special cases where data are modelled as a sample from mixtures of normal distributions. Efficien ..."
Abstract
-
Cited by 285 (16 self)
- Add to MetaCart
We describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes. These models provide natural settings for density estimation, and are exemplified by special cases where data are modelled as a sample from mixtures of normal distributions. Efficient simulation methods are used to approximate various prior, posterior and predictive distributions. This allows for direct inference on a variety of practical issues, including problems of local versus global smoothing, uncertainty about density estimates, assessment of modality, and the inference on the numbers of components. Also, convergence results are established for a general class of normal mixture models. Keywords: Kernel estimation; Mixtures of Dirichlet processes; Multimodality; Normal mixtures; Posterior sampling; Smoothing parameter estimation * Michael D. Escobar is Assistant Professor, Department of Statistics and Department of Preventive Medicine and Biostatistics, University ...
The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator.
, 1995
"... The two-parameter Poisson-Dirichlet distribution, denoted pd(ff; `), is a distribution on the set of decreasing positive sequences with sum 1. The usual Poisson-Dirichlet distribution with a single parameter `, introduced by Kingman, is pd(0; `). Known properties of pd(0; `), including the Markov ..."
Abstract
-
Cited by 162 (36 self)
- Add to MetaCart
The two-parameter Poisson-Dirichlet distribution, denoted pd(ff; `), is a distribution on the set of decreasing positive sequences with sum 1. The usual Poisson-Dirichlet distribution with a single parameter `, introduced by Kingman, is pd(0; `). Known properties of pd(0; `), including the Markov chain description due to Vershik-Shmidt-Ignatov, are generalized to the two-parameter case. The size-biased random permutation of pd(ff; `) is a simple residual allocation model proposed by Engen in the context of species diversity, and rediscovered by Perman and the authors in the study of excursions of Brownian motion and Bessel processes. For 0 ! ff ! 1, pd(ff; 0) is the asymptotic distribution of ranked lengths of excursions of a Markov chain away from a state whose recurrence time distribution is in the domain of attraction of a stable law of index ff. Formulae in this case trace back to work of Darling, Lamperti and Wendel in the 1950's and 60's. The distribution of ranked lengths of e...
Gibbs Sampling Methods for Stick-Breaking Priors
"... ... In this paper we present two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors. The first type of Gibbs sampler, referred to as a Polya urn Gibbs sampler, is a generalized version of a widely used Gibbs sampling meth ..."
Abstract
-
Cited by 160 (16 self)
- Add to MetaCart
... In this paper we present two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors. The first type of Gibbs sampler, referred to as a Polya urn Gibbs sampler, is a generalized version of a widely used Gibbs sampling method currently employed for Dirichlet process computing. This method applies to stick-breaking priors with a known P'olya urn characterization; that is priors with an explicit and simple prediction rule. Our second method, the blocked Gibbs sampler, is based on a entirely different approach that works by directly sampling values from the posterior of the random measure. The blocked Gibbs sampler can be viewed as a more general approach as it works without requiring an explicit prediction rule. We find that the blocked Gibbs avoids some of the limitations seen with the Polya urn approach and should be simpler for non-experts to use.
Hierarchical topic models and the nested Chinese restaurant process
- Advances in Neural Information Processing Systems
, 2004
"... We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested ..."
Abstract
-
Cited by 127 (19 self)
- Add to MetaCart
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting—which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts. 1
The Infinite Gaussian Mixture Model
- In Advances in Neural Information Processing Systems 12
, 2000
"... In a Bayesian mixture model it is not necessary a priori to limit the number of components to be finite. In this paper an infinite Gaussian mixture model is presented which neatly sidesteps the difficult problem of finding the "right" number of mixture components. Inference in the model is done usin ..."
Abstract
-
Cited by 122 (6 self)
- Add to MetaCart
In a Bayesian mixture model it is not necessary a priori to limit the number of components to be finite. In this paper an infinite Gaussian mixture model is presented which neatly sidesteps the difficult problem of finding the "right" number of mixture components. Inference in the model is done using an efficient parameter-free Markov Chain that relies entirely on Gibbs sampling.
Iterated random functions
- SIAM Review
, 1999
"... Abstract. Iterated random functions are used to draw pictures or simulate large Ising models, among other applications. They offer a method for studying the steady state distribution of a Markov chain, and give useful bounds on rates of convergence in a variety of examples. The present paper surveys ..."
Abstract
-
Cited by 94 (1 self)
- Add to MetaCart
Abstract. Iterated random functions are used to draw pictures or simulate large Ising models, among other applications. They offer a method for studying the steady state distribution of a Markov chain, and give useful bounds on rates of convergence in a variety of examples. The present paper surveys the field and presents some new examples. There is a simple unifying idea: the iterates of random Lipschitz functions converge if the functions are contracting on the average. 1. Introduction. The
Variational inference for Dirichlet process mixtures
- Bayesian Analysis
, 2005
"... Abstract. Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte-Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis prob ..."
Abstract
-
Cited by 90 (12 self)
- Add to MetaCart
Abstract. Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte-Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One class of alternatives is provided by variational methods, a class of deterministic algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003). Thus far, variational methods have mainly been explored in the parametric setting, in particular within the formalism of the exponential family (Attias 2000; Ghahramani and Beal 2001; Blei et al. 2003). In this paper, we present a variational inference algorithm for DP mixtures. We present experiments that compare the algorithm to Gibbs sampling algorithms for DP mixtures of Gaussians and present an application to a large-scale image analysis problem.
Reputation-based framework for high integrity sensor networks
- In SASN ’04: Proceedings of the 2nd ACM workshop on Security of ad hoc and sensor networks
, 2004
"... The traditional approach of providing network security has been to borrow tools from cryptography and authentication. However, we argue that the conventional view of security based on cryptography alone is not sufficient for the unique characteristics and novel misbehaviors encountered in sensor net ..."
Abstract
-
Cited by 82 (6 self)
- Add to MetaCart
The traditional approach of providing network security has been to borrow tools from cryptography and authentication. However, we argue that the conventional view of security based on cryptography alone is not sufficient for the unique characteristics and novel misbehaviors encountered in sensor networks. Fundamental to this is the observation that cryptography cannot prevent malicious or non-malicious insertion of data from internal adversaries or faulty nodes. We believe that in general tools from different domains such as economics, statistics and data analysis will have to be combined with cryptography for the development of trustworthy sensor networks. Following this approach, we propose a reputation-based framework for sensor networks where nodes maintain reputation for other nodes and use it to evaluate their trustworthiness. We will show that this framework provides a scalable, diverse and a generalized approach for countering all types of misbehavior resulting from malicious and faulty nodes. We are currently developing a system within this framework where we employ a Bayesian formulation, specifically a beta reputation system, for reputation representation, updates and integration. We will explain the reasoning behind our design choices, analyzing their pros & cons. We conclude the paper by verifying the efficacy of this system through some preliminary simulation results.

