Hierarchical Priors and Mixture Models, With Application in Regression and Density Estimation
, 1993
A Hierarchical Dirichlet Language Model
- Natural Language Engineering
, 1994
Cited by 67 (3 self)
We discuss a hierarchical probabilistic model whose predictions are similar to those of the popular language modelling procedure known as `smoothing'. A number of interesting differences from smoothing emerge. The insights gained from a probabilistic view of this problem point towards new directions for language modelling. The ideas of this paper are also applicable to other problems such as the modelling of triphones in speech, and DNA and protein sequences in molecular biology. The new algorithm is compared with smoothing on a two million word corpus. The methods prove to be about equally accurate, with the hierarchical model using fewer computational resources.
Contents
1 Introduction
  1.1 The bigram language model with smoothing
  1.2 Any rational predictive procedure can be made Bayesian
2 An explicit model using Dirichlet priors
  2.1 The inferences we will make
  2.2 The likelihood function
  2.3 What prior?
  2.4 A convenient family of priors: Dirichlet distributions
  2.5 ...
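The resemblance to smoothing mentioned in this abstract can be illustrated with a small sketch. It assumes a bigram model in which each context's next-word distribution has a Dirichlet prior with strength `alpha` and mean vector `m`, so that the predictive probability is (count + alpha * m[word]) / (context total + alpha), an interpolation between the observed bigram frequencies and `m`. The function names, the toy corpus, the uniform `m`, and the fixed `alpha` are all illustrative assumptions; a hierarchical model would infer the hyperparameters from the data rather than fix them by hand.

```python
from collections import Counter, defaultdict


def bigram_counts(tokens):
    """Count how often each word follows each context word."""
    counts = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        counts[prev][cur] += 1
    return counts


def dirichlet_predictive(counts, prev, word, alpha, m):
    """P(word | prev) under a Dirichlet prior with strength alpha and mean m.

    Equivalent to interpolating the bigram frequency with m[word], using
    weight total / (total + alpha) on the observed frequency.
    """
    context = counts.get(prev, Counter())  # unseen context: empty counts
    total = sum(context.values())
    return (context[word] + alpha * m[word]) / (total + alpha)


# Toy corpus and hyperparameters (assumptions for illustration only).
tokens = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(tokens))
m = {w: 1.0 / len(vocab) for w in vocab}  # uniform prior mean
counts = bigram_counts(tokens)

p_cat = dirichlet_predictive(counts, "the", "cat", alpha=2.0, m=m)
p_sum = sum(dirichlet_predictive(counts, "the", w, 2.0, m) for w in vocab)
print(f"P(cat | the) = {p_cat:.4f}, total over vocabulary = {p_sum:.4f}")
```

For a context that never occurs in the corpus, the prediction falls back to `m[word]`, which is the same back-off behaviour that smoothing provides.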
Bayesian density regression
- JOURNAL OF THE ROYAL STATISTICAL SOCIETY B
, 2007
Cited by 27 (17 self)
This article considers Bayesian methods for density regression, allowing a random probability distribution to change flexibly with multiple predictors. The conditional response distribution is expressed as a nonparametric mixture of parametric densities, with the mixture distribution changing according to location in the predictor space. A new class of priors for dependent random measures is proposed for the collection of random mixing measures at each location. The conditional prior for the random measure at a given location is expressed as a mixture of a Dirichlet process (DP) distributed innovation measure and neighboring random measures. This specification results in a coherent prior for the joint measure, with the marginal random measure at each location being a finite mixture of DP basis measures. Integrating out the infinite-dimensional collection of mixing measures, we obtain a simple expression for the conditional distribution of the subject-specific random variables, which generalizes the Pólya urn scheme. Properties are considered and a simple Gibbs sampling algorithm is developed for posterior computation. The methods are illustrated using simulated data examples and epidemiologic studies.
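For orientation, the standard Pólya urn scheme that this abstract refers to can be sketched as follows. This is the ordinary Blackwell-MacQueen urn for a single Dirichlet process, not the predictor-dependent generalization the article proposes: after i draws, the next value is a fresh draw from the base measure with probability alpha / (alpha + i), and otherwise a copy of one earlier draw chosen uniformly. The function name `polya_urn`, the standard normal base measure, and the parameter values are assumptions made for illustration.

```python
import random


def polya_urn(n, alpha, base_draw, rng):
    """Draw n values from a Dirichlet process via the Polya urn scheme."""
    draws = []
    for i in range(n):
        # i previous draws exist: a new atom has probability alpha / (alpha + i).
        # For i == 0 this probability is 1, so the first draw is always new.
        if rng.random() < alpha / (alpha + i):
            draws.append(base_draw(rng))
        else:
            # Otherwise repeat an earlier draw, chosen uniformly at random.
            draws.append(rng.choice(draws))
    return draws


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = polya_urn(200, alpha=1.0, base_draw=lambda r: r.gauss(0.0, 1.0), rng=rng)
n_distinct = len(set(samples))
print(f"{len(samples)} draws, {n_distinct} distinct values")
```

The repeated values are what induce clustering in Dirichlet process mixtures: subjects that share a value share a mixture component.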
Computing Nonparametric Hierarchical Models
, 1998
Cited by 20 (2 self)
Bayesian models involving Dirichlet process mixtures are at the heart of the modern nonparametric Bayesian movement. Much of the rapid development of these models in the last decade has been a direct result of advances in simulation-based computational methods. Some of the very early work in this area, circa 1988-1991, focused on the use of such nonparametric ideas and models in applications of otherwise standard hierarchical models. This chapter provides some historical review and perspective on these developments, with a prime focus on the use and integration of such nonparametric ideas in hierarchical models. We illustrate the ease with which the strict parametric assumptions common to most standard Bayesian hierarchical models can be relaxed to incorporate uncertainties about functional forms using Dirichlet process components, partly enabled by the approach to computation using MCMC methods. The resulting methodology is illustrated with two examples taken from an unpub...
A Bayesian model for supervised clustering with the Dirichlet process prior
- Journal of Machine Learning Research
, 2005
Cited by 14 (0 self)
We develop a Bayesian framework for tackling the supervised clustering problem, the generic problem encountered in tasks such as reference matching, coreference resolution, identity uncertainty and record linkage. Our clustering model is based on the Dirichlet process prior, which enables us to define distributions over the countably infinite sets that naturally arise in this problem. We add supervision to our model by positing the existence of a set of unobserved random variables (we call these “reference types”) that are generic across all clusters. Inference in our framework, which requires integrating over infinitely many parameters, is solved using Markov chain Monte Carlo techniques. We present algorithms for both conjugate and non-conjugate priors. We present a simple but general parameterization of our model based on a Gaussian assumption. We evaluate this model on one artificial task and three real-world tasks, comparing it against both unsupervised and state-of-the-art supervised algorithms. Our results show that our model is able to outperform other models across a variety of tasks, performance metrics, and problem settings.
Semiparametric Bayesian Analysis: Selection Models And Meteorological Applications
, 1998
Cited by 11 (1 self)
1. INTRODUCTION
  1.1 Outline
  1.2 Selection Models
  1.3 Modeling Ozone Profiles
  1.4 Dirichlet Processes
    1.4.1 Definition and properties
    1.4.2 Computation with Dirichlet processes
    1.4.3 Applications of Dirichlet processes
2. SEMIPARAMETRIC BAYESIAN ANALYSIS OF SELECTION MODELS
  2.1 Notation and models
  2.2 Some Aspects of the Model
    2.2.1 Sampling plans
...
A Bayesian population model with hierarchical mixture priors applied to blood count data
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1997
Cited by 10 (3 self)
Population pharmacokinetic and pharmacodynamic studies require one to analyze nonlinear growth curves fit to multiple measurements from study subjects. We propose a class of nonlinear population models with nonparametric second-stage priors for analyzing such data. The proposed models apply a flexible class of mixtures to implement the nonparametric second stage. The discussion is based on a pharmacodynamic study involving longitudinal data consisting of hematologic profiles (i.e., blood counts measured over time) of cancer patients undergoing chemotherapy. We describe a full posterior analysis in a Bayesian framework. This includes prediction of future observations (profiles and end points for new patients), estimation of the mean response function for observed individuals, and inference on population characteristics. The mixture model is specified and given a hyperprior distribution by means of a Dirichlet process prior on the mixing measure. Estimation is implemented by a combinat...
Non-parametric Bayesian kernel models
- Discussion Paper 2005-09, Duke University ISDS
, 2007
Cited by 9 (4 self)
Kernel models for classification and regression have emerged as widely applied tools in statistics and machine learning. We discuss a Bayesian framework and theory for kernel methods, providing a new rationalisation of kernel regression based on non-parametric Bayesian models. Functional analytic results ensure that such a non-parametric prior specification induces a class of functions that span the reproducing kernel Hilbert space corresponding to the selected kernel. Bayesian analysis of the model allows for direct and formal inference on the uncertain regression or classification functions. Augmenting the model with Bayesian variable selection priors over kernel bandwidth parameters extends the framework to automatically address the key practical questions of kernel feature selection. Novel, customised MCMC methods are detailed and used in example analyses. The practical benefits and modelling flexibility of the Bayesian kernel framework are illustrated in both simulated and real data examples that address prediction and classification inference with high-dimensional data.
Characterizing the function space for Bayesian kernel models
- Duke University, Institute of Statistics and Decision Sciences
, 2006
Cited by 9 (3 self)
Kernel methods have been very popular in the machine learning literature in the last ten years, mainly in the context of Tikhonov regularization algorithms. In this paper we study a coherent Bayesian kernel model based on an integral operator defined as the convolution of a kernel with a signed measure. Priors on the random signed measures correspond to prior distributions on the functions mapped by the integral operator. We study several classes of signed measures and their image mapped by the integral operator. In particular, we identify a general class of measures whose image is dense in the reproducing kernel Hilbert space (RKHS) induced by the kernel. A consequence of this result is a function theoretic foundation for using non-parametric prior specifications in Bayesian modeling, such as Gaussian process and Dirichlet process prior distributions. We discuss the construction of priors on spaces of signed measures using Gaussian and Lévy processes, with the Dirichlet processes being a special case of the latter. Computational issues involved with sampling from the posterior distribution are outlined for a univariate regression and a high dimensional

