Results 1–10 of 11
Nonlinear Models Using Dirichlet Process Mixtures
"... We introduce a new nonlinear model for classification, in which we model the joint distribution of response variable, y, and covariates, x, nonparametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relations ..."
Abstract

Cited by 20 (0 self)
We introduce a new nonlinear model for classification, in which we model the joint distribution of response variable, y, and covariates, x, nonparametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component, with different regression coefficients. We use simulated data to compare the performance of this new approach to alternative methods such as multinomial logit (MNL) models, decision trees, and support vector machines. We also evaluate our approach on two classification problems: identifying the folding class of protein sequences and detecting Parkinson’s disease. Our model can sometimes improve predictive accuracy. Moreover, by grouping observations into subpopulations (i.e., mixture components), our model can sometimes provide insight into hidden structure in the data.
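The mechanism this abstract describes, locally linear components blending into a globally nonlinear regression, can be sketched with fixed, hypothetical component parameters (the paper learns them nonparametrically via a Dirichlet process; here they are hard-coded purely for illustration):

```python
import numpy as np

def mixture_regression(x, weights, means, sds, coefs, intercepts):
    """E[y | x] under a mixture of component-wise linear regressions."""
    # responsibilities p(component | x), via Bayes' rule on Gaussian p(x | component)
    px_k = np.array([w * np.exp(-0.5 * ((x - m) / s) ** 2) / s
                     for w, m, s in zip(weights, means, sds)])
    resp = px_k / px_k.sum(axis=0)
    # within each component the prediction is linear in x
    preds = np.array([a * x + b for a, b in zip(coefs, intercepts)])
    # the responsibility-weighted blend is nonlinear in x
    return (resp * preds).sum(axis=0)

x = np.linspace(-3.0, 3.0, 7)
y_hat = mixture_regression(x, weights=[0.5, 0.5], means=[-1.5, 1.5],
                           sds=[1.0, 1.0], coefs=[1.0, -1.0],
                           intercepts=[0.0, 0.0])
# each component is linear, yet y_hat is tent-shaped across x
```

With two components whose regression coefficients differ (slopes +1 and -1), the overall regression function bends where the responsibilities cross over, which is exactly the source of nonlinearity the abstract points to.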
Variable Selection in Nonparametric Random Effects Models
"... In analyzing longitudinal or clustered data with a mixed effects model (Laird and Ware, 1982), one may be concerned about violations of normality. Such violations can potentially impact subset selection for the fixed and random effects components of the model, inferences on the heterogeneity structu ..."
Abstract

Cited by 2 (2 self)
In analyzing longitudinal or clustered data with a mixed effects model (Laird and Ware, 1982), one may be concerned about violations of normality. Such violations can potentially impact subset selection for the fixed and random effects components of the model, inferences on the heterogeneity structure, and the accuracy of predictions. This article focuses on Bayesian methods for subset selection in nonparametric random effects models in which one is uncertain about the predictors to be included and the distribution of their random effects. We characterize the unknown distribution of the individual-specific regression coefficients using a weighted sum of Dirichlet process (DP) distributed latent variables. By using carefully chosen mixture priors for coefficients in the base distributions of the component DPs, we allow fixed and random effects to be effectively dropped out of the model. A stochastic search Gibbs sampler is developed for posterior computation, and the methods are illustrated using simulated data and real data from a multi-laboratory bioassay study.
BAYESIAN METHODS TO IMPUTE MISSING COVARIATES FOR CAUSAL INFERENCE AND MODEL SELECTION
, 2008
"... This thesis presents new approaches to deal with missing covariate data in two situations; matching in observational studies and model selection for generalized linear models. In observational studies, inferences about treatment effects are often affected by confounding covariates. Analysts can redu ..."
Abstract
This thesis presents new approaches to deal with missing covariate data in two situations: matching in observational studies and model selection for generalized linear models. In observational studies, inferences about treatment effects are often affected by confounding covariates. Analysts can reduce bias due to differences in control and treated units’ observed covariates using propensity score matching, which results in a matched control group with similar characteristics to the treated group. Propensity scores are typically estimated from the data using a logistic regression. When covariates are partially observed, missing values can be filled in using multiple imputation. Analysts can estimate propensity scores from the imputed data sets to find a matched control set. Typically, in observational studies, covariates are spread thinly over a large space. It is not always clear what an appropriate imputation model for the missing data should be. Implausible imputations can influence the matches selected and hence the estimate of the treatment effect. In propensity score matching, units tend to be selected from among those lying in the treated units’ covariate space.
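The matching step this abstract refers to can be illustrated with a toy sketch; the greedy 1:1 nearest-neighbor rule without replacement and the example scores below are illustrative assumptions, not the thesis's procedure:

```python
# A toy sketch of 1:1 nearest-neighbor matching on estimated propensity
# scores: each treated unit is paired with the not-yet-matched control
# whose score is closest. Scores here are invented example values.

def match_on_score(treated_scores, control_scores):
    """Return, for each treated unit, the index of its matched control."""
    available = list(range(len(control_scores)))
    matches = []
    for t in treated_scores:
        j = min(available, key=lambda i: abs(control_scores[i] - t))
        matches.append(j)
        available.remove(j)   # matching without replacement
    return matches

matches = match_on_score([0.8, 0.3], [0.1, 0.35, 0.75, 0.9])
# 0.8 pairs with control 2 (score 0.75); 0.3 then pairs with control 1 (0.35)
```

Under multiple imputation, this pairing would be repeated per imputed data set, which is how implausible imputations can change which controls end up in the matched set.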
Robust Estimation of the Correlation Matrix of Longitudinal … (Statistics and Computing manuscript)
"... Abstract We propose a doublerobust procedure for modeling the correlation matrix of a longitudinal dataset. It is based on an alternative Cholesky decomposition of the form Σ = DLL ⊤ D where D is a diagonal matrix proportional to the square roots of the diagonal entries of Σ and L is a unit lowert ..."
Abstract
We propose a double-robust procedure for modeling the correlation matrix of a longitudinal dataset. It is based on an alternative Cholesky decomposition of the form Σ = DLL⊤D, where D is a diagonal matrix proportional to the square roots of the diagonal entries of Σ and L is a unit lower-triangular matrix determining solely the correlation matrix. The first robustness is with respect to model misspecification for the innovation variances in D, and the second is robustness to outliers in the data. The latter is handled using heavy-tailed multivariate t-distributions with unknown degrees of freedom. We develop a Fisher scoring algorithm for computing the maximum likelihood estimator of the parameters when the nonredundant and unconstrained entries of (L, D) are modeled parsimoniously using covariates. We compare our results with those based on the modified Cholesky decomposition of the form LD²L⊤ using simulations and a real dataset.
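One assumed way to obtain the factorization Σ = DLL⊤D named above is from the standard Cholesky factor, pulling its diagonal out as D; this is a sketch of the decomposition itself, not code from the paper:

```python
import numpy as np

def dlld_decomposition(sigma):
    """Factor sigma as D @ L @ L.T @ D with D diagonal, L unit lower-triangular."""
    C = np.linalg.cholesky(sigma)   # sigma = C @ C.T, C lower-triangular
    D = np.diag(np.diag(C))         # innovation scale factors on the diagonal
    L = np.linalg.inv(D) @ C        # rescaled rows give diag(L) == 1
    return D, L

sigma = np.array([[4.0, 2.0, 1.0],
                  [2.0, 3.0, 1.5],
                  [1.0, 1.5, 2.0]])
D, L = dlld_decomposition(sigma)
assert np.allclose(D @ L @ L.T @ D, sigma)   # exact reconstruction
assert np.allclose(np.diag(L), 1.0)          # L is unit lower-triangular
```

Because D carries the scales while L alone determines the correlation structure, modeling the entries of D and L separately, as the abstract describes, is what makes the two robustness properties separable.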
, 2006
"... SUMMARY. We address the problem of selecting which variables should be included in the fixed and random components of logistic mixed effects models for correlated data. A fully Bayesian variable selection is implemented using a stochastic search Gibbs sampler to estimate the exact modelaveraged pos ..."
Abstract
We address the problem of selecting which variables should be included in the fixed and random components of logistic mixed effects models for correlated data. A fully Bayesian variable selection is implemented using a stochastic search Gibbs sampler to estimate the exact model-averaged posterior distribution. This approach automatically identifies subsets of predictors having nonzero fixed effect coefficients or nonzero random effects variance, while allowing uncertainty in the model selection process. Default priors are proposed for the variance components and an efficient parameter expansion Gibbs sampler is developed for posterior computation. The approach is illustrated using simulated data and an epidemiologic example.
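The core stochastic-search move can be illustrated for a single candidate predictor; the spike-and-slab setup below (point mass at zero versus a normal slab, known noise variance) is a generic textbook version of such a step, not this paper's exact sampler:

```python
import numpy as np

def inclusion_probability(y, x, sigma2=1.0, slab_var=10.0, prior_incl=0.5):
    """P(include | y, x) for y = beta*x + noise, with beta ~ N(0, slab_var) slab."""
    xtx, xty = x @ x, x @ y
    # posterior variance of beta under the slab
    post_var = 1.0 / (xtx / sigma2 + 1.0 / slab_var)
    # log Bayes factor of slab vs point-mass spike (beta integrated out)
    log_bf = (0.5 * np.log(post_var / slab_var)
              + 0.5 * post_var * (xty / sigma2) ** 2)
    odds = np.exp(log_bf) * prior_incl / (1.0 - prior_incl)
    return odds / (1.0 + odds)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
p_signal = inclusion_probability(2.0 * x + rng.normal(size=200), x)  # true effect
p_noise = inclusion_probability(rng.normal(size=200), x)             # no effect
```

A Gibbs sampler would draw the inclusion indicator as Bernoulli with this probability at each sweep; averaging the draws over sweeps is what yields the model-averaged posterior inclusion probabilities the abstract mentions.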
Bayesian Models for Variable Selection that Incorporate Biological Information
"... Variable selection has been the focus of much research in recent years. Bayesian methods have found many successful applications, particularly in situations where the amount of measured variables can be much greater than the number of observations. One such example is the analysis of genomics data. ..."
Abstract
Variable selection has been the focus of much research in recent years. Bayesian methods have found many successful applications, particularly in situations where the number of measured variables can be much greater than the number of observations. One such example is the analysis of genomics data. In this paper we first review Bayesian variable selection methods for linear settings, including regression and classification models. We focus in particular on recent prior constructions that have been used for the analysis of genomic data and briefly describe two novel applications that integrate different sources of biological information into the analysis of experimental data. Next, we address variable selection for a different modeling context, namely mixture models. We address both clustering and discriminant analysis settings and conclude with an application to gene expression data for patients affected by leukemia.
MODEL SELECTION, COVARIANCE SELECTION AND BAYES CLASSIFICATION VIA SHRINKAGE
, 2006
"... The naive Bayes classifier (NB) has exhibited its “mysterious ” but outstanding classification ability in practice, in spite of its often unrealistic conditional independence assumption. This simple assumption implies the adoption of a diagonal structure for the underlying classspecific precision ..."
Abstract
The naive Bayes classifier (NB) has exhibited a “mysterious” but outstanding classification ability in practice, in spite of its often unrealistic conditional independence assumption. This simple assumption implies the adoption of a diagonal structure for the underlying class-specific precision matrices. However, the NB leaves the interrelationships among covariates unrevealed. In this dissertation, we extend the NB from the perspectives of covariance modeling and classification. Due to the positive-definiteness constraint and the number of parameters growing rapidly with dimension, covariance estimation in a multivariate normal population has been a classic but challenging statistical problem. Sparse shrinkage covariance/precision matrix estimation has been adopted as an important principle in covariance/precision matrix modeling. However, many existing models can only shrink the covariance/precision matrix toward a predefined diagonal structure. We model a precision matrix via its Cholesky decomposition in terms of a compositional regression coefficient matrix and error precisions. Our approach aims at estimating
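The equivalence noted at the start of this abstract, conditional independence as a diagonal class-specific covariance (equivalently, diagonal precision), can be made concrete with a small Gaussian naive Bayes scorer; the class parameters and test point are invented toy values:

```python
import numpy as np

def gaussian_nb_log_posterior(x, means, variances, priors):
    """Log class scores under diagonal (independence) covariances."""
    scores = []
    for mu, var, pi in zip(means, variances, priors):
        # log-density factorizes over features because the covariance is diagonal
        ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        scores.append(np.log(pi) + ll)
    return np.array(scores)

means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
variances = [np.array([1.0, 1.0]), np.array([1.0, 1.0])]
scores = gaussian_nb_log_posterior(np.array([1.9, 2.1]), means,
                                   variances, priors=[0.5, 0.5])
label = int(np.argmax(scores))   # the point sits near (2, 2), so class 1 wins
```

Replacing the per-feature `variances` vectors with full class-specific covariance matrices is exactly the generalization the dissertation pursues: the diagonal case is NB, and off-diagonal structure is what NB leaves unrevealed.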
, 1205–1212, doi: 10.1111/j.1365-2664.2008.01487.x © 2008 The Authors. Journal compilation © 2008 British Ecological Society
"... Testing the use of interviews as a tool for monitoring trends in the harvesting of wild species ..."
An exploration of fixed and random effects selection for longitudinal binary outcomes in the presence of nonignorable dropout
"... We explore a Bayesian approach to fixed and random effects selection for longitudinal binary outcomes that are subject to missing data caused by dropouts. We show via analytic results for a simple example that nonignorable missing data lead to biased parameter estimates and thus result in selection ..."
Abstract
We explore a Bayesian approach to fixed and random effects selection for longitudinal binary outcomes that are subject to missing data caused by dropouts. We show via analytic results for a simple example that nonignorable missing data lead to biased parameter estimates and thus result in selection of the wrong effects asymptotically, and we confirm these results via simulation for more complex settings. By jointly modeling the longitudinal binary data with the dropout process, one is able to correct the bias in estimation and selection of fixed and random effects when the missing data are nonignorable. We illustrate the approach using a clinical trial for acute ischemic stroke.