Results 1 - 10 of 104
Assessing Gene Significance from cDNA Microarray Expression Data via Mixed Models
, 2001
Cited by 68 (3 self)
The determination of a list of differentially expressed genes is a basic objective in many cDNA microarray experiments. We present a statistical approach that allows direct control over the percentage of false positives in such a list and, under certain reasonable assumptions, improves on existing methods with respect to the percentage of false negatives. The method accommodates a wide variety of experimental designs and can simultaneously assess significant differences between multiple types of biological samples. Two interconnected mixed linear models are central to the method and provide a flexible means to properly account for variability both across and within genes. The mixed model also provides a convenient framework for evaluating the statistical power of any particular experimental design and thus enables a researcher to a priori select an appropriate number of replicates. We also suggest some basic graphics for visualizing lists of significant genes. Analyses of published experiments studying human cancer and yeast cells illustrate the results.
Smoothing Spline Models for the Analysis of Nested and Crossed Samples of Curves
- Journal of the American Statistical Association
, 1998
Cited by 56 (1 self)
We introduce a class of models for an additive decomposition of groups of curves stratified by crossed and nested factors, generalizing smoothing splines to such samples by associating them with a corresponding mixed effects model. The models are also useful for imputation of missing data and exploratory analysis of variance. We prove that the best linear unbiased predictors (BLUP) from the extended mixed effects model correspond to solutions of a generalized penalized regression where smoothing parameters are directly related to variance components, and we show that these solutions are natural cubic splines. The model parameters are estimated using a highly efficient implementation of the EM algorithm for restricted maximum likelihood (REML) estimation based on a preliminary eigenvector decomposition. Variability of computed estimates can be assessed with asymptotic techniques or with a novel hierarchical bootstrap resampling scheme for nested mixed effects models. Our methods are applied to menstrual cycle data from studies of reproductive function that measure daily urinary progesterone; the sample of progesterone curves is stratified by cycles nested within subjects nested within conceptive and non-conceptive groups.
Improved statistical tests for differential gene expression by shrinking variance components estimates
- Biostatistics
, 2005
Cited by 37 (4 self)
Combining information across genes in the statistical analysis of microarray data is desirable because of the relatively small number of data points obtained for each individual gene. Here we develop an estimator of the error variance that can borrow information across genes using the James-Stein-Lindley shrinkage concept. A new test statistic (FS) is constructed using this estimator. The new statistic is compared with other statistics used to test for differential expression, namely the gene-specific F test (F1), the pooled-variance F statistic (F3), and a hybrid statistic (F2) that uses the average of the individual and pooled variances. The FS test shows best or nearly best power for detecting differentially expressed genes over a wide range of simulated data in which the variance components associated with individual genes are either homogeneous or heterogeneous. Thus FS provides a powerful and robust approach to test differential expression of genes that utilizes information not available in individual gene testing approaches and does not suffer from biases of the pooled variance approach.
Geoadditive Models
, 2000
Cited by 24 (0 self)
this paper is a recent article on model-based geostatistics by Diggle, Tawn and Moyeed (1998) where pure kriging (i.e. no covariates) is the focus. Our paper inherits some of its aspects: model-based and with mixed model connections. In particular the comment by Bowman (1998) in the ensuing discussion suggested that additive modelling would be a worthwhile extension. This paper essentially follows this suggestion. However, this paper is not the first to combine the notions of geostatistics and additive modelling. References known to us are Kelsall and Diggle (1998), Durban Reguera (1998) and Durban, Hackett, Currie and Newton (2000). Nevertheless, we believe that our approach has a number of attractive features (see (1)-(4) above), not all shared by these references. Section 2 describes the motivating application and data in detail. Section 3 shows how one can express additive models as a mixed model, while Section 4 does the same for kriging and merges the two into the geoadditive model. Issues concerning the amount of smoothing are discussed in Section 5 and inferential aspects are treated in Section 6. Our analysis of the Upper Cape Cod reproductive data is presented in Section 7. Section 8 discusses extension to the generalised context. We close the paper with some discussion in Section 9.
Geometric Ergodicity of Gibbs and Block Gibbs Samplers for a Hierarchical Random Effects Model
, 1998
Cited by 22 (7 self)
We consider fixed scan Gibbs and block Gibbs samplers for a Bayesian hierarchical random effects model with proper conjugate priors. A drift condition given in Meyn and Tweedie (1993, Chapter 15) is used to show that these Markov chains are geometrically ergodic. Showing that a Gibbs sampler is geometrically ergodic is the first step towards establishing central limit theorems, which can be used to approximate the error associated with Monte Carlo estimates of posterior quantities of interest. Thus, our results will be of practical interest to researchers using these Gibbs samplers for Bayesian data analysis. Key words and phrases: Bayesian model, Central limit theorem, Drift condition, Markov chain, Monte Carlo, Rate of convergence, Variance components. AMS 1991 subject classifications: Primary 60J27; secondary 62F15. 1 Introduction. Gelfand and Smith (1990, Section 3.4) introduced the Gibbs sampler for the hierarchical one-way random effects model with proper conjugate priors. Rosen...
General Multi-Level Linear Modelling for Group Analysis in FMRI
- NeuroImage
, 2003
Cited by 22 (6 self)
This paper discusses general modelling of multi-subject and/or multi-session FMRI data. In particular, we show that a two-level mixed-effects model (where parameters of interest at the group level are estimated from parameter and variance estimates from the single-session level) can be made equivalent to a single complete mixed-effects model (where parameters of interest at the group level are estimated directly from all of the original single-sessions' time-series data) if the (co-)variance at the second level is set equal to the sum of the (co-)variances in the single-level form, using the BLUE with known covariances. This result has significant implications for group studies in FMRI, since it shows that the group analysis only requires values of the parameter estimates and their (co-)variance from the first level, generalising the well established 'summary statistics' approach in FMRI. The simple and generalised framework allows for different pre-whitening and different first-level regressors to be used for each subject. The framework incorporates multiple levels and cases such as repeated measures, paired or unpaired t-tests and F-tests at the group level; explicit examples of such models are given in the paper. Using numerical simulations based on typical first-level covariance structures from real FMRI data, we demonstrate that by taking into account lower-level covariances and heterogeneity a substantial increase in higher-level Z-score is possible.
Assortment: An Attribute-Based Approach
- Carnegie Mellon University
, 2000
Cited by 10 (1 self)
Most supermarket categories are cluttered with items or SKUs (stock-keeping units) that differ very little at the attribute level. Previous research has found that reductions (up to 54%) in the number of low-selling SKUs need not affect perceptions of variety, and therefore sales, significantly. In this research, we analyze data from a natural experiment conducted by an online grocer in which 94% of the categories experienced dramatic cuts in the number of SKUs offered, particularly low-selling SKUs. We find sales were indeed affected dramatically, with sales increasing an average of 11% across the 42 categories examined. Sales rose in more than two-thirds of these categories, with nearly half experiencing an increase of 10% or more; 75% of households increased their overall expenditures after the cut in SKUs. In turn, we examined how different types of SKU reductions -- defined by how the cuts affect the available attributes or features of a category (e.g., the number of brands) -- af...
Likelihood Ratio Tests in Linear Mixed Models With One Variance Component
, 2003
Cited by 10 (2 self)
this paper but the results were practically indistinguishable, indicating that our choice of grid provides an accurate approximation
Estimation of a Common Mean and Weighted Means Statistics
- Working Paper WP--2--96, National Institute of Standards and Technology, Statistical Engineering Division
, 1996
Cited by 8 (0 self)
Measurements made by several laboratories may exhibit non-negligible between-laboratory variability, as well as different within-laboratory variances. Also, the number of measurements made at each laboratory often differs. A question of fundamental importance in the analysis of such data is how to form a best consensus mean, and what uncertainty to attach to this estimate. An estimation equation approach due to Mandel and Paule is often used at the National Institute of Standards and Technology (NIST), particularly when certifying standard reference materials. Primary goals of this work are to study the theoretical properties of this method, and to compare it with some alternative methods, in particular with the maximum likelihood estimator. Towards this end, we show that the Mandel-Paule solution can be interpreted as a simplified version of the maximum likelihood method. A class of weighted means statistics is investigated for situations where the number of laboratories is large. This c...
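For readers unfamiliar with the Mandel-Paule estimating equation mentioned in this abstract, the following is a minimal sketch of its commonly stated form, not code from the working paper. It assumes each laboratory reports a mean x_i, a within-laboratory sample variance s_i^2 (strictly positive) and a count n_i, and it solves sum_i (x_i - mu)^2 / (tau^2 + s_i^2/n_i) = k - 1 for the between-laboratory variance tau^2 by bisection, where mu is the correspondingly weighted mean. The function name `mandel_paule` and its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def mandel_paule(means: Sequence[float], variances: Sequence[float],
                 counts: Sequence[int], tol: float = 1e-12,
                 max_iter: int = 200) -> Tuple[float, float]:
    """Return (consensus_mean, between_lab_variance).

    Hypothetical helper: solves the Mandel-Paule equation by bisection.
    Assumes every within-laboratory variance is strictly positive.
    """
    k = len(means)
    # Variance of each laboratory mean: s_i^2 / n_i.
    se2 = [v / n for v, n in zip(variances, counts)]

    def weighted_mean(tau2: float) -> float:
        weights = [1.0 / (tau2 + s) for s in se2]
        return sum(w * x for w, x in zip(weights, means)) / sum(weights)

    def excess(tau2: float) -> float:
        # Weighted sum of squares minus its target value k - 1.
        mu = weighted_mean(tau2)
        return sum((x - mu) ** 2 / (tau2 + s)
                   for x, s in zip(means, se2)) - (k - 1)

    # If the data show no excess spread, truncate tau^2 at zero.
    if excess(0.0) <= 0.0:
        return weighted_mean(0.0), 0.0

    # Bracket the root: excess decreases towards -(k - 1) as tau^2 grows.
    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:
        hi *= 2.0

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    tau2 = 0.5 * (lo + hi)
    return weighted_mean(tau2), tau2
```

With three laboratories reporting means 10, 12 and 11, unit variances and four measurements each, the weights are equal, so the consensus mean is 11 and the equation reduces to 2 / (tau^2 + 0.25) = 2, giving tau^2 = 0.75.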
A survey of Monte Carlo algorithms for maximizing the likelihood of a two-stage hierarchical model
, 2001
Cited by 8 (4 self)
Likelihood inference with hierarchical models is often complicated by the fact that the likelihood function involves intractable integrals. Numerical integration (e.g. quadrature) is an option if the dimension of the integral is low but quickly becomes unreliable as the dimension grows. An alternative approach is to approximate the intractable integrals using Monte Carlo averages. Several different algorithms based on this idea have been proposed. In this paper we discuss the relative merits of simulated maximum likelihood, Monte Carlo EM, Monte Carlo Newton-Raphson and stochastic approximation. Key words and phrases: Efficiency, Monte Carlo EM, Monte Carlo Newton-Raphson, Rate of convergence, Simulated maximum likelihood, Stochastic approximation. All three authors partially supported by NSF Grant DMS-00-72827.
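The idea of replacing an intractable integral with a Monte Carlo average can be illustrated with a small sketch of simulated maximum likelihood. The model here is an assumption chosen for illustration, not one taken from the paper: counts y_i are Poisson with rate exp(beta + u_i), the random effects u_i are N(0, sigma^2), sigma is treated as known, and beta is chosen from a grid. The function names are hypothetical. The same standard normal draws are reused for every candidate beta so that the simulated log-likelihood varies smoothly across the grid.

```python
import math
import random
from typing import Sequence


def poisson_pmf(y: int, rate: float) -> float:
    """Poisson probability of count y, computed on the log scale."""
    return math.exp(-rate + y * math.log(rate) - math.lgamma(y + 1))


def simulated_loglik(ys: Sequence[int], beta: float, sigma: float,
                     draws: Sequence[float]) -> float:
    """Monte Carlo approximation of the marginal log-likelihood.

    Each integral over the random effect u = sigma * z is replaced by an
    average over the supplied standard normal draws z.
    """
    total = 0.0
    for y in ys:
        avg = sum(poisson_pmf(y, math.exp(beta + sigma * z))
                  for z in draws) / len(draws)
        total += math.log(avg)
    return total


def simulated_mle(ys: Sequence[int], sigma: float, grid: Sequence[float],
                  n_draws: int = 2000, seed: int = 0) -> float:
    """Return the grid value of beta with the largest simulated log-likelihood."""
    rng = random.Random(seed)
    # Common random numbers: one set of draws shared by all grid points.
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    return max(grid, key=lambda b: simulated_loglik(ys, b, sigma, draws))
```

A grid search keeps the sketch short; in practice a numerical optimizer would typically be applied to the simulated log-likelihood, and the other algorithms named in the abstract (Monte Carlo EM, Monte Carlo Newton-Raphson, stochastic approximation) use the Monte Carlo draws differently.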

