Linear models and empirical Bayes methods for assessing differential expression in microarray experiments
 STAT. APPL. GENET. MOL. BIOL
, 2004
Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments
 STATISTICA SINICA
, 2002
Cited by 256 (10 self)
DNA microarrays are a new and promising biotechnology which allows the monitoring of expression levels in cells for thousands of genes simultaneously. The present paper describes statistical methods for the identification of differentially expressed genes in replicated cDNA microarray experiments. Although it is not the main focus of the paper, new methods for the important preprocessing steps of image analysis and normalization are proposed. Given suitably normalized data, the biological question of differential expression is restated as a problem in multiple hypothesis testing: the simultaneous test for each gene of the null hypothesis of no association between the expression levels and responses or covariates of interest. Differentially expressed genes are identified based on adjusted p-values for a multiple testing procedure which strongly controls the family-wise Type I error rate and takes into account the dependence structure between the gene expression levels. No specific parametric form is assumed for the distribution of the test statistics and a permutation procedure is used to estimate adjusted p-values. Several data displays are suggested for the visual identification of differentially expressed genes and of important features of these genes. The above methods are applied to microarray data from a study of gene expression in the livers of mice with very low HDL cholesterol levels. The genes identified using data from multiple slides are compared to those identified by recently published single-slide methods.
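A minimal Python sketch of one permutation-based procedure of the kind this abstract describes: step-down maxT adjusted p-values in the style of Westfall and Young. The abstract does not name the procedure or the test statistic, so the choice of maxT, the Welch t-statistic, the two-group design, and the function names `welch_t` and `maxt_adjusted_pvalues` are all assumptions made for illustration. The sketch enumerates every relabelling of the samples, which is feasible only for small sample sizes.

```python
import itertools
import math
import statistics


def welch_t(x, y):
    """Welch two-sample t-statistic (assumed test statistic, for illustration)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se if se > 0 else 0.0


def maxt_adjusted_pvalues(data, labels):
    """Step-down maxT adjusted p-values by complete enumeration of relabellings.

    data   -- list of genes, each a list of expression values (one per sample)
    labels -- list of 0/1 group labels, one per sample
    """
    n_samples = len(labels)
    n_ones = sum(labels)

    def abs_stats(lab):
        return [
            abs(welch_t([v for v, g in zip(row, lab) if g == 1],
                        [v for v, g in zip(row, lab) if g == 0]))
            for row in data
        ]

    observed = abs_stats(labels)
    # Genes ordered from most to least significant by observed statistic.
    order = sorted(range(len(data)), key=lambda i: observed[i], reverse=True)

    counts = [0] * len(data)
    n_perm = 0
    for ones in itertools.combinations(range(n_samples), n_ones):
        lab = [1 if i in ones else 0 for i in range(n_samples)]
        perm = abs_stats(lab)
        # Successive maxima, built up from the least significant gene.
        running = 0.0
        for rank in range(len(order) - 1, -1, -1):
            running = max(running, perm[order[rank]])
            if running >= observed[order[rank]] - 1e-12:
                counts[rank] += 1
        n_perm += 1

    adjusted = [c / n_perm for c in counts]
    # Enforce monotonicity so adjusted p-values follow the observed ordering.
    for rank in range(1, len(adjusted)):
        adjusted[rank] = max(adjusted[rank], adjusted[rank - 1])

    result = [0.0] * len(data)
    for rank, gene in enumerate(order):
        result[gene] = adjusted[rank]
    return result


if __name__ == "__main__":
    # Hypothetical toy data: 3 genes, 4 samples per group.
    toy = [
        [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1],
        [2.0, 2.1, 1.9, 2.2, 2.05, 1.95, 2.15, 2.0],
        [3.0, 2.5, 3.5, 2.8, 3.1, 2.9, 3.3, 2.7],
    ]
    print(maxt_adjusted_pvalues(toy, [0, 0, 0, 0, 1, 1, 1, 1]))
```

Because each relabelling is applied to all genes at once, the permutation distribution of the maxima retains the dependence between genes, which is the property the abstract emphasizes. With larger experiments a random sample of relabellings would normally replace the full enumeration.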
Use of within-array replicate spots for assessing differential expression in microarray experiments
 Bioinformatics
, 2005
Cited by 86 (3 self)
Motivation. Spotted arrays are often printed with probes in duplicate or triplicate, but current methods for assessing differential expression are not able to make full use of the resulting information. Usual practice is to average the duplicate or triplicate results for each probe before assessing differential expression. This loses valuable information about genewise variability. Results. A method is proposed for extracting more information from within-array replicate spots in microarray experiments by estimating the strength of the correlation between them. The method involves fitting separate linear models to the expression data for each gene but with a common value for the between-replicate correlation. The method greatly improves the precision with which the genewise variances are estimated and thereby improves inference methods designed to identify differentially expressed genes. The method may be combined with empirical Bayes methods for moderating the genewise variances between genes. The method is validated using data from a microarray experiment involving calibration and ratio control spots in conjunction with spiked-in RNA. Comparing results for calibration and ratio control spots shows that the common correlation method results in substantially better discrimination of differentially expressed genes from those which are not. The spike-in experiment also confirms that the results may be further improved by empirical Bayes smoothing of the variances when the sample size is small. Availability. The methodology is implemented in the limma software package for R, available from the CRAN repository
Normalization of cDNA microarray data
 Methods
, 2003
Cited by 84 (2 self)
Normalization means to adjust microarray data for effects which arise from variation in the technology rather than from biological differences between the RNA samples or between the printed probes. This article describes normalization methods based on the fact that dye balance typically varies with spot intensity and with spatial position on the array. Print-tip loess normalization provides a well-tested general-purpose normalization method which has given good results on a wide range of arrays. The method may be refined by using quality weights for individual spots. The method is best combined with diagnostic plots of the data which display the spatial and intensity trends. When diagnostic plots show that biases still remain in the data after normalization, further normalization steps such as plate-order normalization or scale normalization between the arrays may be undertaken. Composite normalization may be used when control spots are available which are known to be not differentially expressed. Variations on loess normalization include global loess normalization and 2D normalization. Detailed commands are given to implement the normalization techniques using freely available software.
Statistical Issues in cDNA Microarray Data Analysis
, 2003
Cited by 56 (3 self)
This article summarizes some of the issues involved and provides a brief review of the analysis tools which are available to researchers to deal with them. Any microarray experiment involves a number of distinct stages. Firstly there is the design of the experiment. The researchers must decide which genes are to be printed on the arrays, which sources of RNA are to be hybridized to the arrays and on how many arrays the hybridizations will be replicated. Secondly, after hybridization, there follows a number of data-cleaning steps or `low-level analysis' of the microarray data. The microarray images must be processed to acquire red and green foreground and background intensities for each spot. The acquired red/green ratios must be normalized to adjust for dye-bias and for any systematic variation other than that due to the differences between the RNA samples being studied. Thirdly, the normalized ratios are analyzed by various graphical and numerical means to select differentially expressed genes or to find groups of genes whose expression profiles can reliably classify the different RNA sources into meaningful groups. The sections of this article correspond roughly to the various analysis steps. The following notation will be used throughout the article. The foreground red and green intensities will be written $R_f$ and $G_f$ for each spot. The background intensities will be $R_b$ and $G_b$. The background-corrected intensities will be $R$ and $G$, where usually $R = R_f - R_b$ and $G = G_f - G_b$. The log-differential expression ratio will be $M = \log_2(R/G)$ for each spot. Finally, the log-intensity of the spot will be $A = \frac{1}{2}\log_2(RG)$, a measure of the overall brightness of the spot. (The letter $M$ is a mnemonic for minus, as $M = \log_2 R - \log_2 G$, while $A$ is a mnemonic for add, as $A = (\log_2 R + \log_2 G)/2$ ...
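The notation in this abstract can be illustrated with a short Python sketch that computes the log-differential expression ratio (M) and the log-intensity (A) of a single spot from its red and green foreground and background intensities. The function name `ma_values`, the use of base-2 logarithms, and the choice to return `None` when a background-corrected intensity is not positive are assumptions for illustration, not details taken from the article.

```python
import math


def ma_values(red_fg, green_fg, red_bg, green_bg):
    """Return (M, A) for one spot, or None if either corrected intensity is <= 0.

    M is the log-ratio of background-corrected red to green intensity;
    A is the average of their logs (overall spot brightness).
    Base-2 logarithms are an assumption here.
    """
    red = red_fg - red_bg          # background-corrected red intensity
    green = green_fg - green_bg    # background-corrected green intensity
    if red <= 0 or green <= 0:
        return None                # logs undefined; handling is a design choice
    log_red = math.log2(red)
    log_green = math.log2(green)
    m = log_red - log_green            # "minus"
    a = 0.5 * (log_red + log_green)    # "add"
    return m, a


if __name__ == "__main__":
    # Hypothetical spot: corrected red = 1024, corrected green = 256.
    print(ma_values(1100, 356, 76, 100))
```

A spot that is four times brighter in red than in green gives a log-ratio of 2 on this scale, while equal red and green intensities give 0.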
Optimal Sample Size for Multiple Testing: the Case of Gene Expression Microarrays
 Journal of the American Statistical Association
, 2004
Cited by 38 (2 self)
We consider the choice of an optimal sample size for multiple comparison problems. The motivating application is the choice of the number of microarray experiments to be carried out when learning about differential gene expression. However, the approach is valid in any application that involves multiple comparison in a large number of hypothesis tests.
A Bayesian mixture model for differential gene expression
 Journal of the Royal Statistical Society C
, 2005
Cited by 29 (4 self)
We propose model-based inference for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under different conditions. The probability model is essentially a mixture of normals. The resulting inference is similar to the empirical Bayes approach proposed in Efron et al. (2001). The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method. However, the increased generality of our method comes at a price. Computation is not as straightforward as in the empirical Bayes scheme. But we argue that inference is no more difficult than posterior simulation in traditional nonparametric mixture of normal models. We illustrate the proposed method in two examples, including a simulation study and a microarray experiment to screen for genes with differential expression in colon cancer versus normal tissue (Alon et al., 1999).
Bayesian robust inference for differential gene expression in microarrays with multiple samples
 Biometrics
Cited by 25 (5 self)
We consider the problem of identifying differentially expressed genes under different conditions using gene expression microarrays. Because of the many steps involved in the experimental process, from hybridization to image analysis, cDNA microarray data often contain outliers. For example, an outlying data value could occur because of scratches or dust on the surface, imperfections in the glass, or imperfections in the array production. We develop a robust Bayesian hierarchical model for testing for differential expression. Errors are modeled explicitly using a t-distribution, which accounts for outliers. The model includes an exchangeable prior for the variances which allows different variances for the genes but still shrinks extreme empirical variances. Our model can be used for testing for differentially expressed genes among multiple samples, and it can distinguish between the different possible patterns of differential expression when there are three or more samples. Parameter estimation is carried out using a novel version of Markov chain Monte Carlo that is appropriate when the model puts mass on subspaces of the full parameter space. The method is illustrated using two publicly available gene expression data sets. We compare our method to six other baseline and commonly used techniques, namely the t-test, the Bonferroni-adjusted t-test, Significance Analysis of Microarrays (SAM), Efron’s empirical Bayes, and EBarrays in both its Lognormal-Normal and Gamma-Gamma forms. In an experiment with HIV data, our method performed better than these alternatives, on the basis of between-replicate agreement and disagreement.
Estimating the null and the proportion of non-null effects in large-scale multiple comparisons
 J. Amer. Statist. Assoc
, 2007
Cited by 20 (5 self)
An important issue raised by Efron [7] in the context of large-scale multiple comparisons is that in many applications the usual assumption that the null distribution is known is incorrect, and seemingly negligible differences in the null may result in large differences in subsequent studies. This suggests that a careful study of estimation of the null is indispensable. In this paper, we consider the problem of estimating a null normal distribution, and a closely related problem, estimation of the proportion of non-null effects. We develop an approach based on the empirical characteristic function and Fourier analysis. The estimators are shown to be uniformly consistent over a wide class of parameters. Numerical performance of the estimators is investigated using both simulated and real data. In particular, we apply our
VarMixt: efficient variance modelling for the differential analysis of replicated gene expression data
 Bioinformatics
, 2005