Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation
, 2002
Cited by 200 (3 self)
There are many sources of systematic variation in cDNA microarray experiments which affect the measured gene expression levels (e.g. differences in labeling efficiency between the two fluorescent dyes). The term normalization refers to the process of removing such variation. A constant adjustment is often used to force the distribution of the intensity log ratios to have a median of zero for each slide. However, such global normalization approaches are not adequate in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. This article proposes normalization methods that are based on robust local regression and account for intensity and spatial dependence in dye biases for different types of cDNA microarray experiments. The selection of appropriate controls for normalization is discussed and a novel set of controls (microarray sample pool, MSP) is introduced to aid in intensity-dependent normalization. Lastly, to allow for comparisons of expression levels across slides, a robust method based on maximum likelihood estimation is proposed to adjust for scale differences among slides.
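The constant adjustment mentioned in this abstract (forcing the log ratios of each slide to have a median of zero) can be illustrated with a minimal sketch. The function names and the spot intensities below are hypothetical and are not taken from the paper; the intensity- and location-dependent methods the paper proposes are not reproduced here.

```python
import math
from statistics import median


def log_ratios(red, green):
    """Return M = log2(R/G) for paired red/green spot intensities."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def global_median_normalize(m_values):
    """Shift the log ratios of one slide so that their median is zero."""
    shift = median(m_values)
    return [m - shift for m in m_values]


# Hypothetical intensities for five spots on one slide (illustrative only).
red = [1200.0, 800.0, 450.0, 3000.0, 150.0]
green = [600.0, 400.0, 450.0, 1000.0, 100.0]

m = log_ratios(red, green)
m_norm = global_median_normalize(m)
```

Because the same shift is applied to every spot, this adjustment cannot remove a dye bias that varies with spot intensity or position on the array, which is the limitation the abstract points out.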
Assessing Gene Significance from cDNA Microarray Expression Data via Mixed Models
, 2001
Cited by 69 (3 self)
The determination of a list of differentially expressed genes is a basic objective in many cDNA microarray experiments. We present a statistical approach that allows direct control over the percentage of false positives in such a list and, under certain reasonable assumptions, improves on existing methods with respect to the percentage of false negatives. The method accommodates a wide variety of experimental designs and can simultaneously assess significant differences between multiple types of biological samples. Two interconnected mixed linear models are central to the method and provide a flexible means to properly account for variability both across and within genes. The mixed model also provides a convenient framework for evaluating the statistical power of any particular experimental design and thus enables a researcher to a priori select an appropriate number of replicates. We also suggest some basic graphics for visualizing lists of significant genes. Analyses of published experiments studying human cancer and yeast cells illustrate the results.
Learning Large-Scale Graphical Gaussian Models from Genomic Data
- In Science of Complex Networks: From Biology to the Internet and WWW
, 2005
Cited by 4 (0 self)
The inference and modeling of network-like structures in genomic data is of prime importance in systems biology. Complex stochastic associations and interdependencies can very generally be described as a graphical model. However, the paucity of available samples in current high-throughput experiments renders learning graphical models from genome data, such as microarray expression profiles, a challenging and very hard problem. Here we review several recently developed approaches to small-sample inference of graphical Gaussian modeling and discuss strategies to cope with the high dimensionality of functional genomics data. Particular emphasis is put on regularization methods and an empirical Bayes network inference procedure.
Data Compression and Its Statistical Implications, with an Application to the Analysis of Microarray Images
, 2001
Cited by 3 (2 self)
By Rebecka Jenny Jornsten. Doctor of Philosophy in Statistics, University of California, Berkeley. Professor Bin Yu, Chair.
This thesis consists of three parts. Even though each part is self-contained, a common theme runs through all of them: data compression and its implications for statistical inference. In particular, we consider the following three questions. How can we quantify the effect of compression on statistical inference? How should a compression scheme be designed such that the effect of compression on inference is minimal? How can the Minimum Description Length (MDL) principle be used for model selection with an extraordinary number of dependent predictors? In this thesis, we attempt to answer these three questions in a general setting, and with a specific application in the compression and analysis of microarray images.
Genes by a Distribution-Free Shrinkage Approach
High-dimensional case-control analysis is encountered in many different settings in genomics. In order to rank genes accordingly, many different scores have been proposed, ranging from ad hoc modifications of the ordinary t statistic to complicated hierarchical Bayesian models. Here, we introduce the "shrinkage t" statistic that is based on a novel and model-free shrinkage estimate of the variance vector across genes. This is derived in a quasi-empirical Bayes setting. The new rank score is fully automatic and requires no specification of parameters or distributions. It is computationally inexpensive and can be written analytically in closed form. Using a series of synthetic and three real expression data sets, we studied the quality of gene rankings produced by the "shrinkage t" statistic. The new score consistently leads to highly accurate rankings for the complete range of investigated data sets and all considered scenarios for across-gene variance structures.
KEYWORDS: high-dimensional case-control data, James-Stein shrinkage, limited-translation, quasi-empirical Bayes, regularized t statistic, variance shrinkage
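The idea of a variance-regularized t score can be sketched as follows. This is only an illustration under stated assumptions: each gene-wise variance is pulled toward the median variance across genes, and the shrinkage intensity `lam` is a hypothetical user-supplied parameter, whereas the abstract describes a score that needs no such parameter because the intensity is obtained analytically. The function names and the use of an unequal-variance standard error are also assumptions, not details confirmed by the abstract.

```python
import math
from statistics import mean, median, variance


def shrink_variances(variances, lam):
    """Pull each gene-wise variance toward the median variance.

    lam is a hypothetical, user-supplied shrinkage intensity in [0, 1];
    the method in the abstract estimates it from the data instead.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    target = median(variances)
    return [lam * target + (1.0 - lam) * v for v in variances]


def regularized_t(cases, controls, lam):
    """Return one t-like score per gene using shrunken variances.

    cases and controls are lists of per-gene replicate lists.
    """
    v_case = shrink_variances([variance(g) for g in cases], lam)
    v_ctrl = shrink_variances([variance(g) for g in controls], lam)
    scores = []
    for g1, g2, v1, v2 in zip(cases, controls, v_case, v_ctrl):
        se = math.sqrt(v1 / len(g1) + v2 / len(g2))
        scores.append((mean(g1) - mean(g2)) / se)
    return scores


# Hypothetical expression values: three genes, three replicates per group.
cases = [[5.0, 5.2, 4.8], [2.0, 2.1, 1.9], [7.0, 9.0, 8.0]]
controls = [[3.0, 3.1, 2.9], [2.0, 2.4, 1.6], [8.0, 8.5, 7.5]]

scores = regularized_t(cases, controls, lam=0.5)
```

With `lam=0` the score reduces to an ordinary unequal-variance t statistic, and with `lam=1` every gene shares the median variance, so the ranking is driven by the mean difference alone.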
Fuzzy Pattern Identification with Applications to the Microarray Data
In this paper, we investigate a relatively new area of statistical decision problems, namely feature extraction and pattern detection for multiple spatial time series. We work on a statistically motivated approach for data-based generation of interpretable rule bases on spatial (space-time) series and give some general and practical applications. The membership of each data point with respect to the cluster centers is calculated and used as a performance index for grouping. The data set is discussed in detail, along with the fuzzy statistical methods for segmentation and background correction. We present the results of the research and compare genes identified using replicated slides to those identified by single-slide methods. An integrated classification/identification procedure for microarray data is proposed. Finally, we discuss our findings and outline open questions.
Exploratory differential gene expression analysis in microarray experiments with no or limited replication (Loguinov et al., 2004, Volume 5, Issue 3, Article R18; Open Access; Method)
, 2004
The electronic version of this article is the complete one and can be found online at

