Summaries of Affymetrix GeneChip probe level data
Nucleic Acids Res, 2003
Cited by 471 (21 self)
High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11-20 pairs of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spike-in studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version of the default expression measure provided by Affymetrix Microarray Suite can be significantly improved by the use of probe level summaries derived from empirically motivated statistical models. In particular, improvements in the ability to detect differentially expressed genes are demonstrated.
Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
ATHEY B, JONES EG, BUNNEY WE, MYERS RM, SPEED TP, AKIL H, WATSON SJ, MENG, 2005
S.H.: A genome-wide association study of global gene expression.
Nature Genetics, 2007
Cited by 242 (11 self)
We have created a global map of the effects of polymorphism on gene expression in 400 children from families recruited through a proband with asthma. We genotyped 408,273 SNPs and identified expression quantitative trait loci from measurements of 54,675 transcripts representing 20,599 genes in Epstein-Barr virus-transformed lymphoblastoid cell lines. We found that 15,084 transcripts (28%) representing 6,660 genes had narrow-sense heritabilities (H²) > 0.3. We executed genome-wide association scans for these traits and found peak lod scores between 3.68 and 59.1. The most highly heritable traits were markedly enriched in Gene Ontology descriptors for response to unfolded protein (chaperonins and heat shock proteins), regulation of progression through the cell cycle, RNA processing, DNA repair, immune responses and apoptosis. SNPs that regulate expression of these genes are candidates in the study of degenerative diseases, malignancy, infection and inflammation. We have created a downloadable database to facilitate use of our findings in the mapping of complex disease loci.
A Model Based Background Adjustment for Oligonucleotide Expression Arrays.
Journal of the American Statistical Association, 2004
A Benchmark for Affymetrix GeneChip Expression Measures
Bioinformatics, 2003
Cited by 141 (10 self)
Motivation: The defining feature of oligonucleotide expression arrays is the use of several probes to assay each targeted transcript. This is a bonanza for the statistical geneticist, who can create probeset summaries with specific characteristics. There are now several methods available for summarizing probe level data from the popular Affymetrix GeneChips, but it is difficult to identify the best method for a given inquiry. Results: We have developed a graphical tool to evaluate summaries of Affymetrix probe level data. Plots and summary statistics offer a picture of how an expression measure performs in several important areas. This picture facilitates the comparison of competing expression measures and the selection of methods suitable for a specific investigation. The key is a benchmark data set consisting of a dilution study and a spike-in study. Because the truth is known for these data, we can identify statistical features of the data for which the expected outcome is known in advance. Those features highlighted in our suite of graphs are justified by questions of biological interest and motivated by the presence of appropriate data. Availability: In conjunction with the release of a graphics toolbox as part of the Bioconductor project
Adaptive lasso for sparse high-dimensional regression models
Statistica Sinica, 2008
Cited by 98 (11 self)
Abstract: We study the asymptotic properties of the adaptive Lasso estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase with the sample size. We consider variable selection using the adaptive Lasso, where the L1 norms in the penalty are reweighted by data-dependent weights. We show that, if a reasonable initial estimator is available, under appropriate conditions, the adaptive Lasso correctly selects covariates with nonzero coefficients with probability converging to one, and that the estimators of nonzero coefficients have the same asymptotic distribution they would have if the zero coefficients were known in advance. Thus, the adaptive Lasso has an oracle property in the sense of Fan and Li
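The reweighting idea in this abstract can be illustrated with a small sketch. It is not the authors' code. It assumes the objective (1/(2n))·||y − Xb||² + lam·Σ w_j·|b_j| with weights w_j = 1/|b_init_j|^gamma, and it assumes marginal (one-covariate-at-a-time) regression as the initial estimator. The function names, the scaling of the loss, and the toy orthogonal design are all choices made here for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=100):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0):
    """Two-stage fit: initial estimate, then an L1 penalty reweighted by 1/|init|^gamma."""
    n, p = len(X), len(X[0])
    # Assumed initial estimator: marginal regression of y on each column separately.
    init = [
        sum(X[i][j] * y[i] for i in range(n)) / sum(X[i][j] ** 2 for i in range(n))
        for j in range(p)
    ]
    # A tiny constant avoids division by zero; a zero initial estimate gets a huge penalty.
    weights = [1.0 / (abs(b) ** gamma + 1e-12) for b in init]
    return weighted_lasso(X, y, weights, lam)


# Toy orthogonal design: y = 3*x1 + 0.1*x2 + 0*x3.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [3.1, -2.9, 2.9, -3.1]
plain = weighted_lasso(X, y, [1.0, 1.0, 1.0], lam=0.05)
adaptive = adaptive_lasso(X, y, lam=0.05)
print("plain lasso:   ", plain)
print("adaptive lasso:", adaptive)
```

On this toy design the plain Lasso shrinks every coefficient by the same amount, while the reweighted penalty shrinks the large coefficient less and sets the small one to zero. Real data would need standardized covariates and a tuned `lam`, and the sketch says nothing about the asymptotic results stated in the abstract.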
A novel signaling pathway impact analysis
Bioinformatics, 2009
Cited by 85 (1 self)
Motivation: Gene expression class comparison studies may identify hundreds or thousands of genes as differentially expressed (DE) between sample groups. Gaining biological insight from the result of such experiments can be approached, for instance, by identifying the signaling pathways impacted by the observed changes. Most of the existing pathway analysis methods focus on either the number of DE genes observed in a given pathway (enrichment analysis methods), or on the correlation between the pathway genes and the class of the samples (functional class scoring methods). Both approaches treat the pathways as simple sets of genes, disregarding the complex gene interactions that these pathways are built to describe. Results: We describe a novel Signaling Pathway Impact Analysis (SPIA) that combines the evidence obtained from the classical enrichment analysis with a novel type of evidence, which measures the actual perturbation on a given pathway under a given condition. A bootstrap procedure is used to assess the significance of the observed total pathway perturbation. Using simulations we show that the evidence derived from perturbations is independent of the pathway enrichment evidence. This allows us to calculate a global pathway significance p-value, which combines the enrichment and perturbation p-values. We illustrate the capabilities of the novel method on 4 real data sets. The results obtained on these data show that SPIA has better specificity and more sensitivity than several widely used pathway analysis methods. Availability: SPIA was implemented as an R package which is available at
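The abstract does not say how the two p-values are combined. One standard option for two independent p-values is to take their product c and report the probability that the product of two independent uniform variables is at most c, which is c − c·ln(c). The sketch below assumes that rule; the function and argument names are hypothetical and it is not taken from the SPIA package.

```python
import math


def combine_two_pvalues(p_enrich, p_perturb):
    """Combine two independent p-values through their product.

    If U1 and U2 are independent Uniform(0, 1), then
    P(U1 * U2 <= c) = c - c * ln(c) for 0 < c <= 1.
    """
    for p in (p_enrich, p_perturb):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    c = p_enrich * p_perturb
    if c == 0.0:
        return 0.0  # limit of c - c*ln(c) as c -> 0
    return c - c * math.log(c)


# Two moderately small p-values give a smaller combined value than either alone.
print(combine_two_pvalues(0.05, 0.05))
```

The rule is only valid when the two sources of evidence are independent, which is the property the abstract reports checking by simulation.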
Stochastic Models Inspired by Hybridization Theory for Short Oligonucleotide Arrays (Extended Abstract)
J. Comput. Biol, 2004
Cited by 80 (4 self)
Zhijin Wu (zwu@jhsph.edu) and Rafael A. Irizarry (rafa@jhu.edu), Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street.
Abstract: High density oligonucleotide expression arrays are a widely used tool for the measurement of gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Due to optical noise, nonspecific hybridization, probe-specific effects, and measurement error, ad hoc measures of expression that summarize probe intensities can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory have been proposed as useful tools for prediction of, for example, nonspecific hybridization. These physical models show great potential in terms of improving existing expression measures. In this paper we suggest that the system producing the measured intensities is too complex to be fully described with these relatively simple physical models, and we propose empirically motivated stochastic models that complement the above-mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful for data analysts.
Androgen receptor regulates a distinct transcription program in androgen-independent prostate cancer
Cell 138, 2009
Cited by 66 (9 self)
The evolution of prostate cancer from an androgen-dependent state to one that is androgen-independent marks its lethal progression. The androgen receptor (AR) is essential in both, though its function
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS
2008
Cited by 65 (1 self)
Summary. We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. Following model selection, oracle-efficient, asymptotically normal estimators of the nonzero components can be obtained by using existing methods. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method.
Key words and phrases. Adaptive group Lasso; component selection; high-dimensional data; nonparametric regression; selection consistency.
Short title. Nonparametric component selection.
AMS 2000 subject classification. Primary 62G08, 62G20; secondary 62G99.