## Semilinear high-dimensional model for normalization of microarray data: a theoretical analysis and partial consistency (2005)

Venue: J. Amer. Statist. Assoc.

Citations: 18 (5 self)

### BibTeX

@ARTICLE{Fan05semilinearhigh-dimensional,

author = {Jianqing Fan and Heng Peng and Tao Huang},

title = {Semilinear high-dimensional model for normalization of microarray data: a theoretical analysis and partial consistency},

journal = {J. Amer. Statist. Assoc},

year = {2005},

volume = {100},

pages = {781--813}

}

### Abstract

Normalization of microarray data is essential for removing experimental biases and revealing meaningful biological results. Motivated by a problem of normalizing microarray data, a semilinear in-slide model (SLIM) has been proposed. To aggregate information from other arrays, SLIM is generalized to account for across-array information, resulting in an even more dynamic semiparametric regression model. This model can be used to normalize microarray data even when there is no replication within an array. We demonstrate that this semiparametric model has a number of interesting features. The parametric component and the nonparametric component that are of primary interest can be consistently estimated, the former having a parametric rate and the latter having a nonparametric rate, whereas the nuisance parameters cannot be consistently estimated. This is an interesting extension of the partial consistency phenomenon, which is itself of theoretical interest. The asymptotic normality for the parametric component and the rate of convergence for the nonparametric component are established. The results are augmented by simulation studies and illustrated by an application to the cDNA microarray analysis of neuroblastoma cells in response to the macrophage migration inhibitory factor.
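
The partial consistency described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or their exact model. It assumes a simplified toy model, Y_gj = alpha_g + m(X_gj) + eps_gj, with G genes and a fixed number J of replicates per gene. The cubic basis without intercept, the within-gene centering step, and the names `basis`, `solve3`, and `simulate_and_fit` are all assumptions made for illustration.

```python
import math
import random


def basis(x):
    # Hypothetical cubic basis with no intercept, so that m(0) = 0.
    return (x, x * x, x ** 3)


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= f * M[c][k]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = (M[r][3] - tail) / M[r][r]
    return x


def simulate_and_fit(G, J=2, sigma=0.5, seed=0):
    """Return (RMS error of the fitted curve, MSE of the fitted gene effects)."""
    rng = random.Random(seed)
    theta = (2.0, -3.0, 1.0)  # assumed true curve m(x) = 2x - 3x^2 + x^3

    def m(x):
        return sum(t * v for t, v in zip(theta, basis(x)))

    alpha, X, Y = [], [], []
    for _ in range(G):
        a = rng.gauss(0.0, 1.0)  # gene effect (nuisance parameter)
        xs = [rng.random() for _ in range(J)]
        ys = [a + m(x) + rng.gauss(0.0, sigma) for x in xs]
        alpha.append(a)
        X.append(xs)
        Y.append(ys)

    # Within-gene centering removes alpha_g; accumulate the normal equations.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for xs, ys in zip(X, Y):
        rows = [basis(x) for x in xs]
        bbar = [sum(r[k] for r in rows) / J for k in range(3)]
        ybar = sum(ys) / J
        for r, y in zip(rows, ys):
            d = [r[k] - bbar[k] for k in range(3)]
            for i in range(3):
                b[i] += d[i] * (y - ybar)
                for j in range(3):
                    A[i][j] += d[i] * d[j]
    theta_hat = solve3(A, b)

    def m_hat(x):
        return sum(t * v for t, v in zip(theta_hat, basis(x)))

    # Error of the fitted curve on a grid, and of the plug-in gene effects.
    grid = [i / 100 for i in range(101)]
    m_err = math.sqrt(sum((m_hat(x) - m(x)) ** 2 for x in grid) / len(grid))
    alpha_hat = [
        sum(y - m_hat(x) for x, y in zip(xs, ys)) / J for xs, ys in zip(X, Y)
    ]
    alpha_mse = sum((ah - a) ** 2 for ah, a in zip(alpha_hat, alpha)) / G
    return m_err, alpha_mse


if __name__ == "__main__":
    for G in (200, 20000):
        print(G, simulate_and_fit(G))
```

Under these assumptions, the curve error should shrink as G grows, while the mean squared error of the gene effects stays near sigma^2 / J because each alpha_g is informed by only J observations. This mirrors, in a toy setting, the contrast between consistently estimable components and nuisance parameters that cannot be consistently estimated.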

### Citations

1142 | Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001, 98:5116–5121
- Tusher, Tibshirani, et al.
Citation Context ...llowed, namely, $Y^*_{gj} = \alpha_{gj} + \varepsilon_{gj}$ (10), where $\{Y^*_{gj}\}$ are the normalized log-ratios and $\alpha_{gj}$ is the treatment effect on gene $g$, which may depend on the subject $j$. The balanced permutation technique (Tusher et al. 2001; Fan et al. 2004) can be used to empirically determine the distribution of a test statistic $V_g$ (including the one outlined by Kosorok and Ma) and estimate its associated false discovery rate. The met... |

361 | Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection
- Li, Wong
- 2001
Citation Context ...ey, and Tusher 2001; Sabatti, Karsten, and Geschwind 2002). Often the statistical methodologies used to take advantage of the large number of genes fall under the umbrella of empirical Bayes methods (Li and Wong 2001; Newton, Kendziorski, Richmond, Blattner, and Tsui 2001; Efron, Tibshirani, Storey, and Tusher 2001; and many others). The results of Fan, Peng, and Huang are in terms of partial consistency in prese... |

161 | Multiple Hypothesis Testing in Microarray Experiments - Dudoit, Shaffer, et al. - 2003 |

160 | The positive false discovery rate: A Bayesian interpretation and the q-value - Storey - 2003 |

159 | Large-scale simultaneous hypothesis testing: the choice of a null hypothesis
- Efron
- 2004
Citation Context ...arrett, Irizarry, and Zeger 2003.) They revived a surge interest in multiple testing problems (Dudoit, Shaffer, and Boldrick 2003; Storey 2003; Donoho and Jin 2004; Storey, Taylor, and Siegmund 2004; Efron 2004). They exemplify the interactions between statistics and the sciences, tackling problems of high societal impact. All of the discussants call for more statistical understanding of various procedures ... |

116 | Strong Control, Conservative Point Estimation and Simultaneous Conservative Consistency of False Discovery Rates: A Unified Approach - Storey, Taylor, et al. |

86 | Higher criticism for detecting sparse heterogeneous mixtures
- Donoho, Jin
Citation Context ...soni, Kohane, and Ramoni 2003; Speed 2003; Parmigiani, Garrett, Irizarry, and Zeger 2003.) They revived a surge interest in multiple testing problems (Dudoit, Shaffer, and Boldrick 2003; Storey 2003; Donoho and Jin 2004; Storey, Taylor, and Siegmund 2004; Efron 2004). They exemplify the interactions between statistics and the sciences, tackling problems of high societal impact. All of the discussants call for more s... |

77 | Nonparametric Regression and Spline Smoothing, Second Edition. Statistics: A Series of Textbooks and Monographs
- Eubank
- 1999
Citation Context ...lynomial splines and can be more computationally intensive. It depends on whether they directly invert large matrices and whether they carefully exploit the sparsity of matrices created by B-splines (Eubank 1999). 5. INCORPORATING SIDE INFORMATION Side information can be incorporated into the normalization and analysis of microarray data. Several discussants have touched on several aspects of these. For exam... |

72 | Statistical Analysis of Gene Expression Microarray Data. Boca Raton: Chapman & Hall/CRC
- Speed
- 2003
Citation Context ... numerous articles have been written on this topic, the use of duplicate spots in the normalization process has rarely been discussed. Given the apparent within-slide spatial effects (Smyth, Yang, and Speed 2003; Balázsi, Kay, Barabási, and Oltvai 2003), these spots have the potential to be very helpful not only in the normalization process itself, but also in assessing the effectiveness of a normalization p... |

66 | Class prediction by nearest shrunken centroids, with applications to DNA microarrays - Tibshirani, Hastie, et al. - 2003 |

61 | Linear smoothers and additive models (with discussion) - Hastie, T, et al. - 1989 |

11 | Direct estimation of additive and linear components for high dimensional data. Mimeo 2339 - Fan, Härdle, et al. - 1995 |

9 | Removing intensity effects and identifying significant genes for Affymetrix arrays in MIF-suppressed neuroblastoma cells - Fan, Chen, et al. - 2005 |

9 | The Analysis of Gene Expression Data - Parmigiani, Garrett, et al. - 2003 |

3 | Experimental design for gene expression
- Kerr, Martin, et al.
- 2000
Citation Context ...on $\sum_j \beta_j = 0$ is no longer necessary in (3) unless $z_{ij} = z_i$ as in (2). The covariate vectors $z_i$ in (2) can be used to code various design schemes, such as the loop, reference, and factorial designs (Kerr and Churchill 2001). For example, for the two-sample direct comparison design, $z_i = 1$, $i = 1, \ldots, n$. For an indirect comparison design using a common reference, we can introduce a two-dimensional covariate vector, $z_i$ = ... |

1 | Statistical Approach to - Svrakic, Nesic, et al. - 2003 |

1 | Design Issues for cDNA
- Yang, Speed
- 2002
Citation Context ...last couple of years have brought an explosion of statistical techniques for the design and analysis of microarray data. They range from the design of microarray experiments (Kerr and Churchill 2001; Yang and Speed 2002), normalization of microarray data (Tseng, Oh, Rohlin, Liao, and Wong 2001; Dudoit et al. 2002; Fan, Tam, Vande Woude, and Ren 2004; Huang, Wang, and Zhang 2003), the expression indices of Affymetrix... |