Results 1–10 of 153
Group Lasso with Overlap and Graph Lasso
Cited by 215 (18 self)
Abstract: We propose a new penalty function which, when used as regularization for empirical risk minimization procedures, leads to sparse estimators. The support of the sparse vector is typically a union of potentially overlapping groups of covariates defined a priori, or a set of covariates which tend to be connected to each other when a graph of covariates is given. We study theoretical properties of the estimator and illustrate its behavior on simulated and breast cancer gene expression data.
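As context for the penalty above, here is a minimal NumPy sketch of the plain ℓ1/ℓ2 penalty summed over (possibly overlapping) groups. Note that the paper's actual penalty is defined through an infimum over latent decompositions of the vector; this sketch shows only the basic overlapping-group building block, and the function name and group encoding are illustrative, not the paper's notation.

```python
import numpy as np

def overlapping_group_penalty(w, groups):
    """Sum of l2 norms of w restricted to each group; groups may overlap.

    `groups` is a list of index lists.  Illustrative sketch only: the
    latent group lasso of the paper additionally minimizes over ways of
    splitting w across overlapping groups.
    """
    return sum(np.linalg.norm(w[np.asarray(g)]) for g in groups)

w = np.array([1.0, 2.0, 0.0, 3.0])
groups = [[0, 1], [1, 2, 3]]          # coordinate 1 belongs to both groups
pen = overlapping_group_penalty(w, groups)   # sqrt(5) + sqrt(13)
```

Because coordinate 1 appears in both groups, shrinking it to zero affects both norm terms, which is what couples the groups and shapes the support as a union of groups.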
Prediction by supervised principal components
Journal of the American Statistical Association, 2006
Cited by 81 (9 self)
Abstract: In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems, such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer. Key words: Gene expression; microarray; regression; survival analysis.
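The two-stage procedure the abstract describes (screen predictors by their association with the outcome, then take principal components of the surviving subset) can be sketched in a few lines of NumPy. This is an illustrative reimplementation under simplifying assumptions (absolute correlation as the screening score, first component only), not the authors' code; the function name and parameters are invented for the example.

```python
import numpy as np

def supervised_pc(X, y, k=10):
    """Toy supervised principal components: keep the k columns of X most
    correlated (in absolute value) with y, then return the first
    principal component of that subset.  Illustrative sketch only."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # univariate association score for each predictor: |corr(x_j, y)|
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0)
                                  * np.linalg.norm(yc) + 1e-12)
    keep = np.argsort(scores)[-k:]
    # first principal component of the screened predictors, via SVD
    U, S, Vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    return keep, U[:, 0] * S[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 100))                    # p = 100 >> n = 50
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=50)
keep, pc1 = supervised_pc(X, y, k=10)
```

The returned component would then be used as the predictor in an ordinary (or Cox) regression on the outcome.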
FoxA1 translates epigenetic signatures into enhancer-driven lineage-specific transcription
Cell 132: 958–970, 2008
Cited by 80 (12 self)
Abstract: Complex organisms require tissue-specific transcriptional programs, yet little is known about how these are established. The transcription factor FoxA1 is thought to contribute to gene regulation through its ability to act as a pioneer factor binding to nucleosomal DNA. Through genome-wide positional analyses, we demonstrate that FoxA1 cell type-specific functions rely primarily on differential recruitment to chromatin, predominantly at distant enhancers rather than proximal promoters. This differential recruitment leads to cell type-specific changes in chromatin structure and functional collaboration with lineage-specific transcription factors. Despite the ability of FoxA1 to bind nucleosomes, its differential binding to chromatin sites is dependent on the distribution of histone H3 lysine 4 dimethylation. Together, our results suggest that methylation of histone H3 lysine 4 is part of the epigenetic signature that defines lineage-specific FoxA1 recruitment sites in chromatin. FoxA1 translates this epigenetic signature into changes in chromatin structure, thereby establishing lineage-specific transcriptional enhancers and programs.
Partial Correlation Estimation by Joint Sparse Regression Models
JASA, 2008
Cited by 77 (8 self)
Abstract: In this article, we propose a computationally efficient approach, space (Sparse PArtial Correlation Estimation), for selecting nonzero partial correlations under the high-dimension-low-sample-size setting. This method assumes overall sparsity of the partial correlation matrix and employs sparse regression techniques for model fitting. We illustrate the performance of space by extensive simulation studies. It is shown that space performs well in both nonzero partial correlation selection and the identification of hub variables, and also outperforms two existing methods. We then apply space to a microarray breast cancer dataset and identify a set of hub genes that may provide important insights on genetic regulatory networks. Finally, we prove that, under a set of suitable assumptions, the proposed procedure is asymptotically consistent in terms of model selection and parameter estimation.
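As background for the quantity space estimates: partial correlations are determined by the concentration (inverse covariance) matrix through the identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). The sketch below computes this identity directly from a known covariance matrix; it is the textbook relation, not the space estimator itself, which instead fits joint sparse regressions to estimate the nonzero rho_ij from data.

```python
import numpy as np

def partial_correlations(sigma):
    """Partial correlation matrix from a covariance matrix sigma:
    rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), omega = inv(sigma).
    Background identity only; not the space estimator."""
    omega = np.linalg.inv(sigma)
    d = np.sqrt(np.diag(omega))
    rho = -omega / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

# A tridiagonal concentration matrix: variables 0 and 2 are conditionally
# independent given variable 1, so rho[0, 2] should be zero.
omega_true = np.array([[2.0, 1.0, 0.0],
                       [1.0, 2.0, 1.0],
                       [0.0, 1.0, 2.0]])
rho = partial_correlations(np.linalg.inv(omega_true))
```

Zeros in the partial correlation matrix thus correspond to missing edges in the concentration graph, which is exactly the sparsity pattern space selects.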
Smoothing Proximal Gradient Method for General Structured Sparse Learning
Cited by 50 (7 self)
Abstract: We study the problem of learning high-dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either the input or the output side. We consider two widely adopted types of such penalties as our motivating examples: (1) the overlapping-group-lasso penalty, based on the ℓ1/ℓ2 mixed norm, and (2) the graph-guided fusion penalty. For both types of penalties, developing an efficient optimization method has remained a challenging problem due to their non-separability. In this paper, we propose a general optimization approach, called the smoothing proximal gradient method, which can solve structured sparse regression problems with a smooth convex loss and a wide spectrum of structured-sparsity-inducing penalties. Our approach is based on a general smoothing technique of Nesterov [17]. It achieves a convergence rate faster than the standard first-order approach, the subgradient method, and is much more scalable than the most widely used interior-point method. Numerical results are reported to demonstrate the efficiency and scalability of the proposed method.
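The core idea of Nesterov smoothing can be illustrated on the simplest non-smooth penalty, the ℓ1 norm, written in its dual form ||x||_1 = max over |u_i| <= 1 of u.x; subtracting (mu/2)||u||^2 from the inner maximization yields a smooth surrogate (the Huber function) with an explicit gradient. This is a generic one-penalty sketch of the smoothing step, under the assumption that the same dual-maximization recipe is what the paper applies to its structured penalties; it is not the paper's algorithm.

```python
import numpy as np

def smoothed_l1(x, mu):
    """Nesterov smoothing of ||x||_1 via the dual form
    max_{|u_i|<=1} <u, x> - (mu/2)||u||^2.
    Returns (value, gradient) of the smooth surrogate.
    The optimal dual variable u is the clipped x/mu, and it is also the
    gradient of the surrogate.  Illustrative sketch only."""
    u = np.clip(x / mu, -1.0, 1.0)
    val = float(u @ x - 0.5 * mu * (u @ u))
    return val, u

val, grad = smoothed_l1(np.array([3.0, 0.1]), mu=0.2)
# large coordinate: contributes |x| - mu/2; small coordinate: x^2/(2*mu)
```

With the gradient of the surrogate available, an accelerated proximal gradient scheme can be run on the smoothed objective, which is what gives the faster-than-subgradient rate claimed in the abstract.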
Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso
Journal of Machine Learning Research, 2012
Cited by 42 (6 self)
Abstract: We consider the sparse inverse covariance regularization problem, or graphical lasso, with regularization parameter λ. Suppose the sample covariance graph formed by thresholding the entries of the sample covariance matrix at λ is decomposed into connected components. We show that the vertex partition induced by the connected components of the thresholded sample covariance graph (at λ) is exactly equal to that induced by the connected components of the estimated concentration graph obtained by solving the graphical lasso problem for the same λ. This characterizes a very interesting property of a path of graphical lasso solutions. Furthermore, this simple rule, when used as a wrapper around existing algorithms for the graphical lasso, leads to enormous performance gains. For a range of values of λ, our proposal splits a large graphical lasso problem into smaller tractable problems, making it possible to solve an otherwise infeasible large-scale problem. We illustrate the graceful scalability of our proposal via synthetic and real-life microarray examples.
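The screening rule above is simple to state in code: connect i and j whenever |S_ij| exceeds λ, take connected components, and solve one graphical lasso per component. The sketch below computes the components with a plain union-find; it implements only the thresholding/partition step described in the abstract, not the graphical lasso solver itself, and the function name is invented for the example.

```python
import numpy as np

def thresholded_components(S, lam):
    """Connected components of the graph with an edge (i, j) whenever
    |S_ij| > lam, i != j.  Per the paper, these components coincide with
    those of the graphical lasso estimate at the same lambda, so each
    block can be solved independently.  Union-find sketch, not the
    authors' code."""
    p = S.shape[0]
    parent = list(range(p))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if abs(S[i, j]) > lam:
                parent[find(i)] = find(j)

    comps = {}
    for i in range(p):
        comps.setdefault(find(i), []).append(i)
    return list(comps.values())

S = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
comps = thresholded_components(S, 0.2)   # {0, 1} connected; {2} isolated
```

For large p one would typically use a sparse-graph components routine instead, but the partition returned is the same.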
Microarray standard data set and figures of merit for comparing data processing methods and experiment designs
2003
The gene expression response of breast cancer to growth regulators: patterns and correlation with tumor expression profiles
Cancer Res, 2003
Joint sampling distribution between actual and estimated classification errors for linear discriminant analysis
IEEE Trans. Inf. Theory, 2010
Cited by 17 (9 self)
Abstract: Error estimation must be used to find the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and proteomics. This paper presents, for what is believed to be the first time, the analytical formulation for the joint sampling distribution of the actual and estimated errors of a classification rule. The analysis presented here concerns the linear discriminant analysis (LDA) classification rule and the resubstitution and leave-one-out error estimators, under a general parametric Gaussian assumption. Exact results are provided in the univariate case, and a simple method is suggested to obtain an accurate approximation in the multivariate case. It is also shown how these results can be applied in the computation of conditional bounds and the regression of the actual error given the observed error estimate. In contrast to asymptotic results, the analysis presented here is applicable to finite training data. In particular, it applies in the small-sample settings commonly found in genomics and proteomics applications. Numerical examples, which include parameters estimated from actual microarray data, illustrate the analysis throughout. Index terms: classification, cross-validation, error estimation, leave-one-out, linear discriminant analysis, resubstitution, sampling distribution.
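The two error estimators the paper studies can be made concrete with a toy univariate classifier. The sketch below uses a nearest-class-mean rule (a simplified stand-in for univariate LDA with equal priors and variances) and computes its resubstitution error (train and test on the same sample) and leave-one-out error (refit the held-out point's class mean without it). It illustrates only what the two estimators measure, not the paper's joint-distribution analysis; all names are invented for the example.

```python
import numpy as np

def lda_error_estimates(x0, x1):
    """Resubstitution and leave-one-out error estimates for a univariate
    nearest-mean classifier.  Toy sketch of the two estimators analyzed
    in the paper, not the paper's derivation."""
    def label(x, m0, m1):
        return 1 if abs(x - m1) < abs(x - m0) else 0

    n = len(x0) + len(x1)
    m0, m1 = x0.mean(), x1.mean()
    # resubstitution: classify each training point with the full-sample means
    resub = (sum(label(x, m0, m1) != 0 for x in x0)
             + sum(label(x, m0, m1) != 1 for x in x1)) / n
    # leave-one-out: recompute the held-out point's class mean without it
    mistakes = 0
    for i in range(len(x0)):
        mistakes += label(x0[i], np.delete(x0, i).mean(), m1) != 0
    for i in range(len(x1)):
        mistakes += label(x1[i], m0, np.delete(x1, i).mean()) != 1
    return resub, mistakes / n

rng = np.random.default_rng(1)
resub, loo = lda_error_estimates(rng.normal(0.0, 1.0, 20),
                                 rng.normal(2.0, 1.0, 20))
```

In small samples these two numbers can differ noticeably for the same classifier (resubstitution is optimistically biased), which is exactly the joint behavior the paper characterizes analytically.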