High dimensional graphs and variable selection with the Lasso
Annals of Statistics, 2006
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso
Cited by 736 (22 self)
Regularization and variable selection via the Elastic Net.
J. R. Stat. Soc. Ser. B, 2005
We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect
Cited by 973 (11 self)
Ideal spatial adaptation by wavelet shrinkage
Biometrika, 1994
With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle offers dramatic advantages over traditional linear estimation by nonadaptive kernels
Cited by 1269 (5 self)
advantages over traditional linear estimation by nonadaptive kernels; however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatially-adaptive estimation: selective wavelet reconstruction. We show that variable-knot
Least angle regression
2004
The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select
Cited by 1326 (37 self)
The Dantzig selector: statistical estimation when p is much larger than n
2005
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ p
Cited by 879 (14 self)
Variable selection in semiparametric linear regression with censored data
Journal of the Royal Statistical Society, Series B, 2008
We describe two procedures for selecting variables in the semiparametric linear regression model for censored data. One procedure penalizes a vector of estimating equations and simultaneously estimates regression coefficients and selects submodels. A second procedure controls systematically the proportion of unimportant variables
Cited by 10 (2 self)
systematically the proportion of unimportant variables through forward selection and the addition of pseudo random variables. We explore both rank-based statistics and Buckley-James statistics in the proposed setting and evaluate the performance of all methods through extensive simulation studies and one real
Using mutual information for selecting features in supervised neural net learning
IEEE Transactions on Neural Networks, 1994
This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is
Cited by 358 (1 self)
A Fast Algorithm for the Minimum Covariance Determinant Estimator
Technometrics, 1998
The minimum covariance determinant (MCD) method of Rousseeuw (1984) is a highly robust estimator of multivariate location and scatter. Its objective is to find h observations (out of n) whose covariance matrix has the lowest determinant.
Cited by 346 (15 self)
variables. To deal with such problems we have developed a new algorithm for the MCD, called FAST-MCD. The basic ideas are an inequality involving order statistics and determinants, and techniques which we call "selective iteration" and "nested extensions". For small data sets FAST-MCD typically
Discriminative Reranking for Natural Language Parsing
2005
This article considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking
Cited by 333 (9 self)
Generalized econometric models with selectivity
Econometrica, 1983
During the recent years, there is substantial interest in the econometric models with qualitative and censored dependent variables. The important contributions on these topics by Amemiya [1973], McFadden [1973] and Heckman [1974] among others stimulate the recent
Cited by 270 (0 self)
