Nonnegative Garrote Component Selection in Functional ANOVA Models
, 2007
Cited by 4 (1 self)
We consider the problem of component selection in a functional ANOVA model. A nonparametric extension of the nonnegative garrote (Breiman, 1996) is proposed. We show that the whole solution path of the proposed method can be efficiently computed, which, in turn, facilitates the selection of the tuning parameter. We also show that the final estimate enjoys nice theoretical properties given that the tuning parameter is appropriately chosen. Simulation and a real data example demonstrate promising performance of the new approach.
Sparse image reconstruction for molecular imaging
- IEEE Trans. Image Process
, 2009
Cited by 3 (1 self)
The application that motivates this paper is molecular imaging at the atomic level. When discretized at subatomic distances, the volume is inherently sparse. Noiseless measurements from an imaging technology can be modeled by convolution of the image with the system point spread function (psf). Such is the case with magnetic resonance force microscopy (MRFM), an emerging technology where imaging of an individual tobacco mosaic virus was recently demonstrated with nanometer resolution. We also consider additive white Gaussian noise (AWGN) in the measurements. Much prior work on sparse estimators has focused on the case where H has low coherence; however, the system matrix H in our application is the convolution matrix for the system psf. A typical convolution matrix has high coherence. This paper, therefore, does not assume a low coherence H. A discrete-continuous form of the Laplacian and atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and two sparse estimators are derived by maximizing the joint p.d.f. of the observation and image conditioned on the hyperparameters. A thresholding rule that generalizes the hard and soft thresholding rules appears in the course of the derivation. This so-called hybrid thresholding rule, when used in the iterative thresholding framework, gives rise to the hybrid estimator, a generalization of the lasso. Estimates of the hyperparameters for the lasso and hybrid estimator are obtained via Stein’s unbiased risk estimate (SURE). A numerical study with a Gaussian psf and two sparse images shows that the hybrid estimator outperforms the lasso. Index Terms: Biomedical image processing, image restoration, magnetic force microscopy, sparse image prior, Stein’s unbiased risk estimate.
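For orientation, the standard hard and soft thresholding rules that the abstract refers to, and the iterative soft-thresholding scheme that yields the lasso, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the fixed step size 1/L, and the objective 0.5*||y - Hx||^2 + lam*||x||_1 are choices made here, and the hybrid rule itself is not reproduced because the abstract does not specify it.

```python
import numpy as np

def soft_threshold(x, t):
    """Soft thresholding: shrink each entry toward zero by t, zeroing small ones."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hard_threshold(x, t):
    """Hard thresholding: keep entries with |x| > t unchanged, zero the rest."""
    return np.where(np.abs(x) > t, x, 0.0)

def ista(H, y, lam, n_iter=200):
    """Iterative soft thresholding for 0.5*||y - Hx||^2 + lam*||x||_1.

    H is the system matrix (assumed here to be a dense 2-D array),
    y the observation, lam the l1 penalty weight (all hypothetical names).
    """
    # Lipschitz constant of the gradient: squared spectral norm of H.
    L = np.linalg.norm(H, 2) ** 2
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y)          # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

When H is the identity, a single iteration already returns soft_threshold(y, lam), which is a convenient sanity check. Replacing soft_threshold inside the loop with another rule is the sense in which a different thresholding rule gives a different estimator within the same iterative framework.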
Feature Selection via Block-Regularized Regression
Cited by 1 (0 self)
Identifying co-varying causal elements in very high-dimensional feature space with internal structures, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered, and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markovian process along the feature sequence that “activates” the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.
Penalized quadratic inference functions for variable selection in longitudinal research
, 2006
For decades, much research has been devoted to developing and comparing variable selection methods, but primarily for the classical case of independent observations. Existing variable-selection methods can be adapted to cluster-correlated observations, but some adaptation is required. For example, classical model fit statistics such as AIC and BIC are undefined if the likelihood function is unknown (Pan, 2001). Little research has been done on variable selection for generalized estimating equations (GEE, Liang and Zeger, 1986) and similar correlated data approaches. This thesis will review existing work on model selection for GEE and propose new model selection options for GEE, as well as for a more sophisticated marginal modeling approach based on quadratic inference functions (QIF, Qu, Lindsay, and Li, 2000), which has better asymptotic properties than classic GEE. The focus is on selection using continuous penalties such as LASSO (Tibshirani, 1996) or SCAD (Fan and Li, 2001) rather than the older discrete penalties such as AIC and BIC. The
On the Nonnegative Garrote Estimator
, 2005
We study the nonnegative garrote estimator from three different aspects: computation, consistency and flexibility. We show that the nonnegative garrote estimate has a piecewise linear solution path. Using this fact, we propose an efficient algorithm for computing the whole solution path for the nonnegative garrote estimate. We also show that the nonnegative garrote has the nice property that, with probability tending to one, the solution path contains an estimate that correctly identifies the set of important variables and is consistent for the coefficients of the important variables. Such a property holds for another popular variable selection method, the LASSO, only under restrictive conditions. We propose a slight modification that retains the attractive properties of the original nonnegative garrote, but is more widely applicable. To demonstrate the flexibility of the proposed estimator, we consider an extension to the nonparametric regression setup. Simulations and a real example show that the proposed method is very competitive in terms of variable selection and estimation accuracy when compared with other variable selection and estimation methods.
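As a rough illustration of the estimator this abstract studies, the sketch below fits a nonnegative garrote at a single penalty value. It assumes the penalized form 0.5*||y - Z d||^2 + lam*sum(d) with d >= 0, where column j of Z is x_j times the ordinary least squares coefficient, and it uses plain coordinate descent. It does not implement the solution-path algorithm proposed in the paper; the function name and arguments are hypothetical.

```python
import numpy as np

def nonnegative_garrote(X, y, lam, n_sweeps=200):
    """Nonnegative garrote at one penalty value (illustrative sketch).

    Minimizes 0.5*||y - Z d||^2 + lam*sum(d) subject to d >= 0,
    where Z[:, j] = X[:, j] * beta_ols[j]. Returns (coefficients, d).
    """
    # Initial estimate: ordinary least squares.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    Z = X * beta_ols                      # scale each column by its OLS coefficient
    col_sq = np.sum(Z ** 2, axis=0)
    d = np.ones(X.shape[1])               # d = 1 reproduces the OLS fit
    for _ in range(n_sweeps):
        for j in range(Z.shape[1]):
            if col_sq[j] == 0.0:          # OLS coefficient is zero: nothing to scale
                d[j] = 0.0
                continue
            # Partial residual with component j removed from the fit.
            r_j = y - Z @ d + Z[:, j] * d[j]
            # Closed-form coordinate update, clipped at zero.
            d[j] = max(0.0, (Z[:, j] @ r_j - lam) / col_sq[j])
    return d * beta_ols, d
```

For a design with orthonormal columns, this objective gives the shrinkage factors d_j = max(0, 1 - lam / beta_ols[j]**2) in closed form, so small least squares coefficients are set exactly to zero while large ones are shrunk only slightly.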
An Efficient Variable Selection Approach for Analyzing Designed Experiments
The analysis of experiments where a large number of potential variables are examined is driven by the principles of effect sparsity, effect hierarchy, and effect heredity. We propose an efficient variable selection strategy to specifically address the unique challenges faced by such analysis. The proposed methods are natural extensions of a general-purpose variable selection algorithm, LARS (Efron et al., 2004). They are very fast to compute and can find sparse models that better satisfy the goals of experiments. Simulations and real examples are used to illustrate the wide applicability of the proposed methods.
Bayesian generalized double Pareto shrinkage
, 2010
We propose a generalized double Pareto prior for shrinkage estimation in linear models. The prior can be obtained via a scale mixture of Laplace or normal distributions, while forming a bridge between the Laplace and Normal-Jeffreys' priors. While it has a spike at zero like the Laplace density, it also has a Student-t-like tail behavior. We show strong consistency of the posterior in regression models with a diverging number of parameters, providing a template to be used for other priors in similar settings. Bayesian computation is straightforward via a simple Gibbs sampling algorithm. We also investigate the properties of the maximum a posteriori estimator and reveal connections with some well-established regularization procedures. The performance of the new prior is tested through simulations.
Hierarchical array priors for ANOVA decompositions
, 2012
ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays that can adapt to the presence of such interactions. These prior distributions are based on a type of array-variate normal distribution, for which a covariance matrix for each factor is estimated. This prior is able to adapt to potential similarities among the levels of a factor, and incorporate any such information into the estimation of the effects in which the factor appears. In the presence of such similarities, this prior is able to borrow information from well-estimated main effects and lower-order interactions to assist in the

