Variable Selection for Cox's Proportional Hazards Model and Frailty Model
Annals of Statistics, 2002
Cited by 46 (11 self)
Abstract
A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed in Fan and Li (2001a). It has been shown there that the resulting procedures perform as well as if the subset of significant variables were known in advance. Such a property is called an oracle property. The proposed procedures were illustrated in the context of linear regression, robust linear regression and generalized linear models. In this paper, the nonconcave penalized likelihood approach is extended further to the Cox proportional hazards model and the Cox proportional hazards frailty model, two commonly used semiparametric models in survival analysis. As a result, new variable selection procedures for these two commonly used models are proposed. It is demonstrated how the rates of convergence depend on the regularization parameter in the penalty function. Further, with a proper choice of the regularization parameter and the penalty function, the proposed estimators possess an oracle property. Standard error formulae are derived and their accuracies are empirically tested. Simulation studies show that the proposed procedures are more stable in prediction and more effective in computation than best subset variable selection, and they reduce model complexity as effectively as best subset variable selection. Compared with the LASSO, which is the penalized likelihood method with the L1 penalty proposed by Tibshirani, the newly proposed approaches have better theoretical properties and finite sample performance.
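The penalized likelihood idea in this abstract can be sketched numerically. Below is a minimal illustration of the SCAD penalty of Fan and Li and of a SCAD-penalized Cox partial log-likelihood objective; the function names, the toy survival data, and the no-ties assumption are mine for illustration, not the authors' code, and a real implementation would also need an optimizer.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001): linear near zero, then quadratic,
    then flat beyond a*lam, so large coefficients are not over-shrunk
    (the property behind the oracle behavior discussed in the abstract)."""
    theta = np.abs(theta)
    return np.where(
        theta <= lam,
        lam * theta,
        np.where(
            theta <= a * lam,
            (2 * a * lam * theta - theta**2 - lam**2) / (2 * (a - 1)),
            (a + 1) * lam**2 / 2,
        ),
    )

def penalized_cox_loglik(beta, times, events, X, lam):
    """Cox partial log-likelihood minus n times the SCAD penalty.
    Assumes no tied event times; the risk set at time t_i is {j: t_j >= t_i}."""
    n = len(times)
    eta = X @ beta
    ll = 0.0
    for i in range(n):
        if events[i] == 1:
            at_risk = times >= times[i]
            ll += eta[i] - np.log(np.exp(eta[at_risk]).sum())
    return ll - n * scad_penalty(beta, lam).sum()

# toy data: 4 subjects, one covariate, one censored observation
times = np.array([2.0, 3.0, 1.0, 4.0])
events = np.array([1, 1, 0, 1])
X = np.array([[0.5], [1.0], [-0.3], [0.2]])
beta0 = np.zeros(1)
val = penalized_cox_loglik(beta0, times, events, X, lam=0.5)
```

At beta = 0 the penalty vanishes, so the penalized and unpenalized objectives coincide; the flatness of SCAD beyond a*lam is what distinguishes it from the L1 (LASSO) penalty mentioned at the end of the abstract.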
The variable selection problem
Journal of the American Statistical Association, 2000
Cited by 39 (2 self)
Abstract
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments which have led to the wide variety of approaches for this problem.
Bayes model averaging with selection of regressors
Journal of the Royal Statistical Society, Series B (Statistical Methodology), 2002
Cited by 33 (8 self)
Abstract
When a number of distinct models contend for use in prediction, the choice of a single model can offer rather unstable predictions. In regression, stochastic search variable selection with Bayesian model averaging offers a cure for this robustness issue, but at the expense of requiring very many predictors. Here we look at Bayes model averaging incorporating variable selection for prediction. This offers similar mean-square errors of prediction but with a vastly reduced predictor space, which can greatly aid the interpretation of the model. It also reduces cost when the measured variables carry acquisition costs. The development here uses decision theory in the context of the multivariate general linear model. In passing, this reduced predictor space Bayes model averaging is contrasted with single-model approximations. A fast algorithm for updating regressions in the Markov chain Monte Carlo searches for posterior inference is developed, allowing many more variables than observations to be contemplated. We discuss the merits of absolute rather than proportionate shrinkage in regression, especially when there are more variables than observations. The methodology is illustrated on a set of spectroscopic data used for measuring the amounts of different sugars in an aqueous solution.
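The averaging step this abstract describes can be sketched in a few lines. This is a generic illustration, assuming BIC-based approximate posterior model probabilities; the per-model predictions and BIC values are hypothetical, and the paper itself derives its weights within a decision-theoretic multivariate framework rather than from BIC.

```python
import numpy as np

def bic_weights(bics):
    """Approximate posterior model probabilities from BIC scores:
    w_m proportional to exp(-BIC_m / 2), normalized to sum to one."""
    b = np.asarray(bics, dtype=float)
    w = np.exp(-0.5 * (b - b.min()))   # subtract min for numerical stability
    return w / w.sum()

# hypothetical predictions and BIC scores from three candidate models
preds = np.array([1.0, 1.2, 3.0])
bics = np.array([10.0, 10.5, 20.0])

w = bic_weights(bics)
bma_pred = float((w * preds).sum())    # model-averaged prediction
```

The averaged prediction is a convex combination of the per-model predictions, dominated here by the two well-supported models; restricting the candidate set to models with few predictors is what yields the "vastly reduced predictor space" the abstract refers to.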
The choice of variables in multivariate regression: a nonconjugate Bayesian decision theory approach
1999
Cited by 17 (2 self)
Abstract
INTRODUCTION. Choice of regressor variables in linear regression has attracted considerable attention in the literature, from forward, backward and stepwise regression, through model choice criteria such as Akaike's information criterion, to Bayesian techniques. We focus on the Bayesian decision theory framework, first given by Lindley (1968) for univariate multiple regression, in which costs attach to the inclusion of regressor variables. Here it is required to predict a future vector observation Y_f comprising r components. Predictions are judged by quadratic loss, to which is added a cost penalty on the regressor variables x_f.
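The Lindley-style criterion described above, quadratic prediction loss plus a cost for each included regressor, can be sketched by exhaustive enumeration on a toy problem. The synthetic data, the per-variable cost vector, and the use of in-sample loss are illustrative simplifications; the paper works with expected loss under a nonconjugate Bayesian model rather than a plug-in least-squares fit.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, p = 60, 4
X = rng.normal(size=(n, p))
# only the first two variables matter in this toy example
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
cost = np.array([0.1, 0.1, 0.1, 0.1])   # hypothetical inclusion cost per regressor

def penalized_loss(subset):
    """Quadratic prediction loss (in-sample, for the sketch) plus
    Lindley-style inclusion costs for the chosen regressors."""
    if not subset:
        resid = y - y.mean()
    else:
        Xs = X[:, list(subset)]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
    return float(np.mean(resid**2)) + float(cost[list(subset)].sum())

all_subsets = [s for r in range(p + 1) for s in combinations(range(p), r)]
best = min(all_subsets, key=penalized_loss)
```

The spurious variables reduce the in-sample loss by far less than their inclusion cost, so the cost-penalized criterion recovers the truly relevant pair; this is the trade-off between predictive accuracy and measurement cost that the framework formalizes.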
A Case Study of Stochastic Optimization in Health Policy: Problem Formulation and Preliminary Results
Journal of Global Optimization, 2000
Cited by 9 (1 self)
Abstract
We use Bayesian decision theory to address a variable selection problem arising in attempts to indirectly measure the quality of hospital care, by comparing observed mortality rates to expected values based on patient sickness at admission. Our method weighs data collection costs against predictive accuracy to find an optimal subset of the available admission sickness variables. The approach involves maximizing expected utility across possible subsets, using Monte Carlo methods based on random division of the available data into N modeling and validation splits to approximate the expectation. After exploring the geometry of the solution space, we compare a variety of stochastic optimization methods, including genetic algorithms (GA), simulated annealing (SA), tabu search (TS), threshold acceptance (TA), and messy simulated annealing (MSA), on their performance in finding good subsets of variables, and we clarify the role of N in the optimization. Preliminary results indicate that TS is somewhat better than TA and SA in this problem, with MSA and GA well behind the other three methods. Sensitivity analysis reveals broad stability of our conclusions.
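One of the compared methods, simulated annealing over variable subsets, can be sketched compactly. The move set (flip one inclusion indicator), the linear cooling schedule, and the toy Hamming-distance objective are my illustrative choices, not the paper's utility function or tuning.

```python
import numpy as np

rng = np.random.default_rng(1)

def anneal(loss, p, steps=2000, t0=1.0):
    """Simulated annealing over subset indicators in {0,1}^p:
    propose a single-coordinate flip; accept improvements always,
    and worse moves with probability exp(-delta / T) under linear cooling."""
    state = rng.integers(0, 2, size=p)
    cur_loss = loss(state)
    best, best_loss = state.copy(), cur_loss
    for k in range(steps):
        T = t0 * (1.0 - k / steps) + 1e-6     # cool toward (almost) zero
        cand = state.copy()
        cand[rng.integers(p)] ^= 1            # flip one variable in/out
        cl = loss(cand)
        if cl < cur_loss or rng.random() < np.exp(-(cl - cur_loss) / T):
            state, cur_loss = cand, cl
            if cl < best_loss:
                best, best_loss = cand.copy(), cl
    return best, best_loss

# toy objective: distance to a known "good" subset of 5 variables
target = np.array([1, 0, 1, 0, 1])
subset, subset_loss = anneal(lambda s: int(np.sum(s != target)), p=5)
```

In the paper each loss evaluation is itself a Monte Carlo estimate over N modeling/validation splits, which is why the role of N in the optimization matters: noisier objective estimates change how the acceptance step behaves.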
Predictor Selection for Model Averaging
Bayesian Methods with Applications to Science, Policy, and Official Statistics, pp. 553-562, International Society for Bayesian Analysis, 2001
Cited by 3 (1 self)
Abstract
When a number of distinct models is available for prediction, choice of a single model can offer unstable results. In regression, stochastic search variable selection with Bayesian model averaging is a solution for this robustness issue but utilizes very many predictors. Here we look at Bayesian model averaging that incorporates variable selection for prediction, and use decision theory in the context of the multivariate general linear model with continuous covariates. We obtain similar mean-square errors of prediction but with a greatly reduced predictor space that helps model interpretation. The paper summarises some results from Brown et al. (2001b). Here we provide a new example by applying the results to the selection of wavelet coefficients when regressing constituents of biscuit doughs on near-infrared spectra. In the example the number of predictors greatly exceeds the number of observations. Keywords: MULTIVARIATE GENERAL LINEAR MODEL; BAYESIAN MODEL AVERAGING; DECISION THEORY; VARIABLE SELECTION; WAVELETS.
Gibbs Posterior for Variable Selection in High-Dimensional Classification and Data Mining
Cited by 3 (1 self)
Abstract
In the popular approach of "Bayesian variable selection" (BVS), one uses prior and posterior distributions to select a subset of candidate variables to enter the model. A completely new direction is considered here: studying BVS with a Gibbs posterior originating in statistical mechanics. The Gibbs posterior is constructed from a risk function of practical interest (such as the classification error) and aims at minimizing that risk without modeling the data probabilistically. This can improve performance over the usual Bayesian approach, which depends on a probability model that may be misspecified. Conditions are provided to achieve good risk performance, even in the presence of high dimensionality, when the number of candidate variables K can be much larger than the sample size n. In addition, we develop a convenient Markov chain Monte Carlo algorithm to implement BVS with the Gibbs posterior.
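The construction the abstract describes, reweighting candidate models by an empirical risk instead of a likelihood, can be sketched directly. The temperature parameter `lam`, the uniform prior, and the three empirical error rates below are hypothetical illustration values, not the paper's settings.

```python
import numpy as np

def gibbs_weights(risks, n, lam=1.0, prior=None):
    """Gibbs posterior over candidate variable subsets: each subset m gets
    weight proportional to prior(m) * exp(-lam * n * R_n(m)), where R_n is
    an empirical risk (e.g. classification error); no likelihood is used."""
    risks = np.asarray(risks, dtype=float)
    if prior is None:
        prior = np.full(len(risks), 1.0 / len(risks))
    log_w = np.log(prior) - lam * n * risks
    log_w -= log_w.max()          # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

# hypothetical empirical classification errors for three candidate subsets
w = gibbs_weights([0.10, 0.12, 0.30], n=100, lam=1.0)
```

Because the exponent scales with n, subsets whose empirical risk is even modestly worse are exponentially down-weighted as the sample grows, which is the mechanism behind the risk-performance guarantees the abstract mentions.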
When Did Bayesian Inference Become
Abstract
Comment on article by Gelfand et al., by Jay M. Ver Hoef (p. 99)
Model Selection for Multivariate Failure Time Data
Abstract
We discuss a penalized pseudo-partial likelihood method for variable and model selection with multivariate failure time data. The proposed method enjoys desirable asymptotic properties.