Bayesian Model Averaging in proportional hazard models: Assessing the risk of a stroke
Applied Statistics, 1997
Cited by 20 (5 self)
Evaluating the risk of stroke is important in reducing the incidence of this devastating disease. Here, we apply Bayesian model averaging to variable selection in Cox proportional hazard models in the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for stroke. We introduce a technique based on the leaps and bounds algorithm which efficiently locates and fits the best models in the very large model space and thereby extends all subsets regression to Cox models. For each independent variable considered, the method provides the posterior probability that it belongs in the model. This is more directly interpretable than the corresponding P-values, and also more valid in that it takes account of model uncertainty. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable. In our data Bayesian model averaging predictively outperforms standard model selection methods for assessing
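The posterior probability that a variable belongs in the model can be illustrated with a minimal sketch. It assumes equal prior probabilities for all candidate models and approximates each model's posterior weight by exp(-BIC/2); both are assumptions of this sketch, not details stated in the abstract. The variable names and BIC values are hypothetical and are not taken from the Cardiovascular Health Study.

```python
import math

def posterior_inclusion_probabilities(models):
    """models: list of (variables, bic) pairs.

    Returns {variable: approximate P(variable is in the model | data)},
    assuming equal prior model probabilities and weights exp(-BIC / 2).
    """
    # Subtract the smallest BIC before exponentiating for numerical stability.
    best = min(bic for _, bic in models)
    weights = [math.exp(-0.5 * (bic - best)) for _, bic in models]
    total = sum(weights)
    probs = {}
    for (variables, _), w in zip(models, weights):
        # A variable's inclusion probability is the summed weight
        # of every model that contains it.
        for v in variables:
            probs[v] = probs.get(v, 0.0) + w / total
    return probs

# Hypothetical candidate models and BIC values, for illustration only.
candidate_models = [
    (("age", "diabetes"), 100.0),
    (("age",), 102.0),
    (("age", "diabetes", "smoking"), 104.0),
]
inclusion = posterior_inclusion_probabilities(candidate_models)
```

A variable that appears in every well-supported model gets a probability near 1, while one that appears only in poorly supported models gets a probability near 0. This is the sense in which the quantity accounts for model uncertainty rather than conditioning on a single selected model.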
Nonparametric Selection of Input Variables for Connectionist Learning
1996
Cited by 5 (0 self)
re. However, for a range of explored problems, the relative ordering of mutual information estimates remains correct, despite inaccuracies in individual estimates. Analysis of forward selection explores the amount of data required to select a certain number of relevant input variables. It is shown that in order to select a certain number of relevant input variables, the amount of required data increases roughly exponentially as more relevant input variables are considered. It is also shown that the chances of forward selection ending up in a local minimum are reduced by bootstrapping the data. Finally, the method is compared to two connectionist methods for input variable selection: Sensitivity Based Pruning and Automatic Relevance Determination. It is shown that the new method outperforms these two when the number of independent, candidate input variables is large. However, the method requires the number of relevant input variables to be relatively small. These results are confirmed o
Greedy Basis Pursuit
2006
Cited by 5 (0 self)
We introduce Greedy Basis Pursuit (GBP), a new algorithm for computing signal representations using overcomplete dictionaries. GBP is rooted in computational geometry and exploits an equivalence between minimizing the ℓ1-norm of the representation coefficients and determining the intersection of the signal with the convex hull of the dictionary. GBP unifies the different advantages of previous algorithms: like standard approaches to Basis Pursuit, GBP computes representations that have minimum ℓ1-norm; like greedy algorithms such as Matching Pursuit, GBP builds up representations, sequentially selecting atoms. We describe the algorithm, demonstrate its performance, and provide code. Experiments show that GBP can provide a fast alternative to standard linear programming approaches to Basis Pursuit.
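For readers unfamiliar with the greedy family this abstract refers to, the following is a minimal sketch of plain Matching Pursuit, which selects atoms sequentially. It is not GBP and does not guarantee a minimum ℓ1-norm representation; the authors' own code is not reproduced here. The function names, the unit-norm assumption on the atoms, and the toy dictionary are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, dictionary, n_iter=10, tol=1e-10):
    """Greedy sparse approximation of `signal`.

    `dictionary` is a list of atoms, each assumed to have unit norm.
    Returns (coefficients, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        inner = [dot(residual, atom) for atom in dictionary]
        k = max(range(len(dictionary)), key=lambda i: abs(inner[i]))
        if abs(inner[k]) < tol:
            break  # nothing left to explain
        # Add the projection onto that atom and remove it from the residual.
        coeffs[k] += inner[k]
        residual = [r - inner[k] * a for r, a in zip(residual, dictionary[k])]
    return coeffs, residual

# Hypothetical overcomplete dictionary in R^2: three unit-norm atoms.
s = 1 / math.sqrt(2)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
coeffs, residual = matching_pursuit([2.0, 2.0], atoms)
```

In this toy case the signal is a multiple of the third atom, so the greedy step explains it with a single coefficient instead of spreading it across the two axis-aligned atoms.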
Modellus: Automated Modeling of Complex Data Center Applications
Cited by 1 (1 self)
The rising complexity of distributed server applications in enterprise data centers has made the tasks of modeling and analyzing their behavior increasingly difficult. This paper presents Modellus, a novel system for automated modeling of complex data center applications using statistical methods from data mining and machine learning. Modellus can automatically derive models to predict the resource usage of an application and the workload it triggers; these models can be composed to capture multiple dependencies between interacting applications. Model accuracy is maintained by fast, distributed testing, automated relearning of models when they change, and methods to bound prediction errors in composite models. We have implemented a prototype of Modellus, deployed it on a data center testbed, and evaluated its efficacy for modeling and analysis of several distributed server applications. Our results show that this feature-based modeling technique is able to make predictions across several data center tiers, and maintain predictive accuracy (typically 95% or better) in the face of significant shifts in workload composition; we also demonstrate practical applications of the Modellus system to prediction and provisioning of real-world applications.
Modellus: Automated Modeling of Complex Internet Data Center Applications
2009
Cited by 1 (1 self)
Distributed server applications have become commonplace in today’s Internet and business environments. The data centers hosting these applications—large clusters of networked servers and storage—have in turn become increasingly complex. Some of this is due to complexity of the applications themselves, which may have multiple
(MARS; Multivariate Additive Regression Splines) Logistic Regressions for Prediction of
2001
Q: What environmental factors determine the distribution of the Red-Spotted Toad in a fragmented desert landscape? Q: Can logistic regression using GLM or GAM-MARS be used to address this question? Note: To conserve space on the cover, the title of this report has been abbreviated. The full title appears on the following page. EPA/600/R-01/081
Least Angle and L1 Regression: A Review
, 802
Abstract: Least Angle Regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. It provides an explanation for the similar behavior of LASSO (L1-penalized
THE PREDICTOR'S AVERAGE ESTIMATED VARIANCE CRITERION FOR THE SELECTION-OF-VARIABLES PROBLEM IN GENERAL LINEAR MODELS
1971
1. The matrix $X'X$ should be replaced by $\frac{1}{n} X'X$ (a) in lines 3, 4 and 12 on p. 17; (b) in line 7, p. 22; (c) in line 2, p. 25. 2. The expression $\mathrm{AEV}(y) = s^2 p$ should read $\mathrm{AEV}(y) = s^2 p/n$ on line 13, p. 17, and $\mathrm{AEV}(Y_i) = s_i^2 p/n$ on line 2, p. 25. 3. In Table 2 (pp. 23-24) the calculated AEV statistics should be divided by 68.
Nonorthogonal Problems

