Results 1–10 of 43
Tree induction vs. logistic regression: A learning-curve analysis
 CeDER Working Paper #IS-01-02, Stern School of Business
, 2001
Abstract

Cited by 85 (17 self)
Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classification. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership probabilities. We use a learning-curve analysis to examine the relationship of these measures to the size of the training set. The results of the study show several remarkable things. (1) Contrary to prior observations, logistic regression does not generally outperform tree induction. (2) More specifically, and not surprisingly, logistic regression is better for smaller training sets and tree induction for larger data sets. Importantly, this often holds for training sets drawn from the same domain (i.e., the learning curves cross), so conclusions about induction-algorithm superiority on a given domain must be based on an analysis of the learning curves. (3) Contrary to conventional wisdom, tree induction is effective at producing probability-based rankings, although apparently comparatively less so for a given training-set size than at making classifications. Finally, (4) the domains on which tree induction and logistic regression are ultimately preferable can be characterized surprisingly well by a simple measure of the signal-to-noise ratio.
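The learning-curve comparison described in this abstract can be sketched as follows: train both model families on increasing training-set sizes and score them on a held-out test set. This is a minimal illustration, not the paper's experiment; the synthetic dataset, the chosen sizes, and the scikit-learn model settings are all assumptions made for the demo.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for the paper's real domains.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

results = {}
for n in (50, 200, 1000, len(X_tr)):  # growing training-set sizes
    lr = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
    dt = DecisionTreeClassifier(random_state=0).fit(X_tr[:n], y_tr[:n])
    results[n] = (lr.score(X_te, y_te), dt.score(X_te, y_te))
    print(n, results[n])
```

Plotting the two accuracy columns of `results` against `n` gives the learning curves; on domains where they cross, which model "wins" depends on the training-set size, as the abstract emphasizes.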
Bayesian model averaging
 Statistical Science
, 1999
Abstract

Cited by 61 (1 self)
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to overconfident inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA have recently emerged. We discuss these methods and present a number of examples. In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of ...
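A common way to implement the averaging this abstract describes is to weight each candidate model by an approximate posterior model probability, often derived from BIC via p(M_k | D) ∝ exp(-BIC_k / 2) under equal prior model probabilities. A minimal sketch, with made-up BIC values:

```python
import math

def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values."""
    # Subtract the minimum BIC before exponentiating for numerical stability.
    m = min(bics)
    w = [math.exp(-(b - m) / 2) for b in bics]
    s = sum(w)
    return [x / s for x in w]

# Illustrative BIC values for three candidate models (not from real data).
weights = bma_weights([100.0, 102.0, 110.0])
# A model-averaged prediction is then sum(w_k * prediction_k) over models.
```

Lower-BIC models receive larger weights, so the averaged prediction leans toward better-supported models without committing entirely to any single one.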
Bayesian Model Averaging in proportional hazard models: Assessing the risk of a stroke
 Applied Statistics
, 1997
Abstract

Cited by 43 (5 self)
Evaluating the risk of stroke is important in reducing the incidence of this devastating disease. Here, we apply Bayesian model averaging to variable selection in Cox proportional hazard models in the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for stroke. We introduce a technique based on the leaps and bounds algorithm which efficiently locates and fits the best models in the very large model space and thereby extends all-subsets regression to Cox models. For each independent variable considered, the method provides the posterior probability that it belongs in the model. This is more directly interpretable than the corresponding P-values, and also more valid in that it takes account of model uncertainty. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable. In our data, Bayesian model averaging predictively outperforms standard model selection methods for assessing ...
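The posterior inclusion probability this abstract mentions is just the summed posterior weight of every candidate model that contains the variable. A tiny sketch; the variable names, model sets, and weights below are illustrative, not the output of a real Cox-model search:

```python
# (variables in model, posterior model probability) — made-up illustration.
models = [({"age", "smoking"}, 0.5), ({"age"}, 0.3), ({"smoking"}, 0.2)]

def inclusion_probability(variable, models):
    """Posterior probability that `variable` belongs in the model."""
    return sum(weight for vars_, weight in models if variable in vars_)

p_age = inclusion_probability("age", models)  # 0.5 + 0.3 = 0.8
```

Unlike a P-value from one selected model, this number already accounts for uncertainty about which model is correct.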
Methods and criteria for model selection
 Journal of the American Statistical Association
Abstract

Cited by 29 (0 self)
Model selection is an important part of any statistical analysis, and indeed is central to the pursuit of science in general. Many authors have examined this question, from both frequentist and Bayesian perspectives, and many tools for selecting the "best model" have been suggested in the literature. This paper evaluates the various proposals from a decision-theoretic perspective, as a way of bringing coherence to a complex and central question in the field.
Likelihood-based Data Squashing: A Modeling Approach to Instance Construction
, 2002
Abstract

Cited by 21 (1 self)
Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data.
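The core squashing idea can be sketched with a toy example: replace groups of rows with a single weighted pseudo-point whose weight is the group size, so weighted analyses on the small dataset approximate analyses on the full one. The equal-width binning rule below is an illustrative assumption, not the LDS algorithm itself (which uses a likelihood model to choose pseudo-points):

```python
def squash(values, n_bins=10):
    """Compress a list of numbers into (pseudo-point, weight) pairs."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against all-equal data
    bins = {}
    for v in values:
        k = min(int((v - lo) / width), n_bins - 1)
        bins.setdefault(k, []).append(v)
    # One weighted pseudo-point (the bin mean) per occupied bin.
    return [(sum(b) / len(b), len(b)) for b in bins.values()]

data = [0.1, 0.2, 0.25, 5.0, 5.1, 9.9]
squashed = squash(data, n_bins=5)
```

Because each pseudo-point is a bin mean weighted by the bin count, the weighted mean of the squashed data reproduces the mean of the original exactly; higher-order statistics are only approximated, which is where model-based methods like LDS improve on naive grouping.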
Bayesian Variable Selection for Proportional Hazards Models
, 1996
Abstract

Cited by 19 (1 self)
The authors consider the problem of Bayesian variable selection for proportional hazards regression models with right-censored data. They propose a semiparametric approach in which a nonparametric prior is specified for the baseline hazard rate and a fully parametric prior is specified for the regression coefficients. For the baseline hazard, they use a discrete gamma process prior, and for the regression coefficients and the model space, they propose a semi-automatic parametric informative prior specification that focuses on the observables rather than the parameters. To implement the methodology, they propose a Markov chain Monte Carlo method to compute the posterior model probabilities. Examples using simulated and real data are given to demonstrate the methodology. RÉSUMÉ: The authors take a Bayesian view of the problem of variable selection in proportional hazards regression models in the presence of right censoring. They propose a semi-p...
Could a CAMELS Downgrade Model Improve Off-Site Surveillance?
 Federal Reserve Bank of St. Louis Economic Review
, 2002
Abstract

Cited by 13 (3 self)
The cornerstone of bank supervision is a regular schedule of thorough, on-site examinations. Under rules set forth in the Federal Deposit Insurance Corporation Improvement Act of 1991 (FDICIA), most U.S. banks must submit to a full-scope federal or state examination every 12 months; small, well-capitalized banks must be examined every 18 months. These examinations focus on six components of bank safety and soundness: capital protection (C), asset quality (A), management competence (M), earnings strength (E), liquidity risk exposure (L), and market risk sensitivity (S). At the close of each exam, examiners award a grade of one (best) through five (worst) to each component. Supervisors then draw on these six component ratings to assign a composite CAMELS rating, which is also expressed on a scale of one through five. (See the insert for a detailed description of the composite ratings.) In general, banks with composite ratings of one or two are considered safe and sound, whereas banks with ratings of three, four, or five are considered unsatisfactory. As of March 31, 2000, nearly 94 percent of U.S. banks posted composite CAMELS ratings of one or two. Bank supervisors support on-site examinations with off-site surveillance. Off-site surveillance uses quarterly financial data and anecdotal evidence to schedule and plan on-site exams. Although on-site examination is the most effective tool for spotting safety-and-soundness problems, it is costly and ... (R. Alton Gilbert is a vice president and banking advisor, Andrew P. Meyer is an economist, and Mark D. Vaughan is a supervisory policy officer and economist at the Federal Reserve Bank of St. Louis.)
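The grading convention described above can be mocked up in a few lines. Note the hedge: the real composite rating is assigned by examiner judgment, not by a formula; rounding the component average is purely an illustrative assumption made for this demo.

```python
def composite_rating(c, a, m, e, l, s):
    """Toy composite from the six CAMELS component grades (1 best .. 5 worst).

    Assumption: real composites reflect examiner judgment; a rounded
    average is used here only to illustrate the 1-5 scale.
    """
    return round((c + a + m + e + l + s) / 6)

def is_satisfactory(composite):
    # Composites of 1 or 2 = safe and sound; 3, 4, or 5 = unsatisfactory.
    return composite <= 2

rating = composite_rating(1, 1, 2, 2, 1, 1)  # -> 1
```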
Syntactic probabilities affect pronunciation variation in spontaneous speech
, 2009
Abstract

Cited by 11 (3 self)
Speakers frequently have a choice among multiple ways of expressing one and the same thought. When choosing between syntactic constructions for expressing a given meaning, speakers are sensitive to probabilistic tendencies for syntactic, semantic or contextual properties of an utterance to favor one construction or another. Taken together, such tendencies may align to make one construction overwhelmingly more probable, marginally more probable, or no more probable than another. Here, we present evidence that acoustic features of spontaneous speech reflect these probabilities: when speakers choose a less probable construction, they are more likely to be disfluent, and their fluent words are likely to have a relatively longer duration. Conversely, words in more probable constructions are shorter and spoken more fluently. Our findings suggest that the differing probabilities of a syntactic construction in context are not epiphenomenal, but reflect part of speakers' knowledge of their language.
Bayesian Analysis of Ordered Categorical Data from Industrial Experiments
 Technometrics
, 1995
Abstract

Cited by 8 (1 self)
Data from industrial experiments often involve an ordered categorical response, such as a qualitative rating. ANOVA-based analyses may be inappropriate for such data, suggesting the use of Generalized Linear Models (GLMs). When the data are observed from a fractionated experiment, likelihood-based GLM estimates may be infinite, especially when factors have large effects. These difficulties are overcome with a Bayesian GLM, which is implemented via the Gibbs sampling algorithm. Techniques for modeling data and for subsequently using the identified model to optimize the process are outlined. An important advantage in the optimization stage is that uncertainty in the parameter estimates is accounted for in the model. For robust design experiments, the Bayesian approach easily incorporates the variability of the noise factors using the response modeling approach (Welch, Yu, Kang and Sacks 1990; Shoemaker, Tsui and Wu 1991). This approach and its techniques are used to analyze two...
The subselect R package
, 2012
Abstract

Cited by 2 (0 self)
Version 0.12. The subselect package addresses the issue of variable selection in different statistical contexts, including exploratory data analysis; univariate or multivariate linear models; generalized linear models; principal components analysis; linear discriminant analysis; and canonical correlation analysis. Selecting variable subsets requires the definition of a numerical criterion which measures the quality of any given variable subset as a surrogate for the full set of variables. The current version of the subselect package provides eight different criteria. For each available criterion, the package provides a function that computes the criterion value of any given subset. More significantly, the package provides efficient search functions that seek the best subsets of any given size, for a specified criterion.
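The criterion-plus-search pattern this abstract describes can be sketched generically: given a numerical criterion on subsets, search the subsets of a given size for the best one. The sketch below is exhaustive and in Python; the actual subselect package is an R library and uses much more efficient search algorithms, and the scoring criterion here is an illustrative stand-in, not one of its eight criteria.

```python
from itertools import combinations

def best_subset(variables, size, criterion):
    """Return the size-`size` subset of `variables` maximizing `criterion`."""
    return max(combinations(variables, size), key=criterion)

# Toy criterion: prefer subsets with the largest total "score".
scores = {"x1": 0.9, "x2": 0.3, "x3": 0.7, "x4": 0.1}
best = best_subset(scores, 2, lambda subset: sum(scores[v] for v in subset))
```

Exhaustive enumeration is fine for a handful of variables but grows combinatorially, which is why packages like subselect rely on branch-and-bound and heuristic searches instead.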