Results 1 - 10 of 20
Tree induction vs. logistic regression: A learning-curve analysis
 CeDER Working Paper #IS-01-02, Stern School of Business
, 2001
"... Tree induction and logistic regression are two standard, offtheshelf methods for building models for classi cation. We present a largescale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on classmembership pr ..."
Abstract

Cited by 62 (16 self)
Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classification. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership probabilities. We use a learning-curve analysis to examine the relationship of these measures to the size of the training set. The results of the study show several remarkable things. (1) Contrary to prior observations, logistic regression does not generally outperform tree induction. (2) More specifically, and not surprisingly, logistic regression is better for smaller training sets and tree induction for larger data sets. Importantly, this often holds for training sets drawn from the same domain (i.e., the learning curves cross), so conclusions about induction-algorithm superiority on a given domain must be based on an analysis of the learning curves. (3) Contrary to conventional wisdom, tree induction is effective at producing probability-based rankings, although apparently comparatively less so for a given training-set size than at making classifications. Finally, (4) the domains on which tree induction and logistic regression are ultimately preferable can be characterized surprisingly well by a simple measure of signal-to-noise ratio.
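The learning-curve analysis described above can be sketched in miniature: train a classifier on increasingly large prefixes of the training data and record held-out accuracy at each size. The classifier and data below are stand-ins (a nearest-centroid rule on synthetic 1-D data), not the tree-induction or logistic-regression setups from the paper:

```python
import random

def nearest_centroid_accuracy(train, test):
    """Classify test points by the nearer class centroid; return accuracy."""
    cents = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        cents[label] = sum(xs) / len(xs)
    correct = sum(min(cents, key=lambda l: abs(x - cents[l])) == y
                  for x, y in test)
    return correct / len(test)

def learning_curve(data, test, sizes):
    """Held-out accuracy at increasing training-set sizes."""
    return [nearest_centroid_accuracy(data[:n], test) for n in sizes]

random.seed(0)
# Synthetic two-class data: class 0 ~ N(0,1), class 1 ~ N(2,1).
make = lambda n: [(random.gauss(2 * (i % 2), 1.0), i % 2) for i in range(n)]
train, test = make(400), make(200)
curve = learning_curve(train, test, sizes=[10, 50, 200, 400])
print(curve)
```

Comparing such curves for two learners (rather than two single accuracy numbers) is what reveals the crossing behavior the abstract reports.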
Bayesian model averaging
 Statistical Science
, 1999
"... Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to overcon dent inferences and decisions tha ..."
Abstract

Cited by 42 (0 self)
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to overconfident inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA have recently emerged. We discuss these methods and present a number of examples. In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of
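One common way to implement the averaging the abstract describes is to approximate posterior model probabilities from BIC values, p(M_k | data) ∝ exp(-BIC_k / 2) under equal prior model probabilities, and then average per-model predictions with those weights. A minimal sketch (the BIC values and predictions below are illustrative, not from the paper):

```python
import math

def bma_weights(bics):
    """Posterior model probabilities from BICs: w_k ∝ exp(-BIC_k / 2),
    assuming equal prior probabilities over the candidate models."""
    best = min(bics)  # subtract the minimum for numerical stability
    w = [math.exp(-(b - best) / 2.0) for b in bics]
    s = sum(w)
    return [x / s for x in w]

def bma_predict(preds, weights):
    """Model-averaged prediction: weighted mean of per-model predictions."""
    return sum(p * w for p, w in zip(preds, weights))

# Hypothetical BICs for three candidate models (illustrative numbers only).
w = bma_weights([100.0, 102.0, 110.0])
pred = bma_predict([0.9, 0.7, 0.4], w)
print(w, pred)
```

The model with the smallest BIC dominates the average, but competing models still contribute, which is exactly the hedge against overconfident single-model inference.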
Bayesian Model Averaging in proportional hazard models: Assessing the risk of a stroke
 Applied Statistics
, 1997
"... Evaluating the risk of stroke is important in reducing the incidence of this devastating disease. Here, we apply Bayesian model averaging to variable selection in Cox proportional hazard models in the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for ..."
Abstract

Cited by 28 (5 self)
Evaluating the risk of stroke is important in reducing the incidence of this devastating disease. Here, we apply Bayesian model averaging to variable selection in Cox proportional hazard models in the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for stroke. We introduce a technique based on the leaps and bounds algorithm which efficiently locates and fits the best models in the very large model space and thereby extends all-subsets regression to Cox models. For each independent variable considered, the method provides the posterior probability that it belongs in the model. This is more directly interpretable than the corresponding P-values, and also more valid in that it takes account of model uncertainty. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable. In our data Bayesian model averaging predictively outperforms standard model selection methods for assessing
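The posterior inclusion probability mentioned above has a simple form: sum the posterior probabilities of every model that contains the variable. A sketch, with hypothetical variable names and posterior model probabilities (illustrative only, not from the study):

```python
def inclusion_probabilities(models):
    """P(x_j in model | data) = sum of posterior probabilities of the
    candidate models that contain variable x_j."""
    probs = {}
    for variables, post in models:
        for v in variables:
            probs[v] = probs.get(v, 0.0) + post
    return probs

# Hypothetical posterior probabilities (summing to 1) over subsets of
# stroke risk factors; names and numbers are invented for illustration.
models = [
    ({"age", "systolic_bp"}, 0.50),
    ({"age", "systolic_bp", "smoking"}, 0.30),
    ({"age"}, 0.20),
]
probs = inclusion_probabilities(models)
print(probs)
```

A variable that appears in every well-supported model gets probability near 1, which is the directly interpretable summary the abstract contrasts with P-values.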
Likelihood-based Data Squashing: A Modeling Approach to Instance Construction.
, 2002
"... Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analy ..."
Abstract

Cited by 16 (1 self)
Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data.
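To make the idea concrete, here is a much cruder moment-matching squash than LDS: sort the data, split it into groups, and replace each group by its mean with a weight equal to the group size. This is only an illustration of "few weighted pseudo-points standing in for many points"; LDS itself chooses pseudo-points to match the likelihood under a statistical model, not just the first moment:

```python
def squash(xs, bins):
    """Replace `xs` with `bins` weighted pseudo-points (mean, weight).
    The weighted mean of the output equals the mean of the input."""
    xs = sorted(xs)
    k, r = divmod(len(xs), bins)
    out, i = [], 0
    for b in range(bins):
        size = k + (1 if b < r else 0)   # spread the remainder evenly
        group = xs[i:i + size]
        out.append((sum(group) / size, size))
        i += size
    return out

data = [1.0, 2.0, 2.5, 3.0, 4.0, 10.0]
sq = squash(data, bins=3)
wmean = sum(m * w for m, w in sq) / sum(w for _, w in sq)
print(sq, wmean)
```

Downstream analyses then run on the pseudo-points with their weights, at a fraction of the cost of the full dataset.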
Bayesian Variable Selection for Proportional Hazards Models
, 1996
"... The authors consider the problem of Bayesian variable selection for proportional hazards regression models with right censored data. They propose a semiparametric approach in which a nonparametric prior is specified for the baseline hazard rate and a fully parametric prior is specified for the regr ..."
Abstract

Cited by 15 (1 self)
The authors consider the problem of Bayesian variable selection for proportional hazards regression models with right-censored data. They propose a semiparametric approach in which a nonparametric prior is specified for the baseline hazard rate and a fully parametric prior is specified for the regression coefficients. For the baseline hazard, they use a discrete gamma process prior, and for the regression coefficients and the model space, they propose a semi-automatic parametric informative prior specification that focuses on the observables rather than the parameters. To implement the methodology, they propose a Markov chain Monte Carlo method to compute the posterior model probabilities. Examples using simulated and real data are given to demonstrate the methodology.
Could a CAMELS Downgrade Model Improve Off-Site Surveillance?
 Federal Reserve Bank of St. Louis Economic Review
, 2002
"... The cornerstone of bank supervision is a regular schedule of thorough, onsite examinations. Under rules set forth in the Federal Deposit Insurance Corporation Improvement Act of 1991 (FDICIA), most U.S. banks must submit to a fullscope federal or state examination every 12 months; small, wellcapi ..."
Abstract

Cited by 6 (2 self)
The cornerstone of bank supervision is a regular schedule of thorough, on-site examinations. Under rules set forth in the Federal Deposit Insurance Corporation Improvement Act of 1991 (FDICIA), most U.S. banks must submit to a full-scope federal or state examination every 12 months; small, well-capitalized banks must be examined every 18 months. These examinations focus on six components of bank safety and soundness: capital protection (C), asset quality (A), management competence (M), earnings strength (E), liquidity risk exposure (L), and market risk sensitivity (S). At the close of each exam, examiners award a grade of one (best) through five (worst) to each component. Supervisors then draw on these six component ratings to assign a composite CAMELS rating, which is also expressed on a scale of one through five. (See the insert for a detailed description of the composite ratings.) In general, banks with composite ratings of one or two are considered safe and sound, whereas banks with ratings of three, four, or five are considered unsatisfactory. As of March 31, 2000, nearly 94 percent of U.S. banks posted composite CAMELS ratings of one or two. Bank supervisors support on-site examinations with off-site surveillance. Off-site surveillance uses quarterly financial data and anecdotal evidence to schedule and plan on-site exams. Although on-site examination is the most effective tool for spotting safety-and-soundness problems, it is costly and
(R. Alton Gilbert is a vice president and banking advisor, Andrew P. Meyer is an economist, and Mark D. Vaughan is a supervisory policy officer and economist at the Federal Reserve Bank of St. Louis.)
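The rating convention the abstract spells out (composite 1-2 safe and sound, 3-5 unsatisfactory) is simple enough to encode directly; the function name here is invented for illustration:

```python
def composite_status(rating):
    """Map a composite CAMELS rating (1 best ... 5 worst) to supervisory
    status: 1-2 safe and sound, 3-5 unsatisfactory."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("composite CAMELS ratings run from 1 to 5")
    return "safe and sound" if rating <= 2 else "unsatisfactory"

print([composite_status(r) for r in (1, 2, 3, 5)])
```

A downgrade model of the kind the title asks about would try to predict, from quarterly financial data, which currently safe-and-sound banks are likely to cross this 2-to-3 boundary before their next exam.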
Bayesian Analysis of Ordered Categorical Data from Industrial Experiments
 Technometrics
, 1995
"... Data from industrial experiments often involve an ordered categorical response, such as a qualitative rating. ANOVA based analyses may be inappropriate for such data, suggesting the use of Generalized Linear Models (GLMs). When the data are observed from a fractionated experiment, likelihoodbas ..."
Abstract

Cited by 4 (1 self)
Data from industrial experiments often involve an ordered categorical response, such as a qualitative rating. ANOVA-based analyses may be inappropriate for such data, suggesting the use of Generalized Linear Models (GLMs). When the data are observed from a fractionated experiment, likelihood-based GLM estimates may be infinite, especially when factors have large effects. These difficulties are overcome with a Bayesian GLM, which is implemented via the Gibbs sampling algorithm. Techniques for modeling data and for subsequently using the identified model to optimize the process are outlined. An important advantage in the optimization stage is that uncertainty in the parameter estimates is accounted for in the model. For robust design experiments, the Bayesian approach easily incorporates the variability of the noise factors using the response modeling approach (Welch, Yu, Kang and Sacks 1990 and Shoemaker, Tsui and Wu 1991). This approach and its techniques are used to analyze two...
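The Gibbs sampling algorithm the abstract relies on alternates draws from each parameter's full conditional distribution. The miniature below shows only that mechanism, on a standard bivariate normal with correlation rho (where X | Y=y ~ N(rho*y, 1 - rho^2)); it is not the ordinal-GLM sampler from the paper:

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter=5000, seed=42):
    """Gibbs sampler for a standard bivariate normal with correlation rho,
    alternating draws from the two full conditionals."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    x = y = 0.0
    draws = []
    for _ in range(n_iter):
        x = rng.gauss(rho * y, s)   # draw X | Y = y
        y = rng.gauss(rho * x, s)   # draw Y | X = x
        draws.append((x, y))
    return draws

draws = gibbs_bivariate_normal(rho=0.8)
mean_x = sum(x for x, _ in draws) / len(draws)
mean_xy = sum(x * y for x, y in draws) / len(draws)
print(mean_x, mean_xy)   # near 0 and near rho, respectively
```

In the Bayesian GLM setting the same loop runs over regression coefficients and latent variables, and the retained draws approximate the joint posterior even when maximum-likelihood estimates would be infinite.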
Penalized Cox models and Frailty
, 1998
"... A very general mechanism for penalized regression has been added to the coxph ..."
Abstract

Cited by 1 (0 self)
A very general mechanism for penalized regression has been added to the coxph
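The penalized-regression mechanism referred to here maximizes a penalized Cox partial likelihood. As a hedged sketch of the objective (a one-covariate toy with no tied event times and a simple ridge penalty theta * beta^2; the function name and dataset are invented for illustration, and the real coxph machinery supports far more general penalties such as frailty terms):

```python
import math

def neg_penalized_log_partial_lik(beta, times, events, x, theta):
    """Negative log partial likelihood of a one-covariate Cox model,
    plus a ridge penalty theta * beta^2 (assumes no tied event times)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    loglik = 0.0
    for idx, i in enumerate(order):
        if not events[i]:
            continue                      # censored: contributes no term
        risk = order[idx:]                # subjects still at risk at t_i
        denom = sum(math.exp(beta * x[j]) for j in risk)
        loglik += beta * x[i] - math.log(denom)
    return -loglik + theta * beta ** 2

# Tiny illustrative dataset: (time, event indicator, covariate value).
times  = [2.0, 3.0, 5.0, 7.0]
events = [1, 1, 0, 1]
xs     = [0.5, -0.2, 1.0, 0.3]
val = neg_penalized_log_partial_lik(0.1, times, events, xs, theta=1.0)
print(val)
```

Minimizing this objective over beta (and, in the frailty case, over penalty parameters as well) is what the general penalized mechanism automates.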
ISSN 1361-9802 Manchester Centre for Statistical Science
"... this report. Sections 2.4  2.6 are summaries of the corresponding sections of Rudolfer (2001), which should be referred to for more details. Section 2.3, on the other hand, describes the dataset in detail, being based on Rudolfer (2001), Section 2, since it is not widely known outside the circle o ..."
Abstract
this report. Sections 2.4 - 2.6 are summaries of the corresponding sections of Rudolfer (2001), which should be referred to for more details. Section 2.3, on the other hand, describes the dataset in detail, being based on Rudolfer (2001), Section 2, since it is not widely known outside the circle of clinical neurophysiologists and is essential to understand the results of the further methods.
2.2 Notation
Muddled notation produces muddled thought. Precise notation produces precise thought. Hence, we shall adopt the following convention throughout this report:
sample observed values: small Roman letters
population random variables: LARGE Roman letters
population parameters: Greek letters
estimatE: observed sample statistic that estimates a population parameter
estimatOR: random variable of which the estimate is an observed value
2.3 Carpal Tunnel Syndrome (CTS) Dataset: Rudolfer (2001), Section 2
CTS = cluster of certain hand symptoms (to be specified later)
cause = entrapment of the median nerve in the Carpal Tunnel at the wrist
An excellent and comprehensive account of CTS is given in Rosenbaum & Ochoa (1993).
2.3.1 (ORDINAL) Response Variable
Y = 1: No Abnormality Detected (NAD); 2: Mild CTS; 3: Moderate CTS; 4: Severe CTS
Important property of ordinal Y: the event {Y <= j} is defined. For a non-ordinal Y, the statement "Y <= j" is meaningless.
2.3.2 Predictor Variables
These are contained in the vector x = (x_1, ..., x_p), which divides into three types of variables: history, clinical signs, and nerve conduction studies.
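The defining property of the ordinal response above, that the event {Y <= j} is meaningful, can be made concrete in code. The category coding follows the report; the function names and the example probabilities are invented for illustration:

```python
# Ordered response categories for the CTS dataset described above.
CATEGORIES = {1: "No Abnormality Detected", 2: "Mild CTS",
              3: "Moderate CTS", 4: "Severe CTS"}

def at_most(y, j):
    """Indicator of the event {Y <= j}; well defined only because the
    categories are ordered (for a nominal Y it would be meaningless)."""
    if y not in CATEGORIES or j not in CATEGORIES:
        raise ValueError("categories are coded 1..4")
    return y <= j

def cumulative(probs):
    """Cumulative probabilities P(Y <= j) from category probabilities,
    the quantities that ordinal (cumulative-link) models parameterize."""
    out, total = [], 0.0
    for p in probs:
        total += p
        out.append(total)
    return out

cum = cumulative([0.4, 0.3, 0.2, 0.1])   # hypothetical category probabilities
print(cum)
```

Cumulative-link regression models for such data place a linear predictor on each P(Y <= j), which is why the ordering property matters.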
The Practical Utility of Incorporating Model Selection Uncertainty
, 2004
"... Predictions of disease outcome in prognostic factor models are usually based on one selected model. However, often several models fit the data equally well, but these models might di#er substantially in terms of included explanatory variables and might lead to di#erent predictions for individual pat ..."
Abstract
Predictions of disease outcome in prognostic factor models are usually based on one selected model. However, often several models fit the data equally well, but these models might differ substantially in terms of included explanatory variables and might lead to different predictions for individual patients. For survival data we discuss two approaches for accounting for model selection uncertainty in two data examples with the main emphasis on variable selection in a proportional hazard Cox model. The main aim of our investigation is to establish in which ways either of the two approaches are useful in such prognostic models. The first approach is Bayesian model averaging (BMA) adapted for the proportional hazard model (Volinsky et al., 1997). As a new approach we propose a method which averages over a set of possible models using weights estimated from bootstrap resampling as proposed by Buckland et al. (1997), but in addition we perform an initial screening of variables based on the inclusion frequency of each variable to reduce the set of variables and corresponding models. The main objective of prognostic models is prediction, but the interpretation of single effects is also important and models should be general enough to ensure transportability to other clinical centres. In the data examples we compare predictions of the two approaches with "conventional" predictions from one selected model and with predictions from the full model. Confidence intervals are compared in one example. Comparisons are based on the partial predictive score and the Brier score. We conclude that the two model averaging methods yield similar results and are especially useful when there is a high number of potential prognostic factors, most likely some of them without influence in a multivariab...
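The bootstrap inclusion-frequency screening described above can be sketched as follows: resample the data with replacement, rerun the variable-selection procedure on each resample, and keep only variables selected often enough. The selection rule below is a deliberately crude stand-in (keep a variable when it co-varies positively with the outcome), not a stepwise Cox selection; all names and data are invented:

```python
import random

def bootstrap_inclusion_frequencies(data, select, n_boot=200, seed=1):
    """How often each variable is kept by `select` across bootstrap
    resamples of the rows; the screening step described above."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]
        for v in select(sample):
            counts[v] = counts.get(v, 0) + 1
    return {v: c / n_boot for v, c in counts.items()}

def toy_select(sample):
    """Stand-in for stepwise selection: keep variables whose values
    co-vary positively with the outcome in this resample."""
    return [v for v in ("x1", "x2", "noise")
            if sum(row[v] * row["y"] for row in sample) > 0]

rng = random.Random(0)
data = []
for _ in range(50):
    x1 = rng.gauss(0, 1)
    data.append({"x1": x1,
                 "x2": x1 + rng.gauss(0, 0.1),   # x2 tracks x1
                 "noise": rng.gauss(0, 1),        # unrelated variable
                 "y": 1 if x1 > 0 else -1})
freqs = bootstrap_inclusion_frequencies(data, toy_select)
print(freqs)
```

Variables with inclusion frequency below a chosen threshold are dropped before the model-averaging step, shrinking the set of candidate models that receive Buckland-style weights.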