Results 1–10 of 16
Assessment and Propagation of Model Uncertainty
, 1995
Abstract

Cited by 113 (0 self)
In this paper I discuss a Bayesian approach to solving this problem that has long been available in principle but is only now becoming routinely feasible, by virtue of recent computational advances, and examine its implementation in examples that involve forecasting the price of oil and estimating the chance of catastrophic failure of the U.S. Space Shuttle.
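The approach the abstract alludes to is Bayesian model averaging: rather than conditioning on a single selected model, forecasts are combined with weights given by posterior model probabilities. A minimal sketch follows, using the standard BIC approximation to those weights; the BIC values and forecasts are hypothetical numbers for illustration, not from the paper:

```python
import math

def bma_forecast(bics, forecasts):
    """Combine model forecasts using approximate posterior model
    probabilities. Weights are w_k proportional to exp(-BIC_k / 2),
    a standard large-sample approximation to P(model_k | data).
    """
    scale = min(bics)  # subtract the smallest BIC for numerical stability
    raw = [math.exp(-(b - scale) / 2) for b in bics]
    total = sum(raw)
    weights = [r / total for r in raw]
    combined = sum(w * f for w, f in zip(weights, forecasts))
    return weights, combined

# Three hypothetical candidate models: their BICs and point forecasts
weights, forecast = bma_forecast([100.0, 102.0, 110.0], [24.0, 26.0, 30.0])
```

The combined forecast leans toward the best-supported model but still propagates the uncertainty about which model is correct, which is the point of the paper.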
Statistical Themes and Lessons for Data Mining
, 1997
Abstract

Cited by 32 (3 self)
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
Bayesian Analysis Of A Random Link Function In Binary Response Regression
 Journal of the American Statistical Association
, 1994
Abstract

Cited by 2 (0 self)
Binary response regression is a useful technique for analyzing categorical data. Popular binary models use special link functions such as the logit or the probit link. We assume that the inverse link function H is a random member of the class of normal scale mixture cdfs. We propose three different models for this random H: (i) H is a finite scale mixture with a Dirichlet distribution prior on the mixing distribution; (ii) H is a general scale mixture, the mixing distribution having a Dirichlet process prior; and (iii) H is a scale mixture of truncated normal distributions with the mixing distribution having a Dirichlet prior. We describe Bayesian analyses of these models using data augmentation and Gibbs sampling. Model diagnostics based on cross-validation of the conditional predictive distributions are proposed. These analyses are illustrated in two examples. Our proposed models match the performance of Bayesian probit and t link models in the first example, whereas they outperform probit and t link models in the second example.
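The data-augmentation Gibbs sampler mentioned in the abstract can be sketched for the simplest special case, a fixed probit link with a flat prior on the coefficients (the Albert–Chib scheme); the paper's random-H models build on this same latent-variable idea. The simulated data and rejection-based truncated-normal sampling below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def probit_gibbs(X, y, n_iter=400, seed=0):
    """Gibbs sampler for probit regression via data augmentation:
    latent z_i ~ N(x_i' beta, 1) truncated to the side implied by y_i,
    then beta | z ~ N((X'X)^{-1} X'z, (X'X)^{-1}) under a flat prior.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(p)
    draws = []
    for _ in range(n_iter):
        mean = X @ beta
        z = np.empty(n)
        for i in range(n):
            # Rejection sampling of the truncated normal keeps the
            # sketch dependency-free (fine for moderate |mean|).
            while True:
                cand = rng.normal(mean[i], 1.0)
                if (cand > 0) == bool(y[i]):
                    z[i] = cand
                    break
        beta = XtX_inv @ (X.T @ z) + chol @ rng.normal(size=p)
        draws.append(beta.copy())
    return np.array(draws)

# Simulated probit data with true coefficients (0.3, 1.0)
rng = np.random.default_rng(42)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = (X @ np.array([0.3, 1.0]) + rng.normal(size=n) > 0).astype(int)
draws = probit_gibbs(X, y)
est = draws[100:].mean(axis=0)  # posterior mean after burn-in
```

The random-link models in the paper replace the fixed standard-normal cdf here with a scale mixture and add Gibbs steps for the mixing distribution.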
A Survival Kit on Quantile Estimation
, 1993
Abstract

Cited by 1 (0 self)
Questions concerning risk management in finance and premium calculation in non-life insurance often involve quantile estimation. We give an introduction to the basic extreme value theory which yields a methodological basis for the analysis of such questions.

1. Introduction. The following two dates are important with respect to risk management: February 1, 1953, and January 28, 1986. Upon asking "why?" to audiences of bankers and/or (re)insurers, we hardly ever got the right answer. First of all, during the night of February 1, 1953, the sea dykes in the Netherlands collapsed at various locations during a severe storm, causing major flooding in large parts of coastal Holland and killing over 1800 people. An extremal event (a record surge) caused a protective system (the sea dykes) to break down. In the wake of this disaster, the Dutch government established the Deltacommittee. Under van Dantzig, the statistical problems discussed included the task of answering the following question...
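The kind of extreme-value quantile estimation the abstract introduces can be sketched with the classical Hill estimator: estimate the tail index from the k largest observations, then extrapolate to a quantile with a small exceedance probability. The Pareto simulation below is a self-contained check, not the paper's example:

```python
import numpy as np

def hill_quantile(data, k, p):
    """Estimate the quantile with exceedance probability p for a
    heavy-tailed sample: Hill tail-index estimate from the k largest
    observations, then Weissman-type extrapolation from the threshold.
    """
    x = np.sort(data)
    n = len(x)
    tail = x[n - k:]          # k largest observations
    threshold = x[n - k - 1]  # (k+1)-th largest, used as the threshold
    gamma = np.mean(np.log(tail) - np.log(threshold))  # Hill estimator
    return threshold * (k / (n * p)) ** gamma

# Pareto tail with P(X > x) = x^(-2): the quantile with exceedance
# probability p is p^(-1/2), so for p = 0.001 it is about 31.6.
rng = np.random.default_rng(1)
sample = rng.uniform(size=100_000) ** (-1 / 2)
q = hill_quantile(sample, k=1000, p=0.001)
```

The extrapolation step is what lets the method answer "dyke height" questions about levels beyond, or near the edge of, the observed data.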
Failure
Abstract
We present a new visual method for assessing the predictive power of models with categorical outcomes. This technique allows the analyst to quickly and easily choose among alternative model specifications based upon the models' ability to consistently match high-probability predictions to actual occurrences of the event of interest, and low-probability predictions to non-occurrences of the event of interest. Unlike existing methods for assessing predictive power for logit and probit models, such as the use of "percent correctly predicted" statistics, Brier scores, and the ROC plot, our "separation plot" has the advantage of producing a visual display that is more informative and easier to explain to a general audience than a ROC plot, while also remaining insensitive to the user's often arbitrary choice of threshold for distinguishing between events and non-events. We show how to implement this tool as a function in R.

2. How it's made. Example using a logit model of O-ring failure as a function of launch temperature, using data from the Challenger dataset. 1. Start with actual and predicted data
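The core of the separation plot is simply reordering the observed outcomes by the model's predicted probability; a discriminating model pushes the events toward the high-probability end. A minimal sketch (the paper implements this in R; the toy probabilities here are hypothetical, not the Challenger data):

```python
def separation_sequence(y_actual, y_pred):
    """Order observed binary outcomes by predicted probability,
    ascending. A well-discriminating model yields a sequence with
    the 1s clustered at the end (the high-probability side)."""
    order = sorted(range(len(y_pred)), key=lambda i: y_pred[i])
    return [y_actual[i] for i in order]

actual = [0, 1, 0, 1, 1, 0]
pred = [0.1, 0.9, 0.3, 0.7, 0.8, 0.2]
seq = separation_sequence(actual, pred)
# Render as the plot's row of event/non-event stripes:
print("".join("#" if y else "." for y in seq))  # prints "...###"
```

Because only the ordering matters, the display is insensitive to any event/non-event threshold, which is the property the abstract emphasizes.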
Prediction Intervals for Class Probabilities
, 2007
Abstract
Prediction intervals for class probabilities are of interest in machine learning because they can quantify the uncertainty about the class probability estimate for a test instance. The idea is that all likely class probability values of the test instance are included, with a prespecified confidence level, in the calculated prediction interval. This thesis proposes a probabilistic model for calculating such prediction intervals. Given the unobservability of class probabilities, a Bayesian approach is employed to derive a complete distribution of the class probability of a test instance based on a set of class observations of training instances in the neighbourhood of the test instance. A random decision tree ensemble learning algorithm is also proposed, whose prediction output constitutes the neighbourhood that is used by the Bayesian model to produce a prediction interval (PI) for the test instance. The Bayesian model, which is used in conjunction with the ensemble learning algorithm and the standard nearest-neighbour classifier, is evaluated on artificial datasets and modified real datasets.
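The Bayesian step can be sketched in its simplest form: treat the class counts in the test instance's neighbourhood as binomial observations, put a Beta prior on the class probability, and read an interval off the Beta posterior. The uniform Beta(1,1) prior and Monte Carlo inversion below are illustrative assumptions, not necessarily the thesis's exact model:

```python
import random

def class_prob_interval(successes, failures, conf=0.95,
                        n_draws=100_000, seed=42):
    """Equal-tailed credible interval for a class probability given
    neighbourhood class counts, under a Beta(1,1) prior so the
    posterior is Beta(successes + 1, failures + 1). Quantiles are
    taken by Monte Carlo to keep the sketch stdlib-only."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(successes + 1, failures + 1)
                   for _ in range(n_draws))
    lo = draws[int((1 - conf) / 2 * n_draws)]
    hi = draws[int((1 + conf) / 2 * n_draws)]
    return lo, hi

# 8 of the 10 neighbourhood observations are in the positive class:
lo, hi = class_prob_interval(8, 2)
```

A wider interval signals a sparser or more mixed neighbourhood, which is exactly the per-instance uncertainty the thesis wants to expose alongside the point estimate.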
© 1997 Kluwer Academic Publishers. Statistical Themes and Lessons for Data Mining
, 1996
Abstract
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
Profit Maximizing Estimators And Medical Decision Rules
, 2001
Abstract
When outcomes of actions are random variables with... In this paper we present a method for simultaneous estimation and prediction that minimizes the costs of estimation and prediction errors using a common loss function. This nonparametric method is the frequentist analogue of Bayesian procedures using predictive densities. The technique outperforms the conventional two-step method in both small and large samples. Disease management typically involves the problem of joint estimation (diagnosis) and prediction (treatment). In an application to diagnosing hypo- and hyperglycemia in diabetic patients and choosing optimal insulin doses, it is shown that the average cost of treatment errors is lower with the approach presented in this paper.
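The key idea, letting the decision-maker's loss function drive the estimate directly rather than estimating first and plugging in, can be illustrated with asymmetric linear ("lin-lin") loss, where the optimal predictor is a quantile rather than the mean. This is a generic sketch of that principle, not the paper's estimator:

```python
def linlin_optimal_prediction(sample, cost_under, cost_over):
    """Prediction minimizing empirical lin-lin loss: cost_under per
    unit of under-prediction, cost_over per unit of over-prediction.
    The minimizer is the cost_under / (cost_under + cost_over) sample
    quantile, not the sample mean -- the loss enters estimation
    directly instead of via a separate plug-in step."""
    q = cost_under / (cost_under + cost_over)
    s = sorted(sample)
    idx = min(int(q * len(s)), len(s) - 1)
    return s[idx]

# If under-prediction (e.g. too little insulin) costs 3x as much as
# over-prediction, predict the 0.75 quantile of the outcomes:
pred = linlin_optimal_prediction([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 1)
```

Under symmetric costs this reduces to the median; skewing the costs shifts the decision rule exactly as the common-loss framework prescribes.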