Results 1–10 of 10
Bayesian Statistics
in WWW', Computing Science and Statistics, 1989
Abstract

Cited by 20 (0 self)
∗ Signatures are on file in the Graduate School. This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection, and the second one is a convex hull peeling depth approach to nonparametric massive multivariate data analysis. The second topic includes simulations and applications on massive astronomical data. First, we present a model selection criterion, minimizing the Kullback-Leibler distance by using the jackknife method. Various model selection methods have been developed to choose a model of minimum Kullback-Leibler distance to the true model, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), minimum description length (MDL), and the bootstrap information criterion. Likewise, the jackknife method chooses a model of minimum Kullback-Leibler distance through bias reduction. This bias, which is inevitable in model ...
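The jackknife bias-reduction idea in this abstract can be illustrated with a leave-one-out toy example (our construction, not the dissertation's estimator): for a Gaussian mean model, the in-sample mean log-likelihood at the MLE is optimistic, and refitting without each observation, then scoring it on the held-out point, removes that optimism.

```python
import math
import random

# A leave-one-out sketch in the spirit of jackknife bias reduction
# (our toy construction, not the dissertation's method).

def loglik(x, mu):
    # Log-density of N(mu, 1) at x.
    return -0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2

def criteria(data):
    n = len(data)
    mu_hat = sum(data) / n
    insample = sum(loglik(x, mu_hat) for x in data) / n
    loo = 0.0
    for i in range(n):
        rest = data[:i] + data[i + 1:]
        mu_i = sum(rest) / (n - 1)       # refit without observation i
        loo += loglik(data[i], mu_i)     # score the held-out observation
    return insample, loo / n

random.seed(0)
data = [random.gauss(0.5, 1.0) for _ in range(50)]
ins, loo = criteria(data)
print(ins >= loo)  # True: the in-sample fit is optimistic
```

The gap between the two numbers is exactly the kind of bias the abstract says the jackknife is used to reduce.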
A Large-Sample Model Selection Criterion Based on Kullback's Symmetric Divergence
Statistics & Probability Letters, 1999
Abstract

Cited by 11 (1 self)
The Akaike information criterion, AIC, is a widely known and extensively used tool for statistical model selection. AIC serves as an asymptotically unbiased estimator of a variant of Kullback's directed divergence between the true model and a fitted approximating model. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternate directed divergence may be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence. Since the symmetric divergence combines the information in two related though distinct measures, it functions as a gauge of model disparity which is arguably more sensitive than either of its individual components. With this motivation, we propose a model selection criterion which serves as an asymptotically unbiased estimator of a variant of the symmetric divergence between the true model and a fitted approximating model. We examine the performance of the criterion relative to other well-known criteria in a simulation study. Keywords: AIC, Akaike information criterion, I-divergence, J-divergence, Kullback-Leibler information, relative entropy. Correspondence: Joseph E. Cavanaugh, Department of Statistics, 222 Math Sciences Bldg., University of Missouri, Columbia, MO 65211. † This research was supported by NSF grant DMS-9704436.
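The asymmetry this abstract describes can be made concrete with two univariate Gaussians, for which the directed divergence has a closed form (our example, not the paper's estimator):

```python
import math

# Directed divergences between two univariate Gaussians and their sum,
# Kullback's symmetric (J) divergence. Our closed-form illustration.

def kl_gauss(mu0, s0, mu1, s1):
    # KL(N(mu0, s0^2) || N(mu1, s1^2)) in closed form.
    return (math.log(s1 / s0)
            + (s0 ** 2 + (mu0 - mu1) ** 2) / (2 * s1 ** 2)
            - 0.5)

d01 = kl_gauss(0.0, 1.0, 1.0, 2.0)  # one model taken as "true"
d10 = kl_gauss(1.0, 2.0, 0.0, 1.0)  # roles of the two models reversed
J = d01 + d10                       # symmetric divergence

print(d01 != d10)          # True: the directed divergence is asymmetric
print(J > max(d01, d10))   # True: J dominates each directed component
```

Because J pools both directed measures, it reacts to separation in either direction, which is the sensitivity argument the abstract makes.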
Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm
, 2000
Abstract

Cited by 10 (3 self)
We consider the model selection problem in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length. Various aims in selecting a VLMC can be formalized with different nonequivalent risks, such as final prediction error or expected Kullback-Leibler information. We consider the asymptotic behavior of different risk functions and show how they can be generally estimated with the same resampling strategy. Such estimated risks then yield new model selection criteria. In particular, we obtain a data-driven tuning of Rissanen's tree-structured context algorithm, which is a computationally feasible procedure for selection and estimation of a VLMC. Key words and phrases: bootstrap, zero-one loss, final prediction error, finite-memory source, FSMX model, Kullback-Leibler information, L2 loss, optimal tree pruning, resampling, tree model. Short title: Selecting variable length Mar...
A Bootstrap Variant of AIC for State-Space Model Selection
STATISTICA SINICA, 1997
Abstract

Cited by 9 (4 self)
Following the recent work of Hurvich and Tsai (1989, 1991, 1993) and Hurvich, Shumway, and Tsai (1990), we propose a corrected variant of AIC developed for the purpose of small-sample state-space model selection. Our variant of AIC utilizes bootstrapping in the state-space framework (Stoffer and Wall (1991)) to provide an estimate of the expected Kullback-Leibler discrepancy between the model generating the data and a fitted approximating model. We present simulation results which demonstrate that in small-sample settings, our criterion estimates the expected discrepancy with less bias than traditional AIC and certain other competitors. As a result, our AIC variant serves as an effective tool for selecting a model of appropriate dimension. We present an asymptotic justification for our criterion in the Appendix.
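The bootstrap idea can be sketched outside the state-space setting with a hedged toy example: estimate the optimism of a maximized Gaussian log-likelihood by refitting on resamples and re-scoring each refit on the original data. AIC approximates this optimism by the parameter count (2 here); the bootstrap estimates it directly.

```python
import math
import random

# Hedged toy sketch of the bootstrap idea (not the paper's state-space
# machinery): the average gap between a refit's score on its own resample
# and its score on the original data estimates the optimism of the
# maximized log-likelihood.

def fit_gauss(data):
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, var

def loglik(data, mu, var):
    n = len(data)
    rss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * var) - rss / (2 * var)

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(200)]

B = 500
optimism = 0.0
for _ in range(B):
    boot = [random.choice(data) for _ in data]
    mu_b, var_b = fit_gauss(boot)
    # Fit on the resample looks better than the same fit on the original.
    optimism += loglik(boot, mu_b, var_b) - loglik(data, mu_b, var_b)
optimism /= B

print(f"bootstrap optimism estimate: {optimism:.2f}")
```

On average this estimate sits near the number of fitted parameters, which is the quantity AIC's penalty hard-codes.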
In-Sample and Out-of-Sample Fit: Their Joint Distribution and Its Implications for Model Selection.” Unpublished manuscript, 2008
Abstract

Cited by 6 (0 self)
Prepared for the 5th ECB Workshop on Forecasting Techniques. We consider the case where a parameter, θ, is estimated by maximizing a criterion function, Q(X; θ). The estimate is then used to evaluate the criterion function with the same data, X, as well as with an independent data set, Y. The in-sample fit and out-of-sample fit, relative to that of θ₀, the “true” parameter, are given by T_{x,x} = ...
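The setup can be sketched as follows (the names Q, theta_hat, X, and Y are ours for illustration, not the paper's notation):

```python
import random

# Estimate a parameter by maximizing a criterion Q on data X, then
# evaluate Q at that estimate on X itself (in-sample fit) and on an
# independent set Y (out-of-sample fit). The criterion here is a
# negative squared-error loss, maximized at the sample mean.

def Q(data, theta):
    return -sum((x - theta) ** 2 for x in data) / len(data)

random.seed(2)
X = [random.gauss(1.0, 1.0) for _ in range(100)]  # estimation sample
Y = [random.gauss(1.0, 1.0) for _ in range(100)]  # independent sample

theta_hat = sum(X) / len(X)       # maximizes Q(X, theta) in closed form

in_sample = Q(X, theta_hat)       # criterion evaluated on the same data
out_of_sample = Q(Y, theta_hat)   # criterion evaluated on fresh data
print(in_sample >= Q(X, 0.0))     # True: theta_hat maximizes the in-sample criterion
```

The paper's object of study is the joint distribution of these two evaluations; the sketch only sets up the two quantities being compared.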
Tree-Structured GARCH Models
, 2000
Abstract

Cited by 4 (0 self)
We propose a new GARCH model with tree-structured multiple thresholds for volatility estimation in financial time series. The approach relies on the idea of a binary tree where every terminal node parameterizes a (local) GARCH model for a partition cell of the predictor space. Fitting of such trees is constructed within the likelihood framework for non-Gaussian observations: it is very different from the well-known CART procedure for regression, which is based on the residual sum of squares. Our strategy includes the classical GARCH model as a special case and allows model complexity to be increased in a systematic and flexible way. We derive a consistency result and conclude from simulations and real data analysis that the new method has better predictive potential in comparison with other approaches. Keywords: conditional variance; financial time series; GARCH model; maximum likelihood; threshold model; tree model; volatility. 1 Introduction We propose a new method for estimating volatility in...
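The (local) model each terminal node carries is the classical GARCH(1,1); as a minimal sketch, its variance recursion is (parameter values below are illustrative assumptions, not estimates from the paper):

```python
import random

# Classical GARCH(1,1) conditional-variance recursion:
#   h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1}

def garch11_variances(returns, omega, alpha, beta):
    h = [omega / (1.0 - alpha - beta)]   # start at the unconditional variance
    for r in returns[:-1]:
        h.append(omega + alpha * r ** 2 + beta * h[-1])
    return h

random.seed(6)
returns = [random.gauss(0.0, 0.01) for _ in range(100)]
h = garch11_variances(returns, omega=1e-6, alpha=0.05, beta=0.90)
print(len(h) == len(returns) and all(v > 0 for v in h))  # True
```

In the tree-structured version described above, which (omega, alpha, beta) applies at time t would depend on which partition cell of the predictor space the observation falls into.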
A Regression Model Selection Criterion Based on Bootstrap Bumping for Use With Resistant Fitting
, 2000
Abstract

Cited by 1 (0 self)
We propose a model selection criterion for regression applications where resistant fitting is appropriate. Our criterion gauges the adequacy of a fitted model based on the median squared error of prediction. The criterion is easily computed using the bootstrap "bumping" algorithm of Tibshirani and Knight (1999), which provides a convenient method for obtaining least median of squares model parameter estimates. We present an example to illustrate the merit of the criterion in instances where the underlying data set contains influential values. Additionally, we present and discuss the results of a simulation study which illustrates the effectiveness of the criterion under a wide range of error distributions.
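A hedged toy sketch of a median-squared-error criterion computed by bumping (a location model rather than the paper's least-median-of-squares regression): refit on bootstrap resamples and keep the candidate that minimizes the median squared residual on the original data.

```python
import random

# Bumping for a resistant criterion: cheap refits on resamples, each
# scored by the median squared error on the ORIGINAL data. Toy version,
# not the Tibshirani-Knight procedure itself.

def median(v):
    s = sorted(v)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

def med_sq_err(data, theta):
    # Median squared error of prediction for a location fit theta.
    return median([(x - theta) ** 2 for x in data])

random.seed(3)
# Clean data plus gross outliers that would distort a mean-based criterion.
data = [random.gauss(0.0, 1.0) for _ in range(40)] + [50.0, 60.0, 70.0]

mean_fit = sum(data) / len(data)          # the ordinary full-data fit
best_theta, best_crit = mean_fit, med_sq_err(data, mean_fit)
for _ in range(200):
    boot = [random.choice(data) for _ in data]
    theta = sum(boot) / len(boot)         # cheap refit on the resample
    crit = med_sq_err(data, theta)        # but score it on the original data
    if crit < best_crit:
        best_theta, best_crit = theta, crit

print(best_crit <= med_sq_err(data, mean_fit))  # True: bumping never does worse
```

Because the median ignores the largest residuals, resamples that happen to omit the outliers produce fits that score far better on the original data, which is why bumping pairs naturally with resistant criteria.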
MODEL BUILDING IN PROC PHREG WITH AUTOMATIC VARIABLE SELECTION AND INFORMATION CRITERIA
Abstract

Cited by 1 (0 self)
In our SUGI’29 presentation, we suggested that our strategy of model building in PROC LOGISTIC (see also our SUGI’26 and SUGI’28 papers) could work for PROC PHREG as well. Our suggestion was based on the close similarity between logistic and Cox regression, including information criteria and the stepwise, forward, backward, and score options. Here we elaborate on this suggestion. As in logistic regression, we propose an approach to model building for prediction in survival analysis based on the combination of stepwise regression, the Akaike information criterion, and best subset selection. As in the case of PROC LOGISTIC, the approach inherits some strong features of the three components mentioned above. In particular, the approach helps to avoid the agonizing process of choosing the “right” critical p-value in stepwise regression.
THE USE OF WAVELET PACKETS FOR EVENT DETECTION
Abstract

Cited by 1 (1 self)
In this paper, we propose a best basis selection method to choose a set of packets from a wavelet packet tree. Our goal is to obtain packets that show changes in both energy and frequency. The criterion adopted to choose the best basis is the Kullback-Leibler distance (KLD). When there is no event to be detected, the estimated KLD follows roughly an exponential distribution depending on only one parameter: the length of the windows partitioning the signal. When events are detected in a packet, the distribution of the estimated KLD deviates from the exponential distribution. The Kolmogorov-Smirnov statistic is used to measure the separation between the experimental and theoretical cumulative distributions in order to highlight the presence of ruptures, and then to select the most relevant packets.
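A toy version of this detection logic (our construction, under the abstract's exponential-null assumption): fit an exponential to the per-window statistics of a packet and measure the Kolmogorov-Smirnov distance between their empirical CDF and the fitted CDF; a large distance flags the packet.

```python
import math
import random

# KS distance between the empirical CDF of a packet's per-window
# statistics and a fitted exponential CDF. Large distance => the
# exponential null is violated, i.e. a candidate event.

def ks_to_exponential(samples):
    n = len(samples)
    rate = n / sum(samples)               # MLE of the exponential rate
    d = 0.0
    for i, x in enumerate(sorted(samples)):
        cdf = 1.0 - math.exp(-rate * x)   # fitted exponential CDF
        d = max(d, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return d

random.seed(5)
null_stats = [random.expovariate(2.0) for _ in range(500)]
event_stats = [random.gauss(5.0, 0.1) for _ in range(500)]  # an "event" packet

d_null = ks_to_exponential(null_stats)
d_event = ks_to_exponential(event_stats)
print(d_event > d_null)  # the event packet deviates far more from exponential
```

Ranking packets by this distance is one simple way to "select the most relevant packets" in the sense the abstract describes; note that comparing against critical values would need to account for the estimated rate parameter.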
Before the jury:
Abstract
To obtain the degree of Docteur from the Institut des Sciences et Industries du Vivant et de l’Environnement (Agro Paris Tech). Specialty: STATISTICS. Presented and publicly defended.