Results 11–20 of 154
A Large-Sample Model Selection Criterion Based on Kullback's Symmetric Divergence
 Statistics & Probability Letters
, 1999
Abstract

Cited by 11 (1 self)
The Akaike information criterion, AIC, is a widely known and extensively used tool for statistical model selection. AIC serves as an asymptotically unbiased estimator of a variant of Kullback's directed divergence between the true model and a fitted approximating model. The directed divergence is an asymmetric measure of separation between two statistical models, meaning that an alternate directed divergence may be obtained by reversing the roles of the two models in the definition of the measure. The sum of the two directed divergences is Kullback's symmetric divergence. Since the symmetric divergence combines the information in two related though distinct measures, it functions as a gauge of model disparity which is arguably more sensitive than either of its individual components. With this motivation, we propose a model selection criterion which serves as an asymptotically unbiased estimator of a variant of the symmetric divergence between the true model and a fitted approximating model. We examine the performance of the criterion relative to other well-known criteria in a simulation study.
Keywords: AIC, Akaike information criterion, I-divergence, J-divergence, Kullback-Leibler information, relative entropy.
Correspondence: Joseph E. Cavanaugh, Department of Statistics, 222 Math Sciences Bldg., University of Missouri, Columbia, MO 65211. This research was supported by NSF grant DMS-9704436.
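As a minimal illustration of the quantity this abstract builds on, Kullback's symmetric divergence between two univariate Gaussians can be computed as the sum of the two directed divergences (a sketch with illustrative function names, using the closed-form Gaussian KL divergence; this is not the paper's criterion itself):

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Directed Kullback-Leibler divergence KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return math.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5

def j_divergence(mu1, s1, mu2, s2):
    """Kullback's symmetric (J-) divergence: the sum of the two directed
    divergences, so it is invariant to swapping the two models."""
    return kl_gauss(mu1, s1, mu2, s2) + kl_gauss(mu2, s2, mu1, s1)
```

Unlike either directed component alone, the sum is symmetric in the two models, which is the sensitivity argument the abstract makes.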
Performance Prediction for Exponential Language Models
Abstract

Cited by 11 (3 self)
We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set cross-entropy for n-gram language models. We build models over varying domains, data set sizes, and n-gram orders, and perform linear regression to see whether we can model test set performance as a simple function of training set performance and various model statistics. Remarkably, we find a simple relationship that predicts test set performance with a correlation of 0.9997. We analyze why this relationship holds and show that it holds for other exponential language models as well, including class-based models and minimum discrimination information models. Finally, we discuss how this relationship can be applied to improve language model performance.
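The regression step described above can be sketched with an ordinary least-squares line fit. The numbers below are hypothetical (training, test) cross-entropy pairs, not the paper's data, and the paper's predictor also involves additional model statistics:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx

# hypothetical (training cross-entropy, test cross-entropy) pairs
train_h = [6.1, 6.5, 7.0, 7.8]
test_h = [6.4, 6.8, 7.3, 8.1]
slope, intercept = fit_line(train_h, test_h)
```

A fitted slope near 1 with a positive intercept would correspond to test entropy tracking training entropy plus a roughly constant overfitting gap.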
An inverse problem statistical methodology summary
 CRSC-TR08-01, NCSU, January 2008; Chapter 11 in Statistical Estimation Approaches in Epidemiology (edited by Gerardo Chowell, Mac Hyman, Nick Hengartner, Luis M. A. Bettencourt and Carlos Castillo-Chavez)
, 2009
Abstract

Cited by 11 (8 self)
We discuss statistical and computational aspects of inverse or parameter estimation problems based on Ordinary Least Squares and Generalized Least Squares with appropriate corresponding data noise assumptions of constant variance and non-constant variance (relative error), respectively. Among the topics included here are the mathematical model, the statistical model and data assumptions, and some techniques (residual plots, sensitivity analysis, model comparison tests) for verifying these. The ideas are illustrated throughout with the popular logistic growth model of Verhulst and Pearl as well as with a recently developed population-level model of pneumococcal disease spread.
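The contrast between the two noise assumptions can be sketched for a toy proportional model y ≈ a·x (a hypothetical stand-in for the chapter's logistic growth examples): OLS is appropriate under constant variance, while under relative error GLS down-weights large responses, iterating because the weights depend on the current fit:

```python
def ols_slope(xs, ys):
    """OLS estimate of a in y = a*x: constant-variance noise assumption."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def gls_slope(xs, ys, iters=5):
    """GLS estimate of a under relative error: weights 1 / (a*x)^2,
    re-solved a few times because the weights depend on the estimate."""
    a = ols_slope(xs, ys)
    for _ in range(iters):
        w = [1.0 / (a * x) ** 2 for x in xs]
        num = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        den = sum(wi * x * x for wi, x in zip(w, xs))
        a = num / den
    return a
```

With noisy data the two estimates differ, and residual plots (raw residuals for OLS, residuals scaled by the fitted values for GLS) are the chapter's suggested check on which assumption holds.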
DNA segmentation as a model selection process
 In International Conference on Research in Computational Molecular Biology (RECOMB)
Abstract

Cited by 10 (1 self)
Previous divide-and-conquer segmentation analyses of DNA sequences do not provide a satisfactory stopping criterion for the recursion. This paper proposes that segmentation be considered as a model selection process. Using the tools of model selection, a limit for the stopping criterion on the relaxed end can be determined. The Bayesian information criterion, in particular, provides a much more stringent stopping criterion than what is currently used. Such a stringent criterion can be used to delineate larger DNA domains. A relationship between the stopping criterion and the average domain size is empirically determined, which may aid in the determination of isochore borders.
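The proposed stopping rule can be sketched on a Bernoulli (0/1) sequence: recursively take the split that most increases the log-likelihood, and accept it only when twice the gain exceeds a BIC-style penalty (an illustrative sketch; the paper's sequence models and exact penalty accounting differ):

```python
import math

def bern_ll(seq):
    """Maximized Bernoulli log-likelihood of a 0/1 segment."""
    n, k = len(seq), sum(seq)
    if k == 0 or k == n:
        return 0.0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)

def segment(seq, lo=0, hi=None, out=None):
    """Divide-and-conquer segmentation with a BIC-style stopping criterion."""
    if hi is None:
        hi = len(seq)
    if out is None:
        out = []
    whole = bern_ll(seq[lo:hi])
    best_gain, best_cut = 0.0, None
    for cut in range(lo + 1, hi):
        gain = bern_ll(seq[lo:cut]) + bern_ll(seq[cut:hi]) - whole
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    # each accepted split adds ~2 parameters (boundary + new proportion)
    penalty = 2 * math.log(len(seq))
    if best_cut is not None and 2 * best_gain > penalty:
        segment(seq, lo, best_cut, out)
        segment(seq, best_cut, hi, out)
    else:
        out.append((lo, hi))
    return out
```

A homogeneous alternating sequence stays one segment under this rule, while a sequence with a genuine composition change is cut at the change point.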
Automatic ARIMA Time Series Modeling and Forecasting for Adaptive Input/Output Prefetching
, 2002
Abstract

Cited by 10 (0 self)
This thesis presents a comprehensive software framework, Automodeler, to provide automatic modeling and forecasting of input/output (I/O) request interarrival times. In Automodeler, ARIMA models of interarrival times are automatically identified and built during application execution. Model parameters are recursively estimated in real time for every new request arrival, adapting to changes that are intrinsic or external to the running application. Online forecasts are subsequently generated based on the updated parameters.
Minimum Message Length Autoregressive Model Order Selection
 International Conference on Intelligent Sensing and Information Processing (ICISIP)
, 2004
Abstract

Cited by 10 (9 self)
We derive a Minimum Message Length (MML) estimator for stationary and non-stationary autoregressive models using the Wallace and Freeman (1987) approximation. The MML estimator's model selection performance is empirically compared with AIC, AICc, BIC and HQ in a Monte Carlo experiment by uniformly sampling from the autoregressive stationarity region. Generally applicable, uniform priors are used on the coefficients, model order and log σ² for the MML estimator. The experimental results show the MML estimator to have the best overall average mean squared prediction error and the best ability to choose the true model order.
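For context, the four competitors mentioned can be written as the Gaussian log-likelihood term n·log σ̂² plus a criterion-specific penalty (standard textbook forms, which may differ from the paper's exact definitions by additive constants):

```python
import math

def criteria(n, k, sigma2):
    """AIC, AICc, BIC and HQ for a model with k parameters, n observations
    and residual variance estimate sigma2 (Gaussian likelihood form)."""
    base = n * math.log(sigma2)
    aic = base + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample correction
    bic = base + k * math.log(n)
    hq = base + 2 * k * math.log(math.log(n))
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "HQ": hq}
```

At n = 100 the per-parameter penalties order as AIC < HQ < BIC, which is why BIC and HQ tend to select more parsimonious orders than AIC in larger samples.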
Combining Time Series Models for Forecasting
, 2002
Abstract

Cited by 10 (0 self)
Statistical models (e.g., ARIMA models) have been commonly used in time series data analysis and forecasting. Typically, one model is selected based on a selection criterion (e.g., AIC), hypothesis testing, and/or graphical inspections. The selected model is then used to forecast future values. However, model selection is often unstable and may cause an unnecessarily high variability in the final estimation/prediction. In this work, we propose the use of an algorithm, AFTER, to convexly combine the models for a better performance of prediction. The weights are sequentially updated after each additional observation. Simulations and real data examples are used to compare the performance of our approach with model selection methods. The results show the advantage of combining by AFTER over selection in terms of forecasting accuracy in several settings.
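The sequential weighting idea can be sketched as follows: after each observation, every candidate forecaster's weight is refreshed in proportion to its exponentiated cumulative squared error, and the next combined forecast is the convex combination (an illustrative exponential-weighting rule; the paper's AFTER weights are likelihood-based and may differ in form):

```python
import math

def after_combine(forecast_streams, observations, tau=1.0):
    """Convexly combine competing forecasters, updating the weights
    after each observation from cumulative squared error."""
    m = len(forecast_streams)
    weights = [1.0 / m] * m
    cum_err = [0.0] * m
    combined = []
    for t, y in enumerate(observations):
        combined.append(sum(w * f[t] for w, f in zip(weights, forecast_streams)))
        for j, f in enumerate(forecast_streams):
            cum_err[j] += (y - f[t]) ** 2
        raw = [math.exp(-e / (2 * tau)) for e in cum_err]
        z = sum(raw)
        weights = [r / z for r in raw]
    return combined, weights
```

Because the weights always sum to one, the combined forecast never leaves the convex hull of the candidates, which is the stability property motivating combining over selection.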
Bivariate Tensor-product B-Splines in a Partly Linear Model
, 1996
Abstract

Cited by 10 (3 self)
In some applications, the mean or median response is linearly related to some variables but the relation to additional variables is not easily parameterized. Partly linear models arise naturally in such circumstances. Suppose that a random sample {(T_i, X_i, Y_i), i = 1, 2, ..., n} is modeled by Y_i = X_i^T β_0 + g_0(T_i) + error_i, where Y_i is a real-valued response, X_i ∈ R^p and T_i ranges over a unit square, and g_0 is an unknown function with a certain degree of smoothness. We make use of bivariate tensor-product B-splines as an approximation of the function g_0 and consider M-type regression splines by minimization of Σ_{i=1}^n ρ(Y_i − X_i^T β − g_n(T_i)) for some convex function ρ. Mean, median and quantile regressions are included in this class. We show under appropriate conditions that the parameter estimate of β achieves its information bound asymptotically and the function estimate of g_0 attains the optimal rate of convergen...
Penalized loss functions for Bayesian model comparison
Abstract

Cited by 10 (0 self)
The deviance information criterion (DIC) is widely used for Bayesian model comparison, despite the lack of a clear theoretical foundation. DIC is shown to be an approximation to a penalized loss function based on the deviance, with a penalty derived from a cross-validation argument. This approximation is valid only when the effective number of parameters in the model is much smaller than the number of independent observations. In disease mapping, a typical application of DIC, this assumption does not hold and DIC under-penalizes more complex models. Another deviance-based loss function, derived from the same decision-theoretic framework, is applied to mixture models, which have previously been considered an unsuitable application for DIC.
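The penalty structure under discussion can be sketched from Monte Carlo output: DIC is the posterior mean deviance plus the effective number of parameters pD, where pD is the mean deviance minus the deviance at the posterior mean (the standard Spiegelhalter et al. construction is assumed here):

```python
def dic(deviance_samples, deviance_at_mean):
    """DIC from posterior deviance draws: DIC = dbar + pD, with the
    effective parameter count pD = dbar - D(posterior mean).
    Returns (dic_value, p_d)."""
    dbar = sum(deviance_samples) / len(deviance_samples)
    p_d = dbar - deviance_at_mean
    return dbar + p_d, p_d
```

When pD approaches the number of independent observations, as in the disease-mapping setting above, this penalty is too small and DIC favors over-complex models.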
A Bootstrap Variant of AIC for State-Space Model Selection
 STATISTICA SINICA
, 1997
Abstract

Cited by 9 (4 self)
Following the recent work of Hurvich and Tsai (1989, 1991, 1993) and Hurvich, Shumway, and Tsai (1990), we propose a corrected variant of AIC developed for the purpose of small-sample state-space model selection. Our variant of AIC utilizes bootstrapping in the state-space framework (Stoffer and Wall (1991)) to provide an estimate of the expected Kullback-Leibler discrepancy between the model generating the data and a fitted approximating model. We present simulation results which demonstrate that in small-sample settings, our criterion estimates the expected discrepancy with less bias than traditional AIC and certain other competitors. As a result, our AIC variant serves as an effective tool for selecting a model of appropriate dimension. We present an asymptotic justification for our criterion in the Appendix.
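The quantity being bootstrapped can be sketched in a toy setting, a Gaussian mean with known unit variance, where each resample plays the role of a training set and the original data the role of fresh data; the average optimism then approximates the bias that plain AIC replaces with the parameter count k (an illustrative sketch, not the state-space algorithm of Stoffer and Wall):

```python
import random

def loglik(data, mu):
    """Gaussian log-likelihood with known unit variance (constants dropped)."""
    return -0.5 * sum((x - mu) ** 2 for x in data)

def bootstrap_penalty(data, n_boot=500, seed=0):
    """Bootstrap estimate of the optimism of the maximized log-likelihood:
    a mean fitted to a resample scores better on that resample than on
    the original data, and the average gap estimates the AIC penalty."""
    rng = random.Random(seed)
    n = len(data)
    total = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in range(n)]
        mu_b = sum(boot) / n
        total += loglik(boot, mu_b) - loglik(data, mu_b)
    return total / n_boot
```

For this one-parameter model the estimate should land near k = 1, the penalty AIC itself would charge.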