Results 1–10 of 26
Model selection by resampling penalization
, 2007
Cited by 20 (11 self)
We present a new family of model selection algorithms based on the resampling heuristics. These algorithms can be used in several frameworks, require no knowledge about the unknown law of the data, and may be seen as a generalization of local Rademacher complexities and V-fold cross-validation. In the case example of least-squares regression on histograms, we prove oracle inequalities, and that these algorithms are naturally adaptive to both the smoothness of the regression function and the variability of the noise level. Then, interpreting V-fold cross-validation in terms of penalization, we shed light on the question of choosing V. Finally, a simulation study illustrates the strength of resampling penalization algorithms against some classical ones, in particular with heteroscedastic data.
Nonparametric inference for additive models
 J. Amer. Statist. Assoc
, 2005
Cited by 14 (1 self)
Additive models with backfitting algorithms are popular multivariate nonparametric fitting techniques. However, inference for these models has not been well developed, due partially to the complexity of the backfitting estimators. Few tools are available to answer some important and frequently asked questions, such as whether a specific additive component is significant or admits a certain parametric form. In an attempt to address these issues, we extend the generalized likelihood ratio (GLR) tests to additive models, using the backfitting estimator. We demonstrate that under the null models, the newly proposed GLR statistics follow asymptotically rescaled chi-squared distributions, with the scaling constants and the degrees of freedom independent of the nuisance parameters. This demonstrates that the Wilks phenomenon continues to hold under a variety of smoothing techniques and more relaxed models with unspecified error distributions. We further prove that the GLR tests are asymptotically optimal in terms of rates of convergence for nonparametric hypothesis testing. In addition, for testing a parametric additive model, we propose a bias-corrected method to improve the performance of the GLR. The bias-corrected test is shown to share the Wilks-type property. Simulations are conducted to demonstrate the Wilks phenomenon and the power of the proposed tests. A real example is used to illustrate the performance of the testing approach.
Exact Likelihood Ratio Tests for Penalized Splines (under revision, available at www.orie.cornell.edu/~davidr/papers)
 Biometrika
, 2003
Cited by 11 (2 self)
Penalized spline-based additive models allow a simple mixed model representation where the variance components control departures from linear models. The smoothing parameter is the ratio between the random-coefficient and error variances, and tests for linear regression reduce to tests for zero random-coefficient variances. We propose exact likelihood and restricted likelihood ratio tests ((R)LRTs) for testing polynomial regression versus a general alternative modeled by penalized splines. Their spectral decompositions are used as the basis of fast simulation algorithms. We derive the asymptotic local power properties of (R)LRTs under weak conditions. In particular, we characterize the local alternatives that are detected with asymptotic probability 1. Confidence intervals for the smoothing parameter are obtained by inverting the (R)LRT for a fixed smoothing parameter versus a general alternative. We discuss F and R tests and show that ignoring the variability in the smoothing parameter estimator can have a dramatic effect on their null distributions. The power of several known tests is investigated and a small set of tests with good power properties is identified. Some key words: linear mixed models, penalized splines, smoothing, zero variance components.
Goodness of fit via nonparametric likelihood ratios
 Scandinavian Journal of Statistics
, 2004
Cited by 9 (4 self)
To test whether a density f is equal to a specified f0, one knows by the Neyman–Pearson lemma the form of the optimal test at a specified alternative f1. Any nonparametric density estimation scheme allows an estimate of f. This leads to estimated likelihood ratios. Properties are studied of tests which, for the density estimation ingredient, use log-linear expansions. Such expansions are either coupled with subset selectors like the AIC and BIC regimes, or use an order growing with sample size. Our tests are generalised to testing adequacy of general parametric models, and work also in higher dimensions. The tests are related to but different from the 'smooth tests' which go back to Neyman (1937) and which have been studied extensively in the recent literature. Our tests are large-sample equivalent to such smooth tests under local alternative conditions, but different and often better under nonlocal conditions.
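For intuition, the approach described above can be sketched in the simplest setting: testing H0: f = Uniform(0, 1) with a log-linear expansion in shifted Legendre polynomials, the expansion order chosen by AIC. This is a rough illustration under our own naming and numerical choices (grid quadrature, plain Newton fitting), not the paper's exact procedure:

```python
import numpy as np
from numpy.polynomial import legendre

def _basis(t, m):
    """Shifted Legendre polynomials psi_1..psi_m on [0, 1], shape (m, len(t))."""
    u = 2.0 * np.asarray(t) - 1.0
    return np.stack([legendre.legval(u, [0.0] * j + [1.0]) for j in range(1, m + 1)])

def _fit_loglinear(x, m, grid):
    """Maximum likelihood for f(t) = exp(theta . psi(t)) / c(theta); returns log-likelihood
    relative to the uniform density (which corresponds to theta = 0)."""
    B, G = _basis(x, m), _basis(grid, m)
    theta = np.zeros(m)
    for _ in range(100):                          # Newton iterations (concave objective)
        w = np.exp(theta @ G)
        w /= w.sum()                              # discretized model density on the grid
        mu = G @ w                                # model mean of the basis
        cov = (G * w) @ G.T - np.outer(mu, mu)    # model covariance of the basis
        step = np.linalg.solve(cov + 1e-10 * np.eye(m), B.mean(axis=1) - mu)
        theta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    log_c = np.log(np.mean(np.exp(theta @ G)))    # normalizer via grid quadrature
    return float((theta @ B).sum() - len(x) * log_c)

def nlr_test_uniform(x, max_order=4):
    """2 * log estimated-likelihood-ratio for H0: Uniform(0, 1), expansion order by AIC.
    Order 0 is the null itself (log-likelihood ratio 0)."""
    grid = np.linspace(0.0, 1.0, 2001)
    logliks = [0.0] + [_fit_loglinear(x, m, grid) for m in range(1, max_order + 1)]
    m_hat = int(np.argmax([ll - m for m, ll in enumerate(logliks)]))
    return 2.0 * logliks[m_hat], m_hat
```

Data drawn from a non-uniform density (e.g. Beta(2, 2)) should yield a markedly larger statistic than uniform data, with AIC selecting a positive order.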
Model selection in electromagnetic source analysis with an application to VEF’s
 IEEE Transactions on Biomedical Engineering
, 2002
Cited by 7 (4 self)
In electromagnetic source analysis it is necessary to determine how many sources are required to describe the EEG or MEG adequately. Model selection procedures (MSPs, or goodness-of-fit procedures) give an estimate of the required number of sources. Existing and new MSPs are evaluated in different source and noise settings: two sources which are close or distant, and noise which is uncorrelated or correlated. The commonly used MSP, residual variance, is seen to be ineffective; that is, it often selects too many sources. Alternatives like the adjusted Hotelling's test, the Bayes information criterion, and the Wald test on source amplitudes are seen to be effective. The adjusted Hotelling's test is recommended if a conservative approach is taken, and MSPs such as the Bayes information criterion or the Wald test on source amplitudes are recommended if a more liberal approach is desirable. The MSPs are applied to empirical data (visual evoked fields).
Data-driven rate-optimal specification testing in regression models
, 2003
Cited by 4 (1 self)
We propose new data-driven smooth tests for a parametric regression function. The smoothing parameter is selected through a new criterion that favors a large smoothing parameter under the null hypothesis. The resulting test is adaptive rate-optimal and consistent against Pitman local alternatives approaching the parametric model at a rate arbitrarily close to 1/√n. Asymptotic critical values come from the standard normal distribution, and the bootstrap can be used in small samples. A general formalization allows one to consider a large class of linear smoothing methods, which can be tailored for detection of additive alternatives. Consider n observations (Yi, Xi) in R × R^p and the heteroscedastic regression model with unknown mean m(·) and variance σ²(·): Yi = m(Xi) + εi, with E[εi | Xi] = 0 and Var[εi | Xi] = σ²(Xi). We want to test that the regression function belongs to some parametric family {µ(·; θ); θ ∈ Θ}.
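A classical, non-adaptive member of this family of specification tests is the kernel-weighted sum of residual cross-products in the style of Zheng (1996): under a correct parametric fit the residuals carry no structure in x, so the statistic fluctuates around zero, while lack of fit pushes it upward. A minimal sketch with a Gaussian kernel and a fixed bandwidth (names and bandwidth are our choices; the data-driven smoothing-parameter selection of the paper is not reproduced):

```python
import numpy as np

def kernel_spec_statistic(x, resid, h=0.1):
    """Kernel-weighted sum of residual cross-products (Zheng-type statistic).

    Systematic departures from the parametric null make residuals at nearby
    x values co-vary, which inflates the statistic."""
    n = len(x)
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)  # Gaussian kernel weights
    np.fill_diagonal(K, 0.0)                                 # drop i == j terms
    return float(resid @ K @ resid) / (n * (n - 1) * h)

def ols_residuals(x, y):
    """Residuals from fitting the parametric null model y = a + b*x by least squares."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta
```

Fitting a linear null model to truly quadratic data produces a much larger statistic than fitting it to truly linear data, which is the basic behavior the adaptive tests refine.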
V-FOLD CROSS-VALIDATION IMPROVED: V-FOLD PENALIZATION
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2008
Cited by 4 (2 self)
We study the efficiency of V-fold cross-validation (VFCV) for model selection from the non-asymptotic viewpoint, and suggest an improvement on it, which we call "V-fold penalization". Considering a particular (though simple) regression problem, we prove that VFCV with a bounded V is suboptimal for model selection, because it "over-penalizes", increasingly so as V grows. Hence, asymptotic optimality requires V to go to infinity. However, when the signal-to-noise ratio is low, it appears that over-penalizing is necessary, so that the optimal V is not always the largest one, despite the variability issue. This is confirmed by some simulated data. In order to improve on the prediction performance of VFCV, we define a new model selection procedure, called "V-fold penalization" (penVF). It is a V-fold subsampling version of Efron's bootstrap penalties, so it has the same computational cost as VFCV while being more flexible. In a heteroscedastic regression framework, assuming the models to have a particular structure, we prove that penVF satisfies a non-asymptotic oracle inequality with a leading constant that tends to 1 as the sample size goes to infinity. In particular, this implies adaptivity to the smoothness of the regression function, even with highly heteroscedastic noise. Moreover, it is easy to over-penalize with penVF, independently of the parameter V. A simulation study shows that this results in a significant improvement on VFCV in non-asymptotic situations.
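The baseline procedure that penVF improves on, plain VFCV for model selection, can be sketched in a few lines. The following illustration uses the least-squares-regression-on-histograms setting mentioned in these abstracts; it is a sketch of ordinary VFCV only, not of V-fold penalization, and all function names are ours:

```python
import numpy as np

def make_histogram_fit(D):
    """Least-squares regression on a histogram (regressogram) with D bins over [0, 1]."""
    def fit(x_tr, y_tr):
        edges = np.linspace(0.0, 1.0, D + 1)
        bins = np.clip(np.digitize(x_tr, edges) - 1, 0, D - 1)
        means = np.array([y_tr[bins == b].mean() if np.any(bins == b) else 0.0
                          for b in range(D)])
        def predict(x_new):
            b = np.clip(np.digitize(x_new, edges) - 1, 0, D - 1)
            return means[b]
        return predict
    return fit

def vfold_cv_select(x, y, fitters, V=5, seed=0):
    """Return the index of the fitter with the smallest V-fold cross-validated risk."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), V)
    risk = np.zeros(len(fitters))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        for m, fit in enumerate(fitters):
            pred = fit(x[train], y[train])
            risk[m] += np.mean((y[held_out] - pred(x[held_out])) ** 2)
    return int(np.argmin(risk))
```

For example, `vfold_cv_select(x, y, [make_histogram_fit(D) for D in (2, 8, 32, 128)])` picks the bin count that balances bias against variance on the held-out folds; penVF replaces the held-out risk with a resampling-based penalty at the same computational cost.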
Moderate Deviations of Minimum Contrast Estimators under Contamination
Cited by 3 (1 self)
Since statistical models are simplifications of reality, it is important in estimation theory to study the behavior of estimators also under distributions (slightly) different from the proposed model. In testing theory, when dealing with test statistics in which nuisance parameters are estimated, knowledge of the behavior of the estimators of the nuisance parameters under alternatives is needed to evaluate the power. In this paper the moderate deviation behavior of minimum contrast estimators is investigated not only under the supposed model, but also under distributions close to the model. A particular example is the (multivariate) maximum likelihood estimator determined within the proposed model. The setup is quite general, including for instance discrete distributions. The rate of convergence under alternatives is determined both when comparing the minimum contrast estimator with a "natural" parameter in the parameter space and when comparing it with the proposed "true" value in the parameter space. It turns out that under the model the asymptotic optimality of the maximum likelihood estimator in the local sense continues to hold in the moderate deviation area.
Lack-of-fit tests in semiparametric mixed models. Available at www.econ.kuleuven.be/fetew/pdf publicaties/KBI 0709.pdf
, 2007
Cited by 2 (0 self)
In this paper we obtain the asymptotic distribution of restricted likelihood ratio tests in mixed linear models with a fixed and finite number of random effects. We explain why for such models the often-quoted 50:50 mixture of a chi-squared random variable with one degree of freedom and a point-mass at zero does not hold. Our motivation is a study of the use of wavelets for lack-of-fit testing within a mixed model framework. Even though wavelets have received a lot of attention over the last 15 years or so for the estimation of piecewise smooth functions, much less is known about their ability to check the adequacy of a parametric model when fitting observed data. In particular we study the power of wavelets for testing a hypothesized parametric model within a mixed model framework. Experimental results show that in several situations the wavelet-based test significantly outperforms the competitor based on penalized regression splines. The results obtained are also applicable to testing in mixed models in general, and shed some new insight on previous results.
Rate-optimal data-driven specification testing in regression models
, 2001
Cited by 1 (1 self)
We propose a general procedure for testing that a regression function has a prescribed parametric form. We allow for multivariate regressors, non-normal errors and heteroscedasticity of unknown form. The test relies upon a nonparametric linear estimation method, such as a sieve expansion or the kernel method. The choice of the smoothing parameter is data-driven. Under the null hypothesis, the asymptotic distribution of the test statistic is the standard normal distribution. Use of bootstrap critical values is formally justified. The test is shown to be adaptive and rate-optimal in the minimax sense. Detection of Pitman-type local alternatives is also studied.