Results 1 - 8 of 8
Higher criticism for detecting sparse heterogeneous mixtures
- Ann. Statist., 2004
Cited by 51 (10 self)
Higher Criticism, or second-level significance testing, is a multiple comparisons concept mentioned in passing by Tukey (1976). It concerns a situation where there are many independent tests of significance and one is interested in rejecting the joint null hypothesis. Tukey suggested comparing the fraction of observed significances at a given α-level to the expected fraction under the joint null; in fact, he suggested standardizing the difference of the two quantities to form a z-score. The resulting z-score tests the significance of the body of significance tests. We consider a generalization, where we maximize this z-score over a range of significance levels 0 < α ≤ α0. We are able to show that the resulting Higher Criticism statistic is effective at resolving a very subtle testing problem: testing whether n normal means are all zero versus the alternative that a small fraction is nonzero. The subtlety of this ‘sparse normal means’ testing problem can be seen from work of Ingster (1999) and Jin (2002), who studied such problems in great detail. In their studies, they identified an interesting range of cases where the small fraction of nonzero means is so
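The maximized z-score described in this abstract can be sketched in a few lines. The version below is a minimal illustration, assuming the commonly used order-statistic form in which the z-score is evaluated at each sorted p-value α = p_(i) not exceeding α0; the function name `higher_criticism` and the default `alpha0=0.5` are hypothetical choices, not taken from the abstract.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Largest standardized excess of significances over levels alpha <= alpha0.

    At level alpha = p_(i), the observed fraction of significances is i/n,
    the expected fraction under the joint null is alpha, and the standard
    deviation of the observed fraction is sqrt(alpha * (1 - alpha) / n).
    Returns -inf when no usable p-value lies in (0, alpha0].
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    best = -math.inf
    for i, p in enumerate(sorted(p_values), start=1):
        if p > alpha0:
            break  # only levels up to alpha0 are searched
        if p <= 0.0 or p >= 1.0:
            continue  # z-score undefined at the endpoints
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


if __name__ == "__main__":
    n = 100
    evenly_spread = [(i - 0.5) / n for i in range(1, n + 1)]
    with_signal = [1e-6] * 5 + evenly_spread[5:]
    print(higher_criticism(evenly_spread))  # small: no excess of tiny p-values
    print(higher_criticism(with_signal))    # large: five very small p-values
```

A few very small p-values among many unremarkable ones drive the statistic up sharply, which is the sparse-mixture situation the abstract has in mind.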
The Likelihood Ratio Test for Homogeneity in the Finite Mixture Models
- 2001
Cited by 11 (1 self)
The authors study the asymptotic behaviour of the likelihood ratio statistic for testing homogeneity in the finite mixture models of a general parametric distribution family. They prove that the limiting distribution of this statistic is the squared supremum of a truncated standard Gaussian process. The autocorrelation function of the Gaussian process is explicitly presented. A re-sampling procedure is recommended to obtain the asymptotic p-value. Three kernel functions, normal, binomial and Poisson, are used in a simulation study which illustrates the procedure.
Variational Bayes Solution of Linear Neural Networks and its Generalization Performance
- Neural Computation, 2005
Cited by 4 (2 self)
It is well known that, in unidentifiable models, Bayes estimation provides much better generalization performance than maximum likelihood (ML) estimation. However, its accurate approximation by Markov chain Monte Carlo methods incurs huge computational costs. As an alternative, a tractable approximation method, called the variational Bayes (VB) approach, has recently been proposed and has been attracting attention. Its advantage over the expectation maximization (EM) algorithm, often used for realizing ML estimation, has been shown experimentally in many applications but has not yet been shown theoretically. In this paper, through the analysis of the simplest unidentifiable models, we theoretically show some properties of the VB approach. We first prove that, in three-layer linear neural networks, the VB approach is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation. Then, we theoretically clarify its free energy, generalization error, and training error. Comparing them with those of ML estimation and of Bayes estimation, we discuss the advantage of the VB approach. We also show that, unlike in Bayes estimation, the free energy and the generalization error are less simply related to each other, and that, in typical cases, the VB free energy approximates the Bayes one well, while the VB generalization error differs significantly from the Bayes one.
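For readers unfamiliar with the shrinkage rule mentioned here, the following is a minimal sketch of the classical positive-part James-Stein estimator for a d-dimensional normal mean with known noise variance. It is not the VB estimator itself, whose exact shrinkage form the abstract does not give; the function name and the `noise_var` parameter are illustrative assumptions.

```python
def positive_part_james_stein(x, noise_var=1.0):
    """Classical positive-part James-Stein estimate of a normal mean vector.

    The observation x is scaled by max(0, 1 - (d - 2) * noise_var / ||x||^2),
    so observations with a small norm are shrunk all the way to zero.
    """
    d = len(x)
    if d < 3:
        raise ValueError("the James-Stein rule needs dimension d >= 3")
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0.0:
        return [0.0] * d
    factor = max(0.0, 1.0 - (d - 2) * noise_var / norm_sq)
    return [factor * v for v in x]


if __name__ == "__main__":
    print(positive_part_james_stein([3.0, 4.0, 0.0]))  # mild shrinkage
    print(positive_part_james_stein([0.1, 0.1, 0.1]))  # shrunk to zero
```

The "positive-part" truncation is what prevents the shrinkage factor from turning negative and flipping the sign of the estimate.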
Tailor-Made Tests for Goodness of Fit to Semiparametric Hypotheses
Cited by 1 (0 self)
We introduce a new framework for constructing tests of general semiparametric hypotheses which have nontrivial power on the n^(−1/2) scale in every direction, and can be tailored to put substantial power on alternatives of importance. The approach is based on combining test statistics based on stochastic processes of score statistics with bootstrap critical values.
Constrained Nonparametric Maximum Likelihood Estimation for Mixture Models
- Journal of Statistics, 1997
Cited by 1 (0 self)
A nonparametric mixture model specifies that observations arise from a mixture distribution, ∫ f(x; θ) dG(θ), where the mixing distribution, G, is completely unspecified. A number of algorithms have been developed to obtain unconstrained maximum likelihood estimates of G (see, for instance, Laird 1978, Bohning 1985, and Lesperance and Kalbfleisch 1992), but none of these algorithms lead to estimates when functional constraints are present. In many cases, there is a natural interest in functionals φ(G), such as the mean and variance, of the mixing distribution, and profile likelihoods and confidence intervals for φ(G) are desired. In this paper we develop a penalized generalization of the ISDM algorithm of Kalbfleisch and Lesperance (1992) that can be used to solve the problem of constrained estimation. We also discuss its usage in various applications. Convergence results and numerical examples are given for the generalized ISDM algorithm, and asymptotic results are dev...
Testing Homogeneity in a Mixture Distribution via the L² Distance Between Competing Models
- Journal of the American Statistical Association, 2004
Ascertaining the number of components in a mixture distribution is an interesting and challenging problem for statisticians. Chen, Chen, and Kalbfleisch (2001) recently proposed a modified likelihood ratio test (MLRT), which is distribution-free and locally most powerful, asymptotically. In this paper we present a new method for testing whether a finite mixture distribution is homogeneous. Our method, the D-test, is based on the L² distance between a fitted homogeneous model and a fitted heterogeneous model. For mixture components from standard distributions, our D-test statistic has closed-form expressions in terms of parameter estimates, whereas likelihood ratio-type test statistics do not. Thus, our test has potential for data mining applications. The convergence rate of the D-test statistic under a null hypothesis of homogeneity is established. The D-test is shown to be competitive with the MLRT when the mixture components are normal. The MLRT performs better for small sample sizes when the mixture components are exponential, but in this case there is little visual separation and, hence, little L² separation between the homogeneous and heterogeneous models. Thus, we propose that the measure underlying the L² distance be modified according to a suitable weight function, which is equivalent to transforming the data before applying the D-test. Such a modification produces a generalized D-test that is competitive in the aforementioned case. After applying our method to a data set in which the observations are measurements of firms' financial performances, we conclude with discussion and remarks.
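To illustrate why an L² distance between fitted models can have a closed form for normal components, here is a minimal sketch. It relies on the standard identity that the integral of the product of two normal densities N(x; m1, v1) and N(x; m2, v2) equals the N(0, v1 + v2) density evaluated at m1 − m2. It computes only the squared L² distance between a single normal and a normal mixture with given parameters; the paper's actual D-test statistic, its normalization, and its weighting are not specified in the abstract, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gauss_overlap(m1, v1, m2, v2):
    """Integral over x of N(x; m1, v1) * N(x; m2, v2), in closed form."""
    return normal_pdf(m1 - m2, 0.0, v1 + v2)


def l2_distance_sq(homog, mix):
    """Squared L2 distance between one normal and a finite normal mixture.

    homog: (mean, var) of the homogeneous model.
    mix:   list of (weight, mean, var) for the heterogeneous model.
    Expands the integral of (f - g)^2 as <f,f> - 2<f,g> + <g,g>.
    """
    m0, v0 = homog
    ff = gauss_overlap(m0, v0, m0, v0)
    fg = sum(w * gauss_overlap(m0, v0, m, v) for w, m, v in mix)
    gg = sum(
        wa * wb * gauss_overlap(ma, va, mb, vb)
        for wa, ma, va in mix
        for wb, mb, vb in mix
    )
    return ff - 2.0 * fg + gg


if __name__ == "__main__":
    # Distance grows as the two mixture components move apart.
    for shift in (0.0, 0.5, 1.0, 2.0):
        mix = [(0.5, -shift, 1.0), (0.5, shift, 1.0)]
        print(shift, l2_distance_sq((0.0, 1.0), mix))
```

Because every term depends only on means, variances, and weights, plugging in parameter estimates gives a value without numerical integration.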
(Title not recoverable from the source)
- 2003
3.9 A likelihood ratio test for nested composite hypotheses: Wilks’s theorem. Let Θ be a d-dimensional parameter space, specifically, an open set in R^d. Let H0 be a k-dimensional subset of Θ, in a sense to be made more precise below, for some k < d. For example, H0 could be the intersection with Θ of a k-dimensional flat hyperplane. Let {Pθ, θ ∈ Θ} be an equivalent family of laws on a sample space (X, B) with a likelihood function f(θ, x) > 0 for all θ ∈ Θ and x ∈ X. Assume that observations X1,...,Xn are i.i.d. Pθ for some θ ∈ Θ. We want to test the hypothesis that θ ∈ H0. S. S. Wilks proposed the following test: let L(θ, x) := log f(θ, x) be the log likelihood. For n observations, let the maximum log likelihoods over Θ and H0 be respectively MLL_d := sup_{θ∈Θ} Σ_{j=1}^{n} L(θ, Xj) and MLL_k := sup_{θ∈H0} Σ_{j=1}^{n} L(θ, Xj). Let W := 2(MLL_d − MLL_k). Wilks found that if the hypothesis H0 is true, then the distribution of W converges as n → ∞ to a χ² distribution with d − k degrees of freedom,
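The statistic W can be made concrete with a small worked example. The sketch below assumes a toy model that is not in the excerpt: bivariate observations with independent unit-variance normal coordinates and mean θ = (θ1, θ2), so Θ = R² (d = 2), and H0 = {θ2 = 0} (k = 1). In this model both suprema are attained at sample means, and W should behave like χ² with d − k = 1 degree of freedom under H0. The function names, the seed, and the hard-coded 95% critical value 3.841 are illustrative assumptions.

```python
import math
import random


def log_lik(theta, data):
    """Sum over j of L(theta, X_j) for the toy bivariate normal model."""
    t1, t2 = theta
    return sum(
        -math.log(2.0 * math.pi) - 0.5 * ((x1 - t1) ** 2 + (x2 - t2) ** 2)
        for x1, x2 in data
    )


def wilks_statistic(data):
    """W = 2 * (MLL_d - MLL_k) for H0: theta2 = 0 in the toy model."""
    n = len(data)
    m1 = sum(x1 for x1, _ in data) / n
    m2 = sum(x2 for _, x2 in data) / n
    mll_d = log_lik((m1, m2), data)   # supremum over all of Theta
    mll_k = log_lik((m1, 0.0), data)  # supremum over H0
    return 2.0 * (mll_d - mll_k)


def rejection_rate(n, reps, seed=0, critical=3.841):
    """Fraction of simulated null data sets with W above the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        if wilks_statistic(data) > critical:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Under H0 the rate should be close to the nominal 5% level.
    print(rejection_rate(n=50, reps=2000))
```

In this toy model W reduces algebraically to n times the squared sample mean of the second coordinate, which is why the χ² approximation with one degree of freedom is easy to check by simulation.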
Inference in Perturbation Models, Finite Mixtures and Scan Statistics: The Volume-of-Tube Formula
- 2006
This research creates a general class of perturbation models which are described by an underlying null model that accounts for most of the structure in data and a perturbation that accounts for possible small localized departures. The perturbation models encompass finite mixture models and spatial scan process. In this article, (1) we propose a new test statistic to detect the presence of perturbation, including the case where the null model contains a set of nuisance parameters, and show that it is equivalent to the likelihood ratio test; (2) we establish that the asymptotic distribution of the test statistic is equivalent to the supremum of a Gaussian random field over a high-dimensional manifold (e.g., curve, surface, etc.) with boundaries and singularities; (3) we derive a technique for approximating the quantiles of the test statistic using the Hotelling-Weyl-Naiman volume-of-tube formula; and (4) we solve the long-standing problem of testing for the order of a mixture model; in particular, we derive the asymptotic null distribution for a general family of mixture models including the multivariate mixtures. The inferential theory developed in this article is applicable to a class of non-regular statistical problems involving loss of identifiability or parameters on the boundary of the parameter space.

