Results 1–10 of 30
Asymptotic Model Selection for Naive Bayesian Networks
In Proc. of the 18th Conference on Uncertainty in Artificial Intelligence (UAI-02), 2002
Cited by 30 (3 self)
We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features.
Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions
Journal of the American Statistical Association, 2001
Consistent Estimation Of Mixture Complexity
2001
Cited by 18 (2 self)
... This article presents a semiparametric methodology yielding almost sure convergence of the estimated number of components to the true but unknown number of components. The scope of application is vast, as mixture models are routinely employed across the entire diverse application range of statistics, including nearly all of the social and experimental sciences.
Breakdown points for maximum likelihood estimators of location-scale mixtures
The Annals of Statistics, 2004
Cited by 9 (1 self)
ML estimation based on mixtures of Normal distributions is a widely used tool for cluster analysis. However, a single outlier can make the parameter estimation of at least one of the mixture components break down. Among others, the estimation of mixtures of t-distributions by McLachlan and Peel [Finite Mixture Models (2000) Wiley, New York] and the addition of a further mixture component accounting for “noise” by Fraley and Raftery [The Computer J. 41 (1998) 578–588] were suggested as more robust alternatives. In this paper, the definition of an adequate robustness measure for cluster analysis is discussed and bounds for the breakdown points of the mentioned methods are given. It turns out that the two alternatives, while adding stability in the presence of outliers of moderate size, do not possess a substantially better breakdown behavior than estimation based on Normal mixtures. If the number of clusters s is treated as fixed, r additional points suffice for all three methods to let the …
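The breakdown phenomenon described in this abstract is easy to reproduce numerically. As an illustrative sketch (not the paper's construction), a hand-rolled EM fit of a two-component normal mixture shows how a single gross outlier captures an entire mixture component:

```python
import numpy as np

def em_gmm_1d(x, n_iter=200):
    """Plain EM for a two-component 1-D normal mixture; returns component means."""
    mu = np.array([x.min(), x.max()], dtype=float)   # spread-out initialisation
    sigma = np.full(2, x.std() + 1e-3)
    w = np.full(2, 0.5)                              # mixing proportions
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = (np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
                / (sigma * np.sqrt(2 * np.pi)))
        resp = w * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted updates of proportions, means, and scales
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return mu

rng = np.random.default_rng(0)
clean = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(5.0, 1.0, 50)])
dirty = np.append(clean, 1000.0)   # a single gross outlier

mu_clean = em_gmm_1d(clean)   # means land near the two true clusters
mu_dirty = em_gmm_1d(dirty)   # one mean is dragged to the outlier
```

With the outlier present, one component's mean moves to the outlier's location, so half the fitted model is determined by one contaminating point, which is the breakdown behavior whose bounds the paper studies.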
Analysis of Irish third-level college applications data
2006
Cited by 9 (4 self)
The Irish college admissions system involves prospective students listing up to ten courses in order of preference on their application. Places in third-level educational institutions are subsequently offered to the applicants on the basis of both their preferences and their final second-level examination results. The college applications system is a large area of public debate in Ireland. Detractors suggest the process creates artificial demand for ‘high profile’ courses, causing applicants to ignore their vocational callings. Supporters argue that the system is impartial and transparent. The Irish college degree applications data from the year 2000 are analyzed using mixture models based on ranked data models to investigate the types of application behavior exhibited by college applicants. The results of this analysis show that applicants form groups according to both the discipline and geographical location of their course choices. In addition, there is evidence of the suggested ‘points race’ for high profile courses. Finally, gender emerges as an influential factor when studying course choice behavior.
Nonparametric Identification and Estimation of Multivariate Mixtures
2008
Cited by 7 (1 self)
This article analyzes the identifiability of k-variate, M-component finite mixture models without making parametric assumptions on the component distributions. We consider the identifiability of both the number of components and the component distributions. Under the assumption of conditionally independent marginals that has been used in the existing literature, we reveal an important link between the number of variables (k), the number of values each variable can take, and the number of identifiable components. The number of components (M) is nonparametrically identifiable if k ≥ 2 and each element of the variables takes at least M different values. The mixing proportions and the component distributions are nonparametrically identified if k ≥ 3 and each element of the variables takes at least M different values. Our requirement on k substantially improves the existing work, which requires either k ≥ 2M − 1 or k ≥ 6M log M. The number of components is identified by the rank of a matrix constructed from the distribution function of the data. Exploiting this property, we propose a procedure to nonparametrically estimate the number of components.
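The rank-based identification step lends itself to a small numerical sketch. In a hypothetical example with k = 2 observed discrete variables that are conditionally independent given an M = 3 component label, and each variable taking 5 (≥ M) values, the joint probability matrix factors as P = A diag(π) Bᵀ, so its rank generically equals the number of components:

```python
import numpy as np

# Hypothetical mixture: M = 3 components, two conditionally independent
# discrete variables, each taking 5 values (so each takes at least M values).
rng = np.random.default_rng(1)
M = 3
pi = np.array([0.5, 0.3, 0.2])             # mixing proportions
A = rng.dirichlet(np.ones(5), size=M).T    # A[i, m] = P(X1 = i | component m)
B = rng.dirichlet(np.ones(5), size=M).T    # B[j, m] = P(X2 = j | component m)

# Joint distribution matrix: P[i, j] = sum_m pi[m] * A[i, m] * B[j, m]
P = A @ np.diag(pi) @ B.T

# Generically rank(P) = M, so the numerical rank (singular values above a
# tolerance) recovers the number of components without parametric assumptions.
singular_values = np.linalg.svd(P, compute_uv=False)
n_components = int((singular_values > 1e-10).sum())
```

This only illustrates the population-level rank identity; the paper's estimation procedure works from the empirical distribution function, where the rank must be inferred from a noisy estimate of P.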
Estimating and testing the order of a model
2002
Cited by 7 (1 self)
This paper deals with order identification for nested models in the i.i.d. framework. We study the asymptotic efficiency of two generalized likelihood ratio tests of the order. They are based on two estimators which are proved to be strongly consistent. A version of Stein’s lemma yields an optimal underestimation error exponent. The lemma also implies that the overestimation error exponent is necessarily trivial. Our tests admit nontrivial underestimation error exponents. The optimal underestimation error exponent is achieved in some situations. The overestimation error can decay exponentially with respect to a positive power of the number of observations. These results are proved under mild assumptions by relating the underestimation (resp. overestimation) error to large (resp. moderate) deviations of the log-likelihood process. In particular, it is not necessary that the classical Cramér condition be satisfied; namely, the log-densities are not required to admit every exponential moment. Three benchmark examples with specific difficulties (location mixture of normal distributions, abrupt changes and various regressions) are detailed so as to illustrate the generality of our results.
Segmenting magnetic resonance images via hierarchical mixture modelling
Comput. Statist. Data Anal., 2006
Cited by 6 (2 self)
We present a statistically innovative as well as scientifically and practically relevant method for automatically segmenting magnetic resonance images using hierarchical mixture models. Our method is a general tool for automated cortical analysis which promises to contribute substantially to the science of neuropsychiatry. We demonstrate that our method has advantages over competing approaches on a magnetic resonance brain imagery segmentation task.
Parsimonious Gaussian Mixture Models
Cited by 6 (0 self)
This work was completed during a visit to the Center for Statistics in the Social Sciences, which was supported by NIH.