On Bayesian analysis of mixtures with an unknown number of components
- INSTITUTE OF INTERNATIONAL ECONOMICS PROJECT ON INTERNATIONAL COMPETITION POLICY," COM/DAFFE/CLP/TD(94)42
, 1997
On Spectral Learning of Mixtures of Distributions
Cited by 36 (0 self)
We consider the problem of learning mixtures of distributions via spectral methods and derive a tight characterization of when such methods are useful. Specifically, given a mixture-sample, let $\mu_i$, $C_i$, $w_i$ denote the empirical mean, covariance matrix, and mixing weight of the $i$-th component. We prove that a very simple algorithm, namely spectral projection followed by single-linkage clustering, properly classifies every point in the sample when each $\mu_i$ is separated from all $\mu_j$ by 2 $(1/w_i + 1/w_j)$ plus a term that depends on the concentration properties of the distributions in the mixture. This second term is very small for many distributions, including Gaussians, log-concave distributions, and many others. As a result, we get the best known bounds for learning mixtures of arbitrary Gaussians in terms of the required mean separation. On the other hand, we prove that given any k means $\mu_i$ and mixing weights $w_i$, there are (many) sets of matrices $C_i$ such that each $\mu_i$ is separated from all $\mu_j$ by 2 $(1/w_i + 1/w_j)$, but applying spectral projection to the corresponding Gaussian mixture causes it to collapse completely, i.e., all means and covariance matrices in the projected mixture are identical.
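The second stage named in this abstract, single-linkage clustering, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the points have already been projected, and the function name `single_linkage` and the `threshold` parameter are hypothetical.

```python
import math


def single_linkage(points, threshold):
    """Cut a single-linkage clustering at `threshold`.

    Two points share a cluster when a chain of points links them in which
    each consecutive pair is closer than `threshold`.
    Returns one integer label per point.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge every pair of points that lies closer than the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < threshold:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

The pairwise loop is quadratic in the number of points, which keeps the sketch short; it makes no attempt to choose the threshold, which in the abstract's setting would depend on the separation condition.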
Robust mixture modelling using the t distribution
- Statistics and Computing
Cited by 32 (1 self)
Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
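The robustness described in this abstract comes from the way a t component down-weights distant observations. The sketch below shows that weighting for a single univariate t component with fixed degrees of freedom, not the full mixture fit from the paper. The function name `fit_t_location_scale`, the default `nu=3.0`, and the fixed iteration count are assumptions made for illustration.

```python
def fit_t_location_scale(data, nu=3.0, iterations=200):
    """Estimate location and squared scale of a univariate t distribution.

    Assumes `nu` (degrees of freedom) is known and the data are not all
    identical. Each pass computes weights that shrink for points far from
    the current location, then updates location and scale in turn.
    """
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((y - mu) ** 2 for y in data) / n
    for _ in range(iterations):
        # E-step: weight = (nu + 1) / (nu + squared standardized distance).
        weights = [(nu + 1.0) / (nu + (y - mu) ** 2 / sigma2) for y in data]
        # Conditional M-step 1: weighted mean gives the new location.
        mu = sum(w * y for w, y in zip(weights, data)) / sum(weights)
        # Conditional M-step 2: weighted spread around the new location.
        sigma2 = sum(w * (y - mu) ** 2 for w, y in zip(weights, data)) / n
    return mu, sigma2
```

On a sample such as `[0.0, 0.1, -0.1, 0.2, -0.2, 10.0]` the fitted location stays close to the bulk of the data, whereas the ordinary sample mean is pulled toward the outlying value.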
A Spectral Algorithm for Learning Mixtures of Distributions
- Journal of Computer and System Sciences
, 2002
Cited by 31 (3 self)
We show that a simple spectral algorithm for learning a mixture of k spherical Gaussians in $\mathbb{R}^n$ works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique (solving an open problem of [1]). The sample complexity and running time are polynomial in both n and k. The algorithm also works for the more general problem of learning a mixture of "weakly isotropic" distributions (e.g. a mixture of uniform distributions on cubes).
Optimal Time Bounds for Approximate Clustering
, 2002
Cited by 26 (2 self)
Clustering is a fundamental problem in unsupervised learning, and has been studied widely both as a problem of learning mixture models and as an optimization problem. In this paper, we study clustering with respect to the k-median objective function, a natural formulation of clustering in which we attempt to minimize the average distance to cluster centers. One of the main contributions of this paper is a simple but powerful sampling technique that we call successive sampling that could be of independent interest. We show that our sampling procedure can rapidly identify a small set of points (of size just $O(k \log(n/k))$) that summarize the input points for the purpose of clustering. Using successive sampling, we develop an algorithm for the k-median problem that runs in $O(nk)$ time for a wide range of values of k and is guaranteed, with high probability, to return a solution with cost at most a constant factor times optimal. We also establish a lower bound of $\Omega(nk)$ on any randomized constant-factor approximation algorithm for the k-median problem that succeeds with even a negligible (say ...
Managing uncertainty in call centers using Poisson mixtures
- Applied Stochastic Models in Business and Industry
, 2001
Cited by 25 (4 self)
We model a call center as a queueing model with Poisson arrivals having an unknown varying arrival rate. We show how to compute prediction intervals for the arrival rate, and use the Erlang formula for the waiting time to compute the consequences for the occupancy level of the call center. We compare it to the current practice of using a point estimate of the arrival rate (assumed constant) as forecast.
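To make the comparison in this abstract concrete, the sketch below evaluates a waiting probability at both ends of an interval for the arrival rate instead of at a single point estimate. It assumes the Erlang C formula for an M/M/c queue, which may differ from the exact Erlang formula used in the paper. The names `erlang_c` and `wait_probability_range` and all parameters are hypothetical.

```python
def erlang_c(offered_load, servers):
    """Probability that an arriving call has to wait (Erlang C, M/M/c queue).

    `offered_load` is arrival rate times mean service time, in Erlangs,
    and must be smaller than `servers` for the queue to be stable.
    """
    if not 0 <= offered_load < servers:
        raise ValueError("offered_load must satisfy 0 <= load < servers")
    # Erlang B blocking probability via the standard recursion on servers.
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    # Convert Erlang B to Erlang C.
    utilisation = offered_load / servers
    return blocking / (1.0 - utilisation * (1.0 - blocking))


def wait_probability_range(rate_low, rate_high, mean_service_time, servers):
    """Waiting probability at the lower and upper end of a rate interval."""
    return tuple(
        erlang_c(rate * mean_service_time, servers)
        for rate in (rate_low, rate_high)
    )
```

For example, `wait_probability_range(80, 100, 0.1, 12)` evaluates offered loads of 8 and 10 Erlangs on 12 agents; the gap between the two results indicates how much a constant-rate forecast can hide.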
Dirichlet Prior Sieves in Finite Normal Mixtures
- Statistica Sinica
, 2002
Cited by 24 (1 self)
The use of a finite dimensional Dirichlet prior in the finite normal mixture model has the effect of acting like a Bayesian method of sieves. Posterior consistency is directly related to the dimension of the sieve and the choice of the Dirichlet parameters in the prior. We find that naive use of the popular uniform Dirichlet prior leads to an inconsistent posterior. However, a simple adjustment to the parameters in the prior induces a random probability measure that approximates the Dirichlet process and yields a posterior that is strongly consistent for the density and weakly consistent for the unknown mixing distribution. The dimension of the resulting sieve can be selected easily in practice and a simple and efficient Gibbs sampler can be used to sample the posterior of the mixing distribution. Key words and phrases: Bose-Einstein distribution, Dirichlet process, identification, method of sieves, random probability measure, relative entropy, weak convergence.
Analyzing Developmental Trajectories: A Semiparametric, Group-Based Approach
- Psychological Methods
, 1999
Cited by 23 (1 self)
A developmental trajectory describes the course of a behavior over age or time. A group-based method for identifying distinctive groups of individual trajectories within the population and for profiling the characteristics of group members is demonstrated. Such clusters might include groups of "increasers," "decreasers," and "no changers." Suitably defined probability distributions are used to handle 3 data types: count, binary, and psychometric scale data. Four capabilities are demonstrated: (a) the capability to identify rather than assume distinctive groups of trajectories, (b) the capability to estimate the proportion of the population following each such trajectory group, (c) the capability to relate group membership probability to individual characteristics and circumstances, and (d) the capability to use the group membership probabilities for various other purposes such as creating profiles of group members. Over the past decade, major advances have been made in methodology for analyzing individual-level developmental trajectories. The two main branches of methodology are hierarchical modeling (Bryk &
Poisson process partition calculus with an application to Bayesian . . .
, 2005
Cited by 21 (9 self)
This article develops, and describes how to use, results concerning disintegrations of Poisson random measures. These results are fashioned as simple tools that can be tailor-made to address inferential questions arising in a wide range of Bayesian nonparametric and spatial statistical models. The Poisson disintegration method is based on the formal statement of two results concerning a Laplace functional change of measure and a Poisson Palm/Fubini calculus in terms of random partitions of the integers {1,...,n}. The techniques are analogous to, but much more general than, techniques for the Dirichlet process and weighted gamma process developed in [Ann. Statist. 12
A spectral algorithm for learning mixture models
- J. Comput. Syst. Sci
, 2004
Cited by 20 (5 self)
We show that a simple spectral algorithm for learning a mixture of k spherical Gaussians in $\mathbb{R}^n$ works remarkably well: it succeeds in identifying the Gaussians assuming essentially the minimum possible separation between their centers that keeps them unique (solving an open problem of [1]). The sample complexity and running time are polynomial in both n and k. The algorithm can be applied to the more general problem of learning a mixture of "weakly isotropic" distributions (e.g. a mixture of uniform distributions on cubes).
1 Introduction
Learning a mixture of distributions is a classical problem in statistics and learning theory (see [10, 14]); more recently, it has also been proposed as a model for clustering. In the basic version of the problem we are given random samples from a mixture of k distributions, $F_1, \ldots, F_k$. Each sample is drawn independently with probability $w_i$ from the $i$-th distribution. The numbers w

