Results 1-10 of 152
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
Cited by 557 (28 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?", and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
How many clusters? Which clustering method? Answers via model-based cluster analysis
 THE COMPUTER JOURNAL
, 1998
Probabilistic Linear Discriminant Analysis for Inferences About Identity
 ICCV
, 2007
Cited by 115 (5 self)
Many current face recognition algorithms perform badly when the lighting or pose of the probe and gallery images differ. In this paper we present a novel algorithm designed for these conditions. We describe face data as resulting from a generative model which incorporates both within-individual and between-individual variation. In recognition we calculate the likelihood that the differences between face images are entirely due to within-individual variability. We extend this to the nonlinear case where an arbitrary face manifold can be described and noise is position-dependent. We also develop a “tied” version of the algorithm that allows explicit comparison across quite different viewing conditions. We demonstrate that our model produces state-of-the-art results for (i) frontal face recognition and (ii) face recognition under varying pose.
MCLUST: Software for Model-based Cluster Analysis
 Journal of Classification
, 1999
Cited by 93 (16 self)
MCLUST is a software package for cluster analysis written in Fortran and interfaced to the S-PLUS commercial software package. It implements parameterized Gaussian hierarchical clustering algorithms [16, 1, 7] and the EM algorithm for parameterized Gaussian mixture models [5, 13, 3, 14] with the possible addition of a Poisson noise term. MCLUST also includes functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in a comprehensive clustering strategy [4, 8]. Methods of this type have shown promise in a number of practical applications, including character recognition [16], tissue segmentation [1], minefield and seismic fault detection [4], identification of textile flaws from images [2], and classification of astronomical data [3, 15]. A web page with related links can be found at
Computational Auditory Scene Recognition
 In IEEE Int’l Conf. on Acoustics, Speech, and Signal Processing
, 2001
Bayesian regularization for normal mixture estimation and model-based clustering
, 2005
Cited by 58 (4 self)
Normal mixture models are widely used for statistical modeling of data, including cluster analysis. However, maximum likelihood estimation (MLE) for normal mixtures using the EM algorithm may fail as the result of singularities or degeneracies. To avoid this, we propose replacing the MLE by a maximum a posteriori (MAP) estimator, also found by the EM algorithm. For choosing the number of components and the model parameterization, we propose a modified version of BIC, where the likelihood is evaluated at the MAP instead of the MLE. We use a highly dispersed proper conjugate prior, containing a small fraction of one observation’s worth of information. The resulting method avoids degeneracies and singularities, but when these are not present it gives similar results to the standard method using MLE, EM and BIC. Key words: BIC; EM algorithm; mixture models; model-based clustering; conjugate prior; posterior mode.
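The degeneracy mentioned in this abstract can be illustrated with a small sketch. It is not the authors' estimator: it assumes a univariate component, a flat prior on the mean, and an inverse-gamma prior on the variance with hypothetical hyperparameters `a` and `b`, whereas the abstract describes a conjugate prior for the full model. The function names are illustrative only.

```python
def weighted_stats(points, weights):
    """Effective count, weighted mean, and weighted sum of squares."""
    n = sum(weights)
    mean = sum(w * x for x, w in zip(points, weights)) / n
    ss = sum(w * (x - mean) ** 2 for x, w in zip(points, weights))
    return n, mean, ss


def mle_variance(points, weights):
    """M-step variance under maximum likelihood: collapses to 0 for one point."""
    n, _, ss = weighted_stats(points, weights)
    return ss / n


def map_variance(points, weights, a=1.0, b=0.01):
    """Posterior-mode variance under an assumed inverse-gamma(a, b) prior.

    The posterior is inverse-gamma(a + n/2, b + ss/2), whose mode is
    (b + ss/2) / (a + n/2 + 1). The values of a and b are hypothetical.
    """
    n, _, ss = weighted_stats(points, weights)
    return (b + 0.5 * ss) / (a + 0.5 * n + 1.0)


# A component that has captured a single observation.
single_point, single_weight = [2.0], [1.0]

# A well-populated component: 1001 evenly spaced points with unit weights.
many_points = [i / 100 for i in range(-500, 501)]
many_weights = [1.0] * len(many_points)
```

With a single captured point the MLE variance is exactly zero, which makes the likelihood unbounded, while the MAP variance stays strictly positive. For the well-populated component the two estimates differ by well under one percent, matching the abstract's remark that results are similar when no degeneracy is present.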
Health status monitoring through analysis of behavioral patterns
 8th congress of the Italian Association for Artificial Intelligence (AI*IA) on Ambient Intelligence
, 2003
Cited by 56 (4 self)
With the rapid growth of the elderly population, there is a need to assess the ability of elders to maintain an independent and healthy lifestyle. One possible method is to employ the concepts of ambient intelligence to remotely monitor an elder’s activity. The SmartHouse project uses a system of basic sensors to monitor a person’s in-home activity, and a prototype of the system is being tested within a subject’s home. We examine whether the system can be used to detect behavioral patterns. Mixture models are used to develop a probabilistic model of behavioral patterns. The results of the mixture model analysis are then compared to a log of events kept by the user.
Efficient clustering of uncertain data
 In: ICDM (2006)
Cited by 52 (6 self)
We study the problem of clustering data objects whose locations are uncertain. A data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster uncertain objects of this sort is to apply the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object is assigned to the cluster whose representative has the smallest expected distance to the object. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive integration computation. We study various pruning methods to avoid such expensive expected distance calculation.
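The assignment rule described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the pdf is approximated by a finite list of weighted sample points (a hypothetical representation), the distance is Euclidean, and none of the pruning methods are included. The names `expected_distance` and `assign` are illustrative.

```python
import math


def expected_distance(samples, representative):
    """Approximate E[dist(object, representative)] from a discretized pdf.

    samples: list of (point, probability) pairs whose probabilities sum to 1.
    """
    return sum(p * math.dist(point, representative) for point, p in samples)


def assign(samples, representatives):
    """Index of the cluster representative with the smallest expected distance."""
    return min(
        range(len(representatives)),
        key=lambda j: expected_distance(samples, representatives[j]),
    )


# An uncertain object that is equally likely to be at (0, 0) or (2, 0).
uncertain_object = [((0.0, 0.0), 0.5), ((2.0, 0.0), 0.5)]
representatives = [(1.0, 0.0), (5.0, 0.0)]
cluster = assign(uncertain_object, representatives)
```

Every call to `expected_distance` visits all sample points of the object, which is the cost that grows with the resolution of the pdf approximation and that the pruning methods in the abstract aim to avoid.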
Probabilistic models for inference about identity
 IEEE TPAMI
, 2012
Cited by 51 (0 self)
Many face recognition algorithms use “distance-based” methods: feature vectors are extracted from each face and distances in feature space are compared to determine matches. In this paper we argue for a fundamentally different approach. We consider each image as having been generated from several underlying causes, some of which are due to identity (latent identity variables, or LIVs) and some of which are not. In recognition we evaluate the probability that two faces have the same underlying identity cause. We make these ideas concrete by developing a series of novel generative models which incorporate both within-individual and between-individual variation. We consider both the linear case where signal and noise are represented by a subspace, and the nonlinear case where an arbitrary face manifold can be described and noise is position-dependent. We also develop a “tied” version of the algorithm that allows explicit comparison of faces across quite different viewing conditions. We demonstrate that our model produces results that are comparable to or better than the state of the art for both frontal face recognition and face recognition under varying pose.
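The idea of comparing a "same identity" explanation with a "different identities" explanation can be illustrated with a heavily simplified sketch. It is not the models from the paper: it assumes a single zero-mean scalar feature per image, generated as identity plus noise, with hypothetical variances `between_var` (identity) and `within_var` (noise). The function names are illustrative.

```python
import math


def log_bivariate_normal(x1, x2, var, cov):
    """Log density of a zero-mean bivariate normal with equal variances."""
    det = var * var - cov * cov
    quad = (var * x1 * x1 - 2.0 * cov * x1 * x2 + var * x2 * x2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def same_identity_llr(x1, x2, between_var, within_var):
    """Log likelihood ratio: shared identity variable vs. independent ones.

    Assumed model: x = h + w, with h ~ N(0, between_var) for identity and
    w ~ N(0, within_var) for noise. A shared h makes the two observations
    correlated with covariance between_var; separate h's make them independent.
    """
    total = between_var + within_var
    same = log_bivariate_normal(x1, x2, total, between_var)
    different = log_bivariate_normal(x1, x2, total, 0.0)
    return same - different


# Hypothetical variances: identity varies much more than noise.
llr_close = same_identity_llr(1.0, 1.1, between_var=1.0, within_var=0.1)
llr_far = same_identity_llr(1.0, -1.0, between_var=1.0, within_var=0.1)
```

A positive ratio favors the shared-identity explanation and a negative one favors different identities. Unlike a pure distance threshold, the decision depends on how large the within-individual variance is relative to the between-individual variance.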