Results 1-10 of 329
Unsupervised learning of finite mixture models
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2002
Cited by 267 (20 self)
Abstract: This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach. Index Terms: Finite mixtures, unsupervised learning, model selection, minimum message length criterion, Bayesian methods, expectation-maximization algorithm, clustering.
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
, 2008
Cited by 237 (4 self)
In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The problem is of significant interest in a wide variety of areas.
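The query problem described in this abstract can be illustrated with the exact linear-scan baseline that preprocessing-based methods aim to beat. This is only a sketch: the function name `nearest_neighbor` and the choice of Euclidean distance are assumptions for illustration, and the code does not implement the hashing algorithms the article surveys.

```python
import math

def nearest_neighbor(dataset, query):
    """Exact nearest neighbor by linear scan (Euclidean distance is assumed).

    Returns the index of the closest dataset point and its distance.
    The cost is one distance computation per dataset point for every query.
    """
    best_index, best_distance = -1, math.inf
    for index, point in enumerate(dataset):
        distance = math.dist(point, query)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance

if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    print(nearest_neighbor(points, (0.9, 1.2)))
```

The scan needs no preprocessing but touches every object per query, which is the cost that approximate schemes trade accuracy to reduce.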
Spectral Partitioning Works: Planar graphs and finite element meshes
 In IEEE Symposium on Foundations of Computer Science
, 1996
Cited by 144 (8 self)
Spectral partitioning methods use the Fiedler vector, the eigenvector of the second-smallest eigenvalue of the Laplacian matrix, to find a small separator of a graph. These methods are important components of many scientific numerical algorithms and have been demonstrated by experiment to work extremely well. In this paper, we show that spectral partitioning methods work well on bounded-degree planar graphs and finite element meshes, the classes of graphs to which they are usually applied. While naive spectral bisection does not necessarily work, we prove that spectral partitioning techniques can be used to produce separators whose ratio of vertices removed to edges cut is $O(\sqrt{n})$ for bounded-degree planar graphs and two-dimensional meshes and $O(n^{1/d})$ for well-shaped $d$-dimensional meshes. The heart of our analysis is an upper bound on the second-smallest eigenvalues of the Laplacian matrices of these graphs. 1. Introduction Spectral partitioning has become one of the mos...
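As a rough illustration of the Fiedler-vector idea in this abstract, the sketch below performs naive spectral bisection with a median split. It assumes NumPy is available; the helper names `fiedler_bisection` and `path_graph` are hypothetical. It shows only the naive variant, which the abstract notes does not necessarily work, and not the refined techniques analysed in the paper.

```python
import numpy as np

def fiedler_bisection(adjacency):
    """Naive spectral bisection of a connected undirected graph.

    Returns a boolean mask for one side of the split and the number of cut edges.
    """
    A = np.asarray(adjacency, dtype=float)
    # Graph Laplacian L = D - A, with D the diagonal matrix of vertex degrees.
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so column 1 is the eigenvector of the second-smallest eigenvalue.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]
    # Split at the median entry so the two sides have (nearly) equal size.
    side = fiedler > np.median(fiedler)
    cut_edges = int(A[np.ix_(side, ~side)].sum())
    return side, cut_edges

def path_graph(n):
    """Adjacency matrix of a path on n vertices (a small test input)."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return A

if __name__ == "__main__":
    side, cut_edges = fiedler_bisection(path_graph(8))
    print(side, cut_edges)
```

Which half receives `True` depends on the arbitrary sign of the eigenvector, so only the sizes of the two sides and the cut count are meaningful.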
Simple heuristics for unit disk graphs
 NETWORKS
, 1995
Cited by 126 (6 self)
Unit disk graphs are intersection graphs of circles of unit radius in the plane. We present simple and provably good heuristics for a number of classical NP-hard optimization problems on unit disk graphs. The problems considered include maximum independent set, minimum vertex cover, minimum coloring and minimum dominating set. We also present an on-line coloring heuristic which achieves a competitive ratio of 6 for unit disk graphs. Our heuristics do not need a geometric representation of unit disk graphs. Geometric representations are used only in establishing the performance guarantees of the heuristics. Several of our approximation algorithms can be extended to intersection graphs of circles of arbitrary radii in the plane, intersection graphs of regular polygons, and to intersection graphs of higher-dimensional regular objects.
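The definition in this abstract can be made concrete with a short sketch: two unit-radius disks intersect exactly when their centres are at most 2 apart. The function names below are hypothetical, and `greedy_independent_set` is a generic first-fit heuristic used only for illustration; it is not claimed to be one of the paper's heuristics and carries none of their performance guarantees. Like the heuristics described above, it reads only the adjacency structure, not the coordinates.

```python
import itertools
import math

def unit_disk_graph(centers, radius=1.0):
    """Build adjacency sets for the intersection graph of equal-radius disks."""
    adjacency = {i: set() for i in range(len(centers))}
    for i, j in itertools.combinations(range(len(centers)), 2):
        # Two disks of the same radius intersect iff their centres are
        # no more than two radii apart.
        if math.dist(centers[i], centers[j]) <= 2 * radius:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency

def greedy_independent_set(adjacency):
    """Generic first-fit maximal independent set (illustrative only)."""
    chosen, blocked = [], set()
    for vertex in sorted(adjacency):
        if vertex not in blocked:
            chosen.append(vertex)
            blocked.add(vertex)
            blocked.update(adjacency[vertex])
    return chosen

if __name__ == "__main__":
    graph = unit_disk_graph([(0, 0), (1.5, 0), (3, 0), (10, 0)])
    print(graph, greedy_independent_set(graph))
```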
Monstrous moonshine and monstrous Lie superalgebras
 INVENT. MATH
, 1992
Cited by 110 (0 self)
We prove Conway and Norton’s moonshine conjectures for the infinite-dimensional representation of the monster simple group constructed by Frenkel, Lepowsky and Meurman. To do this we use the no-ghost theorem from string theory to construct a family of generalized Kac-Moody superalgebras of rank 2, which are closely related to the monster and several of the other sporadic simple groups. The denominator formulas of these superalgebras imply relations between the Thompson functions of elements of the monster (i.e. the traces of elements of the monster on Frenkel, Lepowsky, and Meurman’s representation), which are the replication formulas conjectured by Conway and Norton. These replication formulas are strong enough to verify that the Thompson functions have most of the “moonshine” properties conjectured by Conway and Norton, and in particular they are modular functions of genus 0. We also construct a second family of Kac-Moody superalgebras related to elements of Conway’s sporadic simple group Co1. These superalgebras have even rank between 2 and 26; for example two of the Lie algebras we get have ranks 26 and 18, and one of the superalgebras has rank 10. The denominator formulas of these algebras give some new infinite product identities, in the same way that the denominator
Separators for sphere-packings and nearest neighbor graphs
 J. ACM
, 1997
Cited by 74 (7 self)
Abstract. A collection of $n$ balls in $d$ dimensions forms a $k$-ply system if no point in the space is covered by more than $k$ balls. We show that for every $k$-ply system $\Gamma$, there is a sphere $S$ that intersects at most $O(k^{1/d} n^{1-1/d})$ balls of $\Gamma$ and divides the remainder of $\Gamma$ into two parts: those in the interior and those in the exterior of the sphere $S$, respectively, so that the larger part contains at most $(1 - 1/(d+2))n$ balls. This bound of $O(k^{1/d} n^{1-1/d})$ is the best possible in both $n$ and $k$. We also present a simple randomized algorithm to find such a sphere in $O(n)$ time. Our result implies that every $k$-nearest neighbor graph of $n$ points in $d$ dimensions has a separator of size $O(k^{1/d} n^{1-1/d})$. In conjunction with a result of Koebe that every triangulated planar graph is isomorphic to the intersection graph of a disk-packing, our result not only gives a new geometric proof of the planar separator theorem of Lipton and Tarjan, but also generalizes it to higher dimensions. The separator algorithm can be used for point location and geometric divide and conquer in a fixed dimensional space.
On Lattice Quantization Noise
 IEEE Trans. Inform. Theory
, 1996
Cited by 73 (20 self)
Abstract: We present several results regarding the properties of a random vector, uniformly distributed over a lattice cell. This random vector is the quantization noise of a lattice quantizer at high resolution, or the noise of a dithered lattice quantizer at all distortion levels. We find that for the optimal lattice quantizers this noise is wide-sense stationary and white. Any desirable noise spectra may be realized by an appropriate linear transformation (“shaping”) of a lattice quantizer. As the dimension increases, the normalized second moment of the optimal lattice quantizer goes to $1/(2\pi e)$, and consequently the quantization noise approaches a white Gaussian process in the divergence sense. In entropy-coded dithered quantization, which can be modeled accurately as passing the source through an additive noise channel, this limit behavior implies that for large lattice dimension both the error and the bit rate approach the error and the information rate of an Additive White Gaussian Noise (AWGN) channel. Index Terms: Lattice, quantization noise, shaping, normalized second moment, divergence from Gaussianity.
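The claim that dithered lattice quantization noise is uniform over a lattice cell can be checked numerically in the simplest case, the one-dimensional integer lattice. The sketch below assumes subtractive dither and a Gaussian test source; the function name `dithered_errors` and all parameter values are illustrative choices, not taken from the paper. For this lattice the noise power should be close to 1/12 (about 0.0833), the normalized second moment of the integer lattice, compared with the high-dimensional limit $1/(2\pi e)$ (about 0.0585) quoted in the abstract.

```python
import random

def dithered_errors(samples, rng):
    """Quantize to the integer lattice with subtractive dither; return the errors."""
    errors = []
    for x in samples:
        # Dither is uniform over one lattice cell of Z.
        u = rng.uniform(-0.5, 0.5)
        # Add dither, round to the nearest integer, then subtract the dither.
        reconstruction = round(x + u) - u
        errors.append(reconstruction - x)
    return errors

def variance(values):
    """Population variance, computed directly to keep the sketch self-contained."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

if __name__ == "__main__":
    rng = random.Random(0)
    # A low-variance source: coarse resolution, where undithered noise
    # would depend strongly on the input.
    source = [rng.gauss(0.0, 0.3) for _ in range(100_000)]
    print(variance(dithered_errors(source, rng)), 1 / 12)
```

The source here is much narrower than a lattice cell, so the experiment probes the "all distortion levels" part of the statement rather than the high-resolution regime.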
Bayesian Approaches to Gaussian Mixture Modelling
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 73 (2 self)
A Bayesian-based methodology is presented which automatically penalises over-complex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an `optimal' number of components in the model and so partition data sets. The performance of the Bayesian method is compared to other methods of optimal model selection and found to give good results. The methods are tested on synthetic and real data sets. Introduction: Scientific disciplines generate data. In the attempt to understand the patterns present in such data sets, methods which perform some form of unsupervised partitioning or modelling are particularly useful. Such an approach is only of use, however, if it offers a less complex representation of the data than the data set itself. This introduces an apparent conflict, however, as any model improves its fit to the data monotonically with increases in its complexity (the number of model parameters): a model as complex as the data...
Generalized multiple description coding with correlating transforms
 IEEE Trans. Inform. Theory
, 2001
Cited by 61 (2 self)
Abstract: Multiple description (MD) coding is source coding in which several descriptions of the source are produced such that various reconstruction qualities are obtained from different subsets of the descriptions. Unlike multiresolution or layered source coding, there is no hierarchy of descriptions; thus, MD coding is suitable for packet erasure channels or networks without priority provisions. Generalizing work by Orchard, Wang, Vaishampayan, and Reibman, a transform-based approach is developed for producing $M$ descriptions of an $N$-tuple source, $M \le N$. The descriptions are sets of transform coefficients, and the transform coefficients of different descriptions are correlated so that missing coefficients can be estimated. Several transform optimization results are presented for memoryless Gaussian sources, including a complete solution of the $N = 2$, $M = 2$ case with arbitrary weighting of the descriptions. The technique is effective only when independent components of the source have differing variances. Numerical studies show that this method performs well at low redundancies, as compared to uniform MD scalar quantization. Index Terms: Erasure channels, integer-to-integer transforms, packet networks, robust source coding.
A class of Lorentzian Kac-Moody algebras
 Nucl. Phys. B645
, 2002
Cited by 52 (11 self)
We consider a natural generalisation of the class of hyperbolic Kac-Moody algebras. We describe in detail the conditions under which these algebras are Lorentzian. We also construct their fundamental weights, and analyse whether they possess a real principal so(1,2) subalgebra. Our class of algebras includes the Lorentzian Kac-Moody algebras that have recently been proposed as symmetries of M-theory and the closed bosonic string.