Results 1–10 of 144
Data Clustering: A Review
ACM Computing Surveys, 1999
Cited by 1282 (13 self)

Abstract:
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a combinatorially difficult problem, and differences in assumptions and contexts across communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with the goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms, such as image segmentation, object recognition, and information retrieval.
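As a concrete instance of the partitional, squared-error family such surveys catalogue, here is a minimal k-means (Lloyd's algorithm) sketch in NumPy; this is an illustration, not code from the paper:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm with k-means++ style seeding."""
    rng = np.random.default_rng(seed)
    # k-means++ seeding: spread the initial centroids out.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    centers = np.array(centers)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in [(0.0, 0.0), (4.0, 4.0)]])
labels, centers = kmeans(X, k=2)
```

On two well-separated blobs this recovers the generating groups; the survey's point is precisely that such squared-error methods are one branch of a much larger taxonomy.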
Quantization
IEEE Transactions on Information Theory, 1998
Cited by 638 (11 self)

Abstract:
The history of the theory and practice of quantization dates to 1948, although similar ideas had appeared in the literature as long ago as 1898. The fundamental role of quantization in modulation and analog-to-digital conversion was first recognized during the early development of pulse-code modulation systems, especially in the 1948 paper of Oliver, Pierce, and Shannon. Also in 1948, Bennett published the first high-resolution analysis of quantization and an exact analysis of quantization noise for Gaussian processes, and Shannon published the beginnings of rate-distortion theory, which would provide a theory for quantization as analog-to-digital conversion and as data compression. Beginning with these three papers of fifty years ago, we trace the history of quantization from its origins through this decade, and we survey the fundamentals of the theory and many of the popular and promising techniques for quantization.
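Among the fundamentals such a survey covers is the uniform scalar quantizer, whose high-resolution quantization-noise power is approximately step^2 / 12. A quick numerical check of that approximation (a sketch, not the paper's code):

```python
import numpy as np

def uniform_quantize(x, step):
    """Mid-tread uniform quantizer: snap each sample to the nearest multiple of `step`."""
    return step * np.round(x / step)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100_000)   # source confined to the granular region
step = 2.0 / 64                            # a 6-bit uniform quantizer over [-1, 1]
noise = x - uniform_quantize(x, step)
empirical_mse = float(np.mean(noise ** 2))
predicted_mse = step ** 2 / 12             # classical high-resolution approximation
```

For a source well inside the granular region the quantization error is nearly uniform on [-step/2, step/2], so the empirical noise power lands very close to step^2 / 12.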
Consistency of spectral clustering
2004
Cited by 282 (15 self)

Abstract:
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about the consistency of most clustering algorithms. In this paper we investigate the consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under very general conditions, while the other (unnormalized clustering) is consistent only under strong additional assumptions, which, as we demonstrate, are not always satisfied by real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that the methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
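A minimal sketch of the normalized variant the analysis favours, in the Ng/Jordan/Weiss style, assuming a Gaussian similarity graph (an illustration, not the authors' code):

```python
import numpy as np

def spectral_clustering(X, k, sigma):
    """Normalized spectral clustering: embed via the k smallest eigenvectors of
    the symmetrically normalized graph Laplacian, then cluster the rows."""
    d2 = np.sum((X[:, None] - X[None, :]) ** 2, axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian similarity graph
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)             # eigenvalues come back ascending
    U = vecs[:, :k]
    U /= np.linalg.norm(U, axis=1, keepdims=True)   # row-normalize the embedding
    # Tiny Lloyd iteration on the embedded rows, with farthest-point init.
    centers = [U[0]]
    for _ in range(k - 1):
        dmin = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dmin)])
    centers = np.array(centers)
    for _ in range(20):
        labels = np.linalg.norm(U[:, None] - centers[None], axis=2).argmin(axis=1)
        centers = np.array([U[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Two concentric rings: a case where graph-Laplacian methods shine.
t = np.linspace(0.0, 2 * np.pi, 100, endpoint=False)
rng = np.random.default_rng(0)
inner = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(100, 2))
outer = 3.0 * np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(100, 2))
X = np.vstack([inner, outer])
labels = spectral_clustering(X, k=2, sigma=0.3)
```

With a narrow kernel width the two rings become two near-components of the similarity graph, and the spectral embedding separates them cleanly.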
Interactive Texture Mapping
1993
Cited by 130 (2 self)

Abstract:
This paper describes a new approach to texture mapping. A global method to lower the distortion of the mapped image is presented; by considering a general optimization function, we view the mapping as an energy-minimization process. We have constructed an interactive texture tool, which is fast and easy to use, to manipulate atlases in texture space. We present the tool’s large set of interactive operations on mapping functions. We also introduce an algorithm which automatically generates an atlas for any type of object. These techniques allow the mapping of different textures onto the same object and handle non-continuous mapping functions, which are needed for complicated mapped objects.
Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces
Journal of Machine Learning Research, 2004
Cited by 116 (25 self)

Abstract:
We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an explanatory variable X, we treat the problem of dimensionality reduction as that of finding a low-dimensional “effective subspace” for X which retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we establish a general nonparametric characterization of conditional independence using covariance operators on reproducing kernel Hilbert spaces. This characterization allows us to derive a contrast function for estimation of the effective subspace. Unlike many conventional methods for dimensionality reduction in supervised learning, the proposed method requires neither assumptions on the marginal distribution of X nor a parametric model of the conditional distribution of Y. We present experiments that compare the performance of the method with conventional methods.
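The optimization over subspaces is involved, but the flavour of the contrast function can be sketched: with Gaussian kernels, an empirical proxy for the trace of the conditional covariance of Y given the projection can be compared across candidate subspaces. The kdr_contrast helper below is hypothetical, a sketch rather than the authors' estimator:

```python
import numpy as np

def rbf_gram(Z, sigma=1.0):
    """Gaussian (RBF) Gram matrix of the rows of Z."""
    d2 = np.sum((Z[:, None] - Z[None, :]) ** 2, axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def centered(K):
    """Doubly center a Gram matrix: H K H with H = I - 11^T/n."""
    H = np.eye(len(K)) - np.ones((len(K), len(K))) / len(K)
    return H @ K @ H

def kdr_contrast(X, Y, B, sigma=1.0, eps=1e-3):
    """Hypothetical helper: an empirical proxy for the trace of the conditional
    covariance of Y given Z = XB; smaller when Z captures the X-Y relation."""
    n = len(X)
    Gz = centered(rbf_gram(X @ B, sigma))
    Gy = centered(rbf_gram(Y, sigma))
    return float(np.trace(Gy @ np.linalg.inv(Gz + n * eps * np.eye(n))))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
Y = (np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=100)).reshape(-1, 1)
B_true = np.array([[1.0], [0.0], [0.0], [0.0]])   # the direction Y depends on
B_junk = np.array([[0.0], [0.0], [0.0], [1.0]])   # an irrelevant direction
```

Evaluating the contrast at the informative direction yields a smaller value than at an irrelevant one, which is the signal a subspace search would exploit.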
Automated Construction of Classifications: Conceptual Clustering Versus Numerical Taxonomy
1983
Cited by 92 (11 self)

Abstract:
A method for automated construction of classifications called conceptual clustering is described and compared to methods used in numerical taxonomy. This method arranges objects into classes representing certain descriptive concepts, rather than into classes defined solely by a similarity metric in some a priori defined attribute space. A specific form of the method is conjunctive conceptual clustering, in which descriptive concepts are conjunctive statements involving relations on selected object attributes and optimized according to an assumed global criterion of clustering quality. The method, implemented in the program CLUSTER/2, is tested together with 18 numerical taxonomy methods on two exemplary problems: 1) the construction of a classification of popular microcomputers and 2) the reconstruction of a classification of selected plant disease categories. In both experiments, the majority of the numerical taxonomy methods (14 out of 18) produced results that were difficult to interpret and seemed arbitrary. In contrast, the conceptual clustering method produced results that had a simple interpretation and corresponded well to solutions preferred by people.
Broadband fading channels: signal burstiness and capacity
IEEE Transactions on Information Theory, 2002
Cited by 56 (5 self)

Abstract:
Médard and Gallager recently showed that very large bandwidths on certain fading channels cannot be used effectively by direct-sequence or related spread-spectrum systems. This paper complements the work of Médard and Gallager. First, it is shown that a key information-theoretic inequality of Médard and Gallager can be derived directly using the theory of capacity per unit cost, for a certain fourth-order cost function called fourthegy. This provides insight into the tightness of the bound. Second, the bound is explored for a wide-sense-stationary uncorrelated-scattering (WSSUS) fading channel, which entails mathematically defining such a channel. In this context, the fourthegy can be expressed using the ambiguity function of the input signal. Finally, numerical data and conclusions are presented for direct-sequence-type input signals.

Index Terms—Channel capacity, fading channels, spread spectrum, wide-sense-stationary uncorrelated-scattering (WSSUS) fading channels.
Multiple Regimes in Northern Hemisphere Height Fields via Mixture Model Clustering
Journal of the Atmospheric Sciences, 1998
Cited by 49 (28 self)

Abstract:
Mixture model clustering is applied to Northern Hemisphere (NH) 700-mb geopotential height anomalies. A mixture model is a flexible probability density estimation technique consisting of a linear combination of k component densities. A key feature of the mixture modeling approach to clustering is the ability to estimate a posterior probability distribution for k, the number of clusters, given the data and the model, and thus to objectively determine the number of clusters that is most likely to fit the data. A data set of 44 winters of NH 700-mb fields is projected onto its two leading empirical orthogonal functions (EOFs) and analyzed using mixtures of Gaussian components. Cross-validated likelihood is used to determine the best value of k, the number of clusters. The posterior probability so determined peaks at k = 3 and thus yields clear evidence for three clusters in the NH 700-mb data. The three-cluster result is found to be robust with respect to variations in data preprocessing and data an...
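The model-selection idea, fitting mixtures for several k and keeping the one with the best held-out likelihood, can be sketched with spherical Gaussians on synthetic blobs (an illustration, not the paper's 700-mb data or code):

```python
import numpy as np

def gmm_fit(train, k, iters=100, seed=0):
    """EM for a mixture of k spherical Gaussians (means, per-component scalar variances)."""
    rng = np.random.default_rng(seed)
    n, d = train.shape
    mu = train[rng.choice(n, k, replace=False)]
    var = np.full(k, train.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain for stability.
        logp = (np.log(pi) - 0.5 * d * np.log(2 * np.pi * var)
                - 0.5 * np.sum((train[:, None] - mu[None]) ** 2, axis=2) / var)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means, variances, mixing proportions.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (r.T @ train) / nk[:, None]
        sq = np.sum((train[:, None] - mu[None]) ** 2, axis=2)
        var = (r * sq).sum(axis=0) / (d * nk) + 1e-6
    return pi, mu, var

def loglik(data, pi, mu, var):
    """Total log-likelihood of `data` under the fitted mixture (log-sum-exp)."""
    d = data.shape[1]
    logp = (np.log(pi) - 0.5 * d * np.log(2 * np.pi * var)
            - 0.5 * np.sum((data[:, None] - mu[None]) ** 2, axis=2) / var)
    m = logp.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))))

rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([m + rng.normal(size=(100, 2)) for m in means])
rng.shuffle(X)
train, held_out = X[:200], X[200:]
scores = {}
for k in range(1, 6):
    fits = [gmm_fit(train, k, seed=s) for s in range(8)]   # restarts guard against bad init
    best = max(fits, key=lambda p: loglik(train, *p))
    scores[k] = loglik(held_out, *best)
best_k = max(scores, key=scores.get)
```

Held-out likelihood rises sharply until the true number of components is reached and then flattens, which is the behaviour the cross-validated selection in the paper exploits.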
Kernel measures of conditional dependence
Advances in Neural Information Processing Systems (NIPS), 2008
Cited by 48 (32 self)

Abstract:
We propose a new measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces. Unlike previous kernel dependence measures, the proposed criterion does not depend on the choice of kernel in the limit of infinite data, for a wide class of kernels. At the same time, it has a straightforward empirical estimate with good convergence behaviour. We discuss the theoretical properties of the measure, and demonstrate its application in experiments.
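The normalized conditional operators themselves are beyond a short sketch, but their unconditional cousin, an HSIC-style kernel dependence statistic built from centred Gram matrices, illustrates the general idea (a sketch, not the paper's proposed measure):

```python
import numpy as np

def rbf(v, sigma=1.0):
    """Gaussian Gram matrix of a 1-D sample."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: Tr(K H L H) / n^2; near zero when x and y look independent,
    larger under dependence."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    K, L = rbf(x, sigma), rbf(y, sigma)
    return float(np.trace(K @ H @ L @ H)) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y_dep = x + 0.1 * rng.normal(size=300)   # strongly dependent on x
y_ind = rng.normal(size=300)             # independent of x
```

The statistic cleanly separates the dependent pair from the independent one; the paper's contribution is extending this kind of operator-based measure to conditional dependence with a kernel-free limit.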
Injective Hilbert space embeddings of probability measures
Conference on Learning Theory (COLT), 2008
Cited by 35 (24 self)

Abstract:
A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). The embedding function has been proven to be injective when the reproducing kernel is universal. In this case, the embedding induces a metric on the space of probability distributions defined on compact metric spaces. In the present work, we consider more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective. In particular, characteristic kernels can include non-universal kernels. We restrict ourselves to translation-invariant kernels on Euclidean space, and define the associated metric on probability measures in terms of the Fourier spectrum of the kernel and the characteristic functions of these measures. The support of the kernel spectrum is important in determining whether a kernel is characteristic: in particular, the embedding is injective if and only if the kernel spectrum has the entire domain as its support. Characteristic kernels may nonetheless have difficulty in distinguishing certain distributions on the basis of finite samples, again due to the interaction of the kernel spectrum and the characteristic functions of the measures.
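When the kernel is characteristic, the induced metric, the maximum mean discrepancy (MMD), is zero only when two distributions coincide. A sketch with the Gaussian kernel, which is characteristic on Euclidean space (an illustration, not the paper's code):

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Biased empirical MMD^2 under a Gaussian kernel: the squared RKHS distance
    between the mean embeddings of the two samples."""
    def gram(a, b):
        d2 = np.sum((a[:, None] - b[None, :]) ** 2, axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    return float(gram(X, X).mean() + gram(Y, Y).mean() - 2 * gram(X, Y).mean())

rng = np.random.default_rng(0)
P = rng.normal(0.0, 1.0, size=(200, 1))
Q_same = rng.normal(0.0, 1.0, size=(200, 1))      # drawn from the same law as P
Q_diff = rng.normal(1.5, 1.0, size=(200, 1))      # a mean-shifted law
```

Samples from the same law give an MMD^2 close to zero (up to finite-sample bias), while the shifted law is clearly separated, which is exactly the injectivity-of-embedding property the paper characterizes.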