Results 11  20
of
1,276
Clustering with Bregman Divergences
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergence ..."
Abstract

Cited by 309 (52 self)
 Add to MetaCart
A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroidbased parametric clustering approaches, such as classical kmeans and informationtheoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical kmeans algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by ratedistortion theory, and present an algorithm to minimize this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an ecient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.
Contour and Texture Analysis for Image Segmentation
, 2001
"... This paper provides an algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. Natural images contain both textured and untextured regions, so the cues of contour and texture differences are exploited simultaneously. Contours are treated in the interveni ..."
Abstract

Cited by 300 (28 self)
 Add to MetaCart
This paper provides an algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. Natural images contain both textured and untextured regions, so the cues of contour and texture differences are exploited simultaneously. Contours are treated in the intervening contour framework, while texture is analyzed using textons. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on the texturedness of the neighborhood at a pixel. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.
Multiple Description Coding: Compression Meets the Network
, 2001
"... This article focuses on the compressed representations of the pictures ..."
Abstract

Cited by 287 (7 self)
 Add to MetaCart
This article focuses on the compressed representations of the pictures
Survey of clustering data mining techniques
, 2002
"... Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in math ..."
Abstract

Cited by 247 (0 self)
 Add to MetaCart
Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective clustering plays an outstanding role in data mining applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique
Deterministic Annealing for Clustering, Compression, Classification, Regression, and Related Optimization Problems
 Proceedings of the IEEE
, 1998
"... this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, ph ..."
Abstract

Cited by 247 (11 self)
 Add to MetaCart
this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, physics, biology, control and signal processing, information theory, complexity theory, and psychology (see [45]). Neural networks have provided a fertile soil for the infusion (and occasionally confusion) of ideas, as well as a meeting ground for comparing viewpoints, sharing tools, and renovating approaches. It is within the illdefined boundaries of the field of neural networks that researchers in traditionally distant fields have come to the realization that they have been attacking fundamentally similar optimization problems.
Object retrieval with large vocabularies and fast spatial matching
 In Proc. IEEE Conf. on Computer Vision and Pattern Recognition
, 2007
"... In this paper, we present a largescale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our ..."
Abstract

Cited by 247 (17 self)
 Add to MetaCart
In this paper, we present a largescale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photosharing site, Flickr [3], using Oxford landmarks as queries. Building an imagefeature vocabulary is a major time and performance bottleneck, due to the size of our dataset. To address this problem we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees which we show outperforms the current stateoftheart on an extensive
Centroidal Voronoi tessellations: Applications and algorithms
 SIAM Rev
, 1999
"... Abstract. A centroidal Voronoi tessellation is a Voronoi tessellation whose generating points are the centroids (centers of mass) of the corresponding Voronoi regions. We give some applications of such tessellations to problems in image compression, quadrature, finite difference methods, distributio ..."
Abstract

Cited by 237 (25 self)
 Add to MetaCart
Abstract. A centroidal Voronoi tessellation is a Voronoi tessellation whose generating points are the centroids (centers of mass) of the corresponding Voronoi regions. We give some applications of such tessellations to problems in image compression, quadrature, finite difference methods, distribution of resources, cellular biology, statistics, and the territorial behavior of animals. We discuss methods for computing these tessellations, provide some analyses concerning both the tessellations and the methods for their determination, and, finally, present the results of some numerical experiments.
An Efficient kMeans Clustering Algorithm: Analysis and Implementation
, 2000
"... Kmeans clustering is a very popular clustering technique, which is used in numerous applications. Given a set of n data points in R d and an integer k, the problem is to determine a set of k points R d , called centers, so as to minimize the mean squared distance from each data point to its ..."
Abstract

Cited by 207 (3 self)
 Add to MetaCart
Kmeans clustering is a very popular clustering technique, which is used in numerous applications. Given a set of n data points in R d and an integer k, the problem is to determine a set of k points R d , called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for kmeans clustering is Lloyd's algorithm. In this paper we present a simple and efficient implementation of Lloyd's kmeans clustering algorithm, which we call the filtering algorithm. This algorithm is very easy to implement. It differs from most other approaches in that it precomputes a kdtree data structure for the data points rather than the center points. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a datasensitive analysis of the algorithm's running time. Second, we have implemented the algorithm and performed a number of empirical studies, both on synthetically generated data and on real...
Unsupervised Segmentation of ColorTexture Regions in Images and Video
, 2001
"... A new method for unsupervised segmentation of colortexture regions in images and video is presented. This method, which we refer to as JSEG, consists of two independent steps: color quantization and spatial segmentation. In the first step, colors in the image are quantized to several representative ..."
Abstract

Cited by 206 (1 self)
 Add to MetaCart
A new method for unsupervised segmentation of colortexture regions in images and video is presented. This method, which we refer to as JSEG, consists of two independent steps: color quantization and spatial segmentation. In the first step, colors in the image are quantized to several representative classes that can be used to differentiate regions in the image. The image pixels are then replaced by their corresponding color class labels, thus forming a classmap of the image. The focus of this work is on spatial segmentation, where a criterion for "good" segmentation using the classmap is proposed. Applying the criterion to local windows in the classmap results in the "Jimage, " in which high and low values correspond to possible boundaries and interiors of colortexture regions. A region growing method is then used to segment the image based on the multiscale Jimages. A similar approach is applied to video sequences. An additional region tracking scheme is embedded into the region growing process to achieve consistent segmentation and tracking results, even for scenes with nonrigid object motion. Experiments show the robustness of the JSEG algorithm on real images and video.
From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
, 2007
"... A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinato ..."
Abstract

Cited by 199 (31 self)
 Add to MetaCart
A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution? These questions have been answered positively and constructively in recent years, exposing a wide variety of surprising phenomena; in particular, the existence of easilyverifiable conditions under which optimallysparse solutions can be found by concrete, effective computational methods. Such theoretical results inspire a bold perspective on some important practical problems in signal and image processing. Several wellknown signal and image processing problems can be cast as demanding solutions of undetermined systems of equations. Such problems have previously seemed, to many, intractable. There is considerable evidence that these problems often have sparse solutions. Hence, advances in finding sparse solutions to underdetermined systems energizes research on such signal and image processing problems – to striking effect. In this paper we review the theoretical results on sparse solutions of linear systems, empirical