Results 1-10 of 66
Data Clustering: A Review
ACM Computing Surveys, 1999
Cited by 1308 (13 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a combinatorially difficult problem, and differences in assumptions and contexts across communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with the goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Unsupervised Learning from Dyadic Data
1998
Cited by 100 (9 self)
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event co-occurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data and structural information about data-inherent grouping structure. We propose an annealed version of the standard Expectation-Maximization algorithm for model fitting, which is empirically evaluated on a variety of data sets from different domains.
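The abstract above does not spell out how the Expectation-Maximization algorithm is annealed. As a minimal sketch of one common form of the idea, the E-step posteriors can be flattened by an inverse temperature `beta` that is raised toward 1 over the course of fitting. The function name `tempered_posteriors`, the choice to temper the whole joint probability, and the example numbers are assumptions for illustration, not the paper's formulation.

```python
import math

def tempered_posteriors(log_joint, beta):
    """Return class posteriors proportional to exp(beta * log_joint[k]).

    log_joint[k] is log(P(class k) * P(observation | class k)).
    beta = 1 gives the ordinary E-step; beta = 0 gives a uniform posterior.
    (Hypothetical helper; tempering the full joint is an assumption.)
    """
    scaled = [beta * lj for lj in log_joint]
    top = max(scaled)                      # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative joint probabilities for three latent classes (made-up numbers).
example_log_joint = [math.log(0.6), math.log(0.3), math.log(0.1)]

# A simple annealing schedule: start flat, end at the standard E-step.
schedule = [(beta, tempered_posteriors(example_log_joint, beta))
            for beta in (0.0, 0.5, 1.0)]
```

Low values of `beta` keep the assignments soft early on, which is the usual motivation for annealing: the fit is less likely to commit to a poor local optimum before the parameters have settled.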
Efficient Graph-Based Energy Minimization Methods in Computer Vision
1999
Cited by 82 (5 self)
[...] algorithms (we show that exact minimization is NP-hard in these cases). These algorithms produce a local minimum in interesting large move spaces. Furthermore, one of them finds a solution within a known factor of the optimum. The algorithms are iterative and compute several graph cuts at each iteration. The running time at each iteration is effectively linear due to the special graph structure. In practice it takes just a few iterations to converge. Moreover, most of the progress happens during the first iteration. For a certain piecewise constant prior we adapt the algorithms developed for the piecewise smooth prior. One of them finds a solution within a factor of two of the optimum. In addition we develop a third algorithm which finds a local minimum in yet another move space. We demonstrate the effectiveness of our approach on image restoration, stereo, and motion. For the data with ground truth, our methods significantly outperform standard methods.
Self Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization
2001
Cited by 76 (4 self)
We present a stochastic clustering algorithm which uses pairwise similarity of elements, and show how it can be used to address various problems in computer vision, including low-level image segmentation, mid-level perceptual grouping, and high-level image database organization. The clustering problem is viewed as a graph partitioning problem, where nodes represent data elements and the weights of the edges represent pairwise similarities. We generate samples of cuts in this graph, by using Karger's contraction algorithm, and compute an "average" cut which provides the basis for our solution to the clustering problem. The stochastic nature of our method makes it robust against noise, including accidental edges and small spurious clusters. The complexity of our algorithm is very low: O(E log² N) for N objects, E similarity relations and a fixed accuracy level. In addition, and without additional computational cost, our algorithm provides a hierarchy of nested partitions. We demonstrate the superiority of our method for image segmentation on a few synthetic and real images, B&W and color. Our other examples include the concatenation of edges in a cluttered scene (perceptual grouping), and the organization of an image database for the purpose of multi-view 3D object recognition.
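As a rough illustration of the sampling idea in the abstract above, the sketch below runs Karger-style random edge contraction on a small unweighted graph and records how often each pair of nodes ends up in the same group across many sampled cuts. It is a toy under stated assumptions, not the authors' algorithm: the graph is unweighted and connected, contraction always stops at two groups, and the names `karger_cut`, `co_membership` and `EDGES` are hypothetical.

```python
import random
from itertools import combinations

def karger_cut(edges, num_nodes, k=2, rng=random):
    """Contract uniformly random edges until k super-nodes remain.

    Returns one group label per node. Assumes an unweighted, connected graph;
    for unweighted graphs, visiting edges in a random order and skipping
    self-loops is equivalent to repeatedly contracting a random remaining edge.
    """
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    order = list(edges)
    rng.shuffle(order)
    remaining = num_nodes
    for u, v in order:
        if remaining <= k:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:               # skip edges already inside a super-node
            parent[root_u] = root_v
            remaining -= 1
    return [find(i) for i in range(num_nodes)]

def co_membership(edges, num_nodes, trials=3000, seed=0):
    """Fraction of sampled two-way cuts in which each pair of nodes stays together."""
    rng = random.Random(seed)
    together = [[0] * num_nodes for _ in range(num_nodes)]
    for _ in range(trials):
        labels = karger_cut(edges, num_nodes, k=2, rng=rng)
        for i, j in combinations(range(num_nodes), 2):
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    return [[count / trials for count in row] for row in together]

# Toy graph (an assumption): two triangles joined by the single bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
frequencies = co_membership(EDGES, 6)
```

On this toy graph, nodes inside the same triangle are kept together more often than the two endpoints of the bridge, which is the kind of signal an averaged cut can exploit.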
Segmentation given partial grouping constraints
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
Cited by 55 (3 self)
We consider data clustering problems where partial grouping is known a priori. We formulate such biased grouping problems as a constrained optimization problem, where structural properties of the data define the goodness of a grouping and partial grouping cues define the feasibility of a grouping. We enforce grouping smoothness and fairness on labeled data points so that sparse partial grouping information can be effectively propagated to the unlabeled data. Considering the normalized cuts criterion in particular, our formulation leads to a constrained eigenvalue problem. By generalizing the Rayleigh-Ritz theorem to projected matrices, we find the global optimum in the relaxed continuous domain by eigendecomposition, from which a near-global optimum to the discrete labeling problem can be obtained effectively. We apply our method to real image segmentation problems, where partial grouping priors can often be derived based on a crude spatial attentional map that binds places with common salient features or focuses on expected object locations. We demonstrate not only that it is possible to integrate both image structures and priors in a single grouping process, but also that objects can be segregated from the background without specific object knowledge. Index Terms: grouping, image segmentation, graph partitioning, bias, spatial attention, semi-supervised clustering, partially labeled classification.
Outex - New framework for empirical evaluation of texture analysis algorithms
Proc. 16th International Conference on Pattern Recognition, 2002
Cited by 54 (11 self)
This paper presents the current status of a new initiative aimed at developing a versatile framework and image database for empirical evaluation of texture analysis algorithms. The proposed Outex framework contains a large collection of surface textures captured under different conditions, which facilitates construction of a wide range of texture analysis problems. The problems are encapsulated into test suites, for which baseline results obtained with algorithms from the literature are provided. The rich functionality of the framework is demonstrated with examples in texture classification, segmentation and retrieval. The framework has a web site for public dissemination of the database and comparative results obtained by research groups worldwide.
A New Graph-Theoretic Approach to Clustering, with Applications to Computer Vision
2004
Cited by 44 (4 self)
This work applies cluster analysis as a unified approach for a wide range of vision applications, thereby combining the research domain of computer vision and that of machine learning. Cluster analysis is the formal study of algorithms and methods for recovering the inherent structure within a given dataset. Many problems of computer vision have precisely this goal, namely to find which visual entities belong to an inherent structure, e.g. in an image or in a database of images. For example, a meaningful structure in the context of image segmentation is a set of pixels which correspond to the same object in a scene. Clustering algorithms can be used to partition the pixels of an image into meaningful parts, which may correspond to different objects. In this work we focus on the problems of image segmentation and image database organization. The visual entities to consider are pixels and images, respectively. Our first contribution in this work is a novel partitional (flat) clustering algorithm. The algorithm uses pairwise representation, where the visual objects (pixels,
Grouping with Bias
In Advances in Neural Information Processing Systems, 2001
Cited by 43 (4 self)
With the optimization of pattern discrimination as a goal, graph partitioning approaches often lack the capability to integrate prior knowledge to guide grouping. In this paper, we consider priors from unitary generative models, partially labeled data and spatial attention. These priors are modelled as constraints in the solution space. By imposing a uniformity condition on the constraints, we restrict the feasible space to one of smooth solutions. A subspace projection method is developed to solve this constrained eigenproblem.
Blind motion deblurring using image statistics
In Advances in Neural Information Processing Systems (NIPS)
Cited by 42 (3 self)
We address the problem of blind motion deblurring from a single image, where the blur is caused by a few moving objects. In such situations only part of the image may be blurred, and the scene consists of layers blurred to different degrees. Most existing blind deconvolution research concentrates on recovering a single blurring kernel for the entire image. However, in the case of different motions, the blur cannot be modeled with a single kernel, and trying to deconvolve the entire image with the same kernel will cause serious artifacts. Thus, the task of deblurring needs to involve segmentation of the image into regions with different blurs. Our approach relies on the observation that the statistics of derivative filters in images are significantly changed by blur. Assuming the blur results from a constant velocity motion, we can limit the search to one-dimensional box filter blurs. This enables us to model the expected derivative distributions as a function of the width of the blur kernel. Those distributions are surprisingly powerful in discriminating regions with different blurs. The approach produces convincing deconvolution results on real-world images with rich texture.
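The observation that blur changes derivative statistics can be seen on a toy example. The sketch below blurs a one-dimensional random signal with box filters of increasing width and measures the variance of the first derivative after each blur. The white-noise signal, the widths, and the helper names `box_blur` and `derivative_variance` are assumptions for illustration; natural images have different statistics, and the paper models full derivative distributions rather than a single variance.

```python
import random

def box_blur(signal, width):
    """Average each run of `width` consecutive samples (valid region only)."""
    return [sum(signal[i:i + width]) / width
            for i in range(len(signal) - width + 1)]

def derivative_variance(signal):
    """Variance of the first differences of a 1-D signal."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)

# Synthetic "sharp" signal: independent Gaussian samples (an assumption).
rng = random.Random(0)
sharp = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Derivative variance for each candidate blur width.
variances = {width: derivative_variance(box_blur(sharp, width))
             for width in (1, 3, 9)}
```

For this kind of signal the derivative variance shrinks quickly as the box filter widens, so a region's derivative statistics carry information about the width of the blur applied to it.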