Data Clustering: A Review
- ACM COMPUTING SURVEYS
, 1999
Cited by 912 (9 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Unsupervised Learning from Dyadic Data
, 1998
Cited by 89 (9 self)
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event co-occurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data and structural information about data-inherent grouping structure. We propose an annealed version of the standard Expectation Maximization algorithm for model fitting, which is empirically evaluated on a variety of data sets from different domains.
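The annealed EM idea in this abstract can be illustrated with a minimal sketch. It assumes a flat latent class model in which each co-occurrence count n(x, y) is explained by P(z) P(x|z) P(y|z), and it tempers the E-step posterior with an inverse temperature `beta` that rises to 1. The model form, the function name `tempered_em`, the schedule `betas`, and all parameter values are assumptions made for illustration; none of them are taken from the abstract.

```python
import numpy as np

def tempered_em(counts, k, betas=(0.5, 0.75, 1.0), iters=50, seed=0):
    """Fit a flat latent class model to a co-occurrence matrix (illustrative).

    counts[x, y] is the number of times the dyad (x, y) was observed.
    Returns P(z), P(x|z) with shape (k, n_x), and P(y|z) with shape (k, n_y).
    """
    rng = np.random.default_rng(seed)
    n_x, n_y = counts.shape
    pz = np.full(k, 1.0 / k)
    px_z = rng.dirichlet(np.ones(n_x), size=k)  # random start, rows sum to 1
    py_z = rng.dirichlet(np.ones(n_y), size=k)
    for beta in betas:  # assumed schedule: inverse temperature rises to 1
        for _ in range(iters):
            # E-step: posterior over z for every dyad, tempered by beta.
            joint = pz[:, None, None] * px_z[:, :, None] * py_z[:, None, :]
            post = joint ** beta
            post /= post.sum(axis=0, keepdims=True) + 1e-300
            # M-step: re-estimate parameters from expected counts.
            weighted = post * counts[None, :, :]
            mass = np.maximum(weighted.sum(axis=(1, 2)), 1e-300)
            px_z = weighted.sum(axis=2) / mass[:, None]
            py_z = weighted.sum(axis=1) / mass[:, None]
            pz = mass / mass.sum()
    return pz, px_z, py_z
```

With `beta = 1` the E-step reduces to standard EM; smaller values flatten the posteriors, which is the usual motivation for annealing the fit.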
Self Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization
, 2001
Cited by 64 (4 self)
We present a stochastic clustering algorithm which uses pairwise similarity of elements, and show how it can be used to address various problems in computer vision, including low-level image segmentation, mid-level perceptual grouping, and high-level image database organization. The clustering problem is viewed as a graph partitioning problem, where nodes represent data elements and the weights of the edges represent pairwise similarities. We generate samples of cuts in this graph using Karger's contraction algorithm, and compute an "average" cut which provides the basis for our solution to the clustering problem. The stochastic nature of our method makes it robust against noise, including accidental edges and small spurious clusters. The complexity of our algorithm is very low: O(|E| log² N) for N objects, |E| similarity relations and a fixed accuracy level. In addition, and without additional computational cost, our algorithm provides a hierarchy of nested partitions. We demonstrate the superiority of our method for image segmentation on a few synthetic and real images, B&W and color. Our other examples include the concatenation of edges in a cluttered scene (perceptual grouping), and the organization of an image database for the purpose of multi-view 3D object recognition.
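The sampling step can be sketched as follows. This is a simplified illustration and not the paper's algorithm: it assumes an unweighted graph with distinct edges, contracts edges in a uniformly random order (a standard way to implement Karger-style contraction), and summarizes the sampled cuts by how often the two endpoints of each edge end up on the same side. The names `contract` and `same_side_frequency` are hypothetical.

```python
import random

def contract(edges, num_nodes, target, rng):
    """Contract edges in random order until `target` components remain."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = num_nodes
    order = edges[:]
    rng.shuffle(order)
    for u, v in order:
        if remaining == target:
            break
        ru, rv = find(u), find(v)
        if ru != rv:  # skip edges that became self-loops
            parent[ru] = rv
            remaining -= 1
    return [find(i) for i in range(num_nodes)]

def same_side_frequency(edges, num_nodes, target=2, samples=200, seed=0):
    """Fraction of sampled cuts in which each edge is NOT cut."""
    rng = random.Random(seed)
    counts = {e: 0 for e in edges}
    for _ in range(samples):
        labels = contract(edges, num_nodes, target, rng)
        for u, v in edges:
            if labels[u] == labels[v]:
                counts[(u, v)] += 1
    return {e: c / samples for e, c in counts.items()}
```

On a graph made of two triangles joined by a single bridge edge, the bridge is cut more often than any edge inside a triangle, so thresholding these frequencies tends to separate the two groups.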
Efficient Graph-Based Energy Minimization Methods In Computer Vision
, 1999
Cited by 63 (4 self)
ms (we show that exact minimization is NP-hard in these cases). These algorithms produce a local minimum in interesting large move spaces. Furthermore, one of them finds a solution within a known factor from the optimum. The algorithms are iterative and compute several graph cuts at each iteration. The running time at each iteration is effectively linear due to the special graph structure. In practice it takes just a few iterations to converge. Moreover, most of the progress happens during the first iteration. For a certain piecewise constant prior we adapt the algorithms developed for the piecewise smooth prior. One of them finds a solution within a factor of two from the optimum. In addition we develop a third algorithm which finds a local minimum in yet another move space. We demonstrate the effectiveness of our approach on image restoration, stereo, and motion. For the data with ground truth, our methods significantly outperform standard methods.
A New Graph-Theoretic Approach to Clustering, with Applications to Computer Vision
, 2004
Cited by 37 (4 self)
This work applies cluster analysis as a unified approach for a wide range of vision applications, thereby combining the research domain of computer vision and that of machine learning. Cluster analysis is the formal study of algorithms and methods for recovering the inherent structure within a given dataset. Many problems of computer vision have precisely this goal, namely to find which visual entities belong to an inherent structure, e.g. in an image or in a database of images. For example, a meaningful structure in the context of image segmentation is a set of pixels which correspond to the same object in a scene. Clustering algorithms can be used to partition the pixels of an image into meaningful parts, which may correspond to different objects. In this work we focus on the problems of image segmentation and image database organization. The visual entities to consider are pixels and images, respectively. Our first contribution in this work is a novel partitional (flat) clustering algorithm. The algorithm uses pairwise representation, where the visual objects (pixels,
Outex - New framework for empirical evaluation of texture analysis algorithms
- Proc. 16th International Conference on Pattern Recognition
, 2002
Cited by 37 (9 self)
This paper presents the current status of a new initiative aimed at developing a versatile framework and image database for empirical evaluation of texture analysis algorithms. The proposed Outex framework contains a large collection of surface textures captured under different conditions, which facilitates construction of a wide range of texture analysis problems. The problems are encapsulated into test suites, for which baseline results obtained with algorithms from literature are provided. The rich functionality of the framework is demonstrated with examples in texture classification, segmentation and retrieval. The framework has a web site for public dissemination of the database and comparative results obtained by research groups world wide.
Segmentation given partial grouping constraints
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2004
Cited by 37 (2 self)
We consider data clustering problems where partial grouping is known a priori. We formulate such biased grouping problems as a constrained optimization problem, where structural properties of the data define the goodness of a grouping and partial grouping cues define the feasibility of a grouping. We enforce grouping smoothness and fairness on labeled data points so that sparse partial grouping information can be effectively propagated to the unlabeled data. Considering the normalized cuts criterion in particular, our formulation leads to a constrained eigenvalue problem. By generalizing the Rayleigh-Ritz theorem to projected matrices, we find the global optimum in the relaxed continuous domain by eigendecomposition, from which a near-global optimum to the discrete labeling problem can be obtained effectively. We apply our method to real image segmentation problems, where partial grouping priors can often be derived based on a crude spatial attentional map that binds places with common salient features or focuses on expected object locations. We demonstrate not only that it is possible to integrate both image structures and priors in a single grouping process, but also that objects can be segregated from the background without specific object knowledge.
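A minimal sketch of a relaxed, constrained normalized cut is given below. It does not reproduce the projected-matrix generalization of the Rayleigh-Ritz theorem mentioned in the abstract. Instead, as a swapped-in technique, it restricts the relaxed problem to the subspace that satisfies the constraints and solves a generalized eigenproblem there. Partial grouping is assumed to arrive as must-link pairs (i, j), encoded as x_i - x_j = 0. The function name `constrained_ncut` and this encoding are assumptions for illustration.

```python
import numpy as np

def constrained_ncut(W, must_link):
    """Relaxed two-way normalized cut with must-link pairs (illustrative).

    W is a symmetric affinity matrix with positive row sums.
    Returns the relaxed solution x and binary labels from its sign.
    """
    n = W.shape[0]
    D = np.diag(W.sum(axis=1))
    L = D - W
    # One linear constraint x_i - x_j = 0 per must-link pair.
    U = np.zeros((n, len(must_link)))
    for k, (i, j) in enumerate(must_link):
        U[i, k], U[j, k] = 1.0, -1.0
    # Orthonormal basis N of the feasible subspace {x : U^T x = 0}.
    if len(must_link):
        _, s, Vt = np.linalg.svd(U.T, full_matrices=True)
        rank = int((s > 1e-10).sum())
        N = Vt[rank:].T
    else:
        N = np.eye(n)
    # Generalized eigenproblem (N^T L N) z = lambda (N^T D N) z,
    # reduced to a standard symmetric one via a Cholesky factor.
    A = N.T @ L @ N
    B = N.T @ D @ N
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    _, vecs = np.linalg.eigh((M + M.T) / 2)
    # The smallest eigenvector is the trivial constant solution; use the next.
    x = N @ (C_inv.T @ vecs[:, 1])
    return x, (x > 0).astype(int)
```

Thresholding the relaxed solution at zero is the simplest way to discretize it; other rounding rules are possible and may give better cuts.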
Histogram Clustering for Unsupervised Segmentation and Image Retrieval
- Pattern Recognition Letters
, 1998
Cited by 34 (13 self)
This paper introduces a novel statistical latent class model for probabilistic grouping of distributional and histogram data. Adopting the Bayesian framework, we propose to perform annealed maximum a posteriori estimation to compute optimal clustering solutions. In order to accelerate the optimization process, an efficient multiscale formulation is developed. We present a prototypical application of this method for unsupervised segmentation of textured images based on local distributions of Gabor coefficients. Benchmark results indicate superior performance compared to K-means clustering and proximity-based algorithms. In a second application the histogram clustering method is utilized to structure image databases for improved image retrieval. Key words: Histogram Clustering, Texture Segmentation, Multiscale Annealing, Image Retrieval 1 Introduction Grouping, segmentation, coarsening, and quantization are central and omnipresent topics in image processing and computer vision. I...
Blind motion deblurring using image statistics
- In Advances in Neural Information Processing Systems (NIPS
Cited by 31 (3 self)
We address the problem of blind motion deblurring from a single image, caused by a few moving objects. In such situations only part of the image may be blurred, and the scene consists of layers blurred to different degrees. Most existing blind deconvolution research concentrates on recovering a single blurring kernel for the entire image. However, in the case of different motions, the blur cannot be modeled with a single kernel, and trying to deconvolve the entire image with the same kernel will cause serious artifacts. Thus, the task of deblurring needs to involve segmentation of the image into regions with different blurs. Our approach relies on the observation that the statistics of derivative filters in images are significantly changed by blur. Assuming the blur results from a constant velocity motion, we can limit the search to one-dimensional box filter blurs. This enables us to model the expected derivative distributions as a function of the width of the blur kernel. Those distributions are surprisingly powerful in discriminating regions with different blurs. The approach produces convincing deconvolution results on real world images with rich texture.
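The central observation, that derivative statistics change with the width of a box blur, can be checked on synthetic data. The sketch below is a one-dimensional toy and not the paper's method: it assumes a piecewise-constant signal with sparse random jumps as a stand-in for an image row, blurs it with box filters of several widths, and reports the variance of the first derivative for each width. The function names, the jump probability, and the Laplace-distributed jump sizes are all assumptions.

```python
import numpy as np

def make_step_signal(n=20000, jump_prob=0.05, seed=0):
    """Piecewise-constant signal with sparse jumps (toy image row)."""
    rng = np.random.default_rng(seed)
    jumps = rng.laplace(size=n) * (rng.random(n) < jump_prob)
    return np.cumsum(jumps)

def derivative_variance_by_blur(signal, widths):
    """Variance of the first derivative after box blurs of given widths."""
    out = {}
    for w in widths:
        kernel = np.ones(w) / w  # 1-D box filter of width w
        blurred = np.convolve(signal, kernel, mode="valid")
        out[w] = float(np.var(np.diff(blurred)))
    return out
```

Wider box filters spread each jump over more samples, so the derivative variance shrinks as the width grows. A comparison of measured derivative statistics against such width-dependent expectations is one way a region's blur width could be estimated.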

