Segmentation of multivariate mixed data via lossy coding and compression
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
Cited by 46 (11 self)
In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data.
Index Terms: Multivariate mixed data, data segmentation, data clustering, rate distortion, lossy coding, lossy compression, image segmentation, microarray data clustering.
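The abstract does not state the coding-length function itself, so what follows is only a minimal Python sketch under an assumed form. For m zero-mean samples in n dimensions and allowable distortion eps, it takes L(V) = ((m + n) / 2) * log2 det(I + n / (eps^2 * m) * V V^T), and adds -|W_i| * log2(|W_i| / m) bits per group to record group membership. The function names and the toy data are illustrative and are not taken from the paper.

```python
import math


def log2_det_spd(matrix):
    """Return log2(det(matrix)) for a symmetric positive-definite matrix.

    Uses Gaussian elimination without pivoting; the determinant is the
    product of the pivots, which are all positive for an SPD matrix.
    """
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    total = 0.0
    for k in range(n):
        pivot = a[k][k]
        total += math.log2(pivot)
        for r in range(k + 1, n):
            factor = a[r][k] / pivot
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return total


def coding_length(points, eps):
    """Assumed coding length (in bits) of zero-mean points at distortion eps."""
    m = len(points)       # number of samples
    n = len(points[0])    # dimension of each sample
    scale = n / (eps * eps * m)
    # n x n matrix I + scale * sum_i v_i v_i^T
    mat = [[(1.0 if r == c else 0.0) + scale * sum(p[r] * p[c] for p in points)
            for c in range(n)] for r in range(n)]
    return 0.5 * (m + n) * log2_det_spd(mat)


def segmented_coding_length(groups, eps):
    """Sum of per-group coding lengths plus the bits needed for membership."""
    m = sum(len(g) for g in groups)
    total = 0.0
    for g in groups:
        total += coding_length(g, eps)
        total -= len(g) * math.log2(len(g) / m)
    return total


# Toy data (an assumption for illustration): two degenerate groups,
# one lying on the x-axis and one on the y-axis.
xs = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
line_x = [(float(x), 0.0) for x in xs]
line_y = [(0.0, float(x)) for x in xs]

for eps in (0.1, 100.0):
    joint = coding_length(line_x + line_y, eps)
    split = segmented_coding_length([line_x, line_y], eps)
    print(f"eps={eps}: joint={joint:.2f} bits, split={split:.2f} bits")
```

On this toy data, coding the two lines separately costs fewer bits at eps = 0.1 but more bits at eps = 100, which is in line with the abstract's remark that the number of segments changes with the level of distortion.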
A Unified Framework for Model-based Clustering
- Journal of Machine Learning Research
, 2003
Cited by 43 (6 self)
Model-based clustering techniques have been widely used and have shown promising results in many applications involving complex data. This paper presents a unified framework for probabilistic model-based clustering based on a bipartite graph view of data and models that highlights the commonalities and differences among existing model-based clustering algorithms. In this view, clusters are represented as probabilistic models in a model space that is conceptually separate from the data space. For partitional clustering, the view is conceptually similar to the Expectation-Maximization (EM) algorithm. For hierarchical clustering, the graph-based view helps to visualize critical/important distinctions between similarity-based approaches and model-based approaches.
Event Detection by Eigenvector Decomposition Using Object and Frame
, 2004
Cited by 22 (1 self)
We develop an event detection framework that has two significant advantages over past work. First, we introduce an extended set of time-wise and object-wise statistical features including not only the trajectories but also histograms and HMMs of speed, orientation, location, size, and aspect ratio. The proposed features are more expressive and enable detection of events that cannot be detected with trajectory-based features reported so far. Second, we introduce a spectral clustering method that can estimate the optimal number of clusters automatically. This novel clustering technique is not adversely affected by high dimensionality. Unlike the conventional approaches that fit predefined models to events, we determine unusual events by analyzing the conformity scores. We compute affinity matrices and apply eigenvalue decomposition to find clusters to obtain the usual events. We prove that the number of clusters governs the number of eigenvectors used to span the feature similarity space. We also improve the feature selection process.
Parts-based 3D object classification
, 2004
Cited by 21 (0 self)
This paper presents a parts-based method for classifying scenes of 3D objects into a set of pre-determined object classes. Working at the part level, as opposed to the whole object level, enables a more flexible class representation and allows scenes in which the query object is significantly occluded to be classified. In our approach, parts are extracted from training objects and grouped into part classes using a hierarchical clustering algorithm. Each part class is represented as a collection of semi-local shape features and can be used to perform part class recognition. A mapping from part classes to object classes is derived from the learned part classes and known object classes. At run-time, a 3D query scene is sampled, local shape features are computed, and the object class is determined using the learned part classes and the part-to-object mapping. The approach is demonstrated by classifying novel 3D scenes of vehicles into eight classes.
Unsupervised Segmentation of Natural Images via Lossy Data Compression
, 2007
Cited by 21 (2 self)
In this paper, we cast natural-image segmentation as a problem of clustering texture features as multivariate mixed data. We model the distribution of the texture features using a mixture of Gaussian distributions. Unlike most existing clustering methods, we allow the mixture components to be degenerate or nearly-degenerate. We contend that this assumption is particularly important for mid-level image segmentation, where degeneracy is typically introduced by using a common feature representation for different textures in an image. We show that such a mixture distribution can be effectively segmented by a simple agglomerative clustering algorithm derived from a lossy data compression approach. Using either 2D texture filter banks or simple fixed-size windows to obtain texture features, the algorithm effectively segments an image by minimizing the overall coding length of the feature vectors. We conduct comprehensive experiments to measure the performance of the algorithm in terms of visual evaluation and a variety of quantitative indices for image segmentation. The algorithm compares favorably against other well-known image-segmentation methods on the Berkeley image database.
Landscape of clustering algorithms
- In Proceedings of the 17th International Conference on Pattern Recognition (ICPR)
, 2004
Cited by 15 (1 self)
Numerous clustering algorithms, their taxonomies and evaluation studies are available in the literature. Despite the diversity of different clustering algorithms, solutions delivered by these algorithms exhibit many commonalities. An analysis of the similarity and properties of clustering objective functions is necessary from the operational/user perspective. We revisit conventional categorization of clustering algorithms and attempt to relate them according to the partitions they produce. We empirically study the similarity of clustering solutions obtained by many traditional as well as relatively recent clustering algorithms on a number of real-world data sets. Sammon’s mapping and a complete-link clustering of the inter-clustering dissimilarity values are performed to detect a meaningful grouping of the objective functions. We find that only a small number of clustering algorithms are sufficient to represent a large spectrum of clustering criteria. For example, interesting groups of clustering algorithms are centered around the graph partitioning, linkage-based and Gaussian mixture model based algorithms.
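The abstract does not say which dissimilarity between clustering solutions was used. As one hedged illustration of comparing algorithms "according to the partitions they produce", the Python sketch below assumes 1 minus the Rand index, a standard pair-counting measure, and builds the matrix of dissimilarities between every pair of solutions. The function names and the label vectors are hypothetical.

```python
from itertools import combinations


def rand_dissimilarity(labels_a, labels_b):
    """Return 1 - Rand index between two labelings of the same objects.

    The Rand index is the fraction of object pairs on which the two
    partitions agree: both put the pair together, or both keep it apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return 1.0 - agree / len(pairs)


def dissimilarity_matrix(solutions):
    """Pairwise dissimilarities between a list of clustering solutions."""
    return [[rand_dissimilarity(a, b) for b in solutions] for a in solutions]


# Hypothetical outputs of three algorithms on the same four objects.
solutions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first, with labels swapped
    [0, 1, 0, 1],
]
for row in dissimilarity_matrix(solutions):
    print([round(d, 3) for d in row])
```

A matrix like this could then be passed to a mapping or linkage step to group the algorithms themselves; that step is not shown here.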
Soft clustering on graphs
- in Advances in Neural Information Processing Systems
, 2005
Cited by 14 (0 self)
We propose a simple clustering framework on graphs that encode pairwise data similarities. Unlike usual similarity-based methods, the approach softly assigns data to clusters in a probabilistic way. More importantly, a hierarchical clustering is naturally derived in this framework to gradually merge lower-level clusters into higher-level ones. A random walk analysis indicates that the algorithm exposes clustering structures in various resolutions, i.e., a higher level statistically models a longer-term diffusion on graphs and thus discovers a more global clustering structure. Finally we provide very encouraging experimental results.
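The link between longer-term diffusion and more global structure can be illustrated with a plain random walk on a similarity graph. The Python sketch below is not the paper's soft-clustering algorithm. It only assumes the usual row-normalized transition matrix P = D^{-1} W and looks at the t-step probabilities P^t on a hypothetical six-node graph made of two tight triangles joined by one weak edge.

```python
def transition_matrix(weights):
    """Row-normalize a nonnegative similarity matrix W into P = D^{-1} W."""
    return [[w / sum(row) for w in row] for row in weights]


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def diffuse(p, steps):
    """Return the steps-step transition matrix P^steps (steps >= 1)."""
    result = p
    for _ in range(steps - 1):
        result = mat_mul(result, p)
    return result


def escape_mass(p, start, targets, steps):
    """Probability that a walk from `start` is in `targets` after `steps`."""
    row = diffuse(p, steps)[start]
    return sum(row[j] for j in targets)


# Hypothetical graph: nodes 0-2 and 3-5 form two triangles with unit
# weights; a single weak edge of weight 0.1 joins node 2 and node 3.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
P = transition_matrix(W)

for t in (1, 5, 50):
    print(t, round(escape_mass(P, 0, [3, 4, 5], t), 4))
```

At short horizons a walk started at node 0 stays almost entirely inside its own triangle, so the two tight groups are resolved. At long horizons the probability mass spreads across the weak edge, and the walk sees the graph as a single, more global group.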
Likelihood based hierarchical clustering
- IEEE Trans. on Signal Processing
, 2004
Cited by 12 (5 self)
This paper develops a new method for hierarchical clustering. Unlike other existing clustering schemes, our method is based on a generative, tree-structured model that represents relationships between the objects to be clustered, rather than directly modeling properties of objects themselves. In certain problems, this generative model naturally captures the physical mechanisms responsible for relationships among objects, for example, in certain evolutionary tree problems in genetics and communication network topology identification. The paper examines the networking problem in some detail, to illustrate the new clustering method. More broadly, the generative model may not reflect actual physical mechanisms, but it nonetheless provides a means for dealing with errors in the similarity matrix, simultaneously promoting two desirable features in clustering: intra-class similarity and inter-class dissimilarity.
Computational Models of Perceptual Organization
- Robotics Institute, Carnegie Mellon University
, 2003
Cited by 3 (0 self)
Perceptual organization refers to the process of organizing sensory input into coherent and interpretable perceptual structures. This process is challenging due to the chicken-and-egg relationship between the various sub-processes such as image segmentation, figure-ground segregation and object recognition. Low-level processing requires the guidance of high-level knowledge to overcome noise, while high-level processing relies on low-level processes to reduce the computational complexity. Neither process can be sufficient on its own. Consequently, any system that carries out these processes in a sequence is bound to be brittle. An alternative system is one in which all processes interact with each other simultaneously. In this thesis, we develop a set of simple yet realistic interactive processing models for perceptual organization. We model the processing in the framework of spectral graph theory, with a criterion encoding the overall goodness of perceptual organization. We derive fast solutions for near-global optima of the criterion, and demonstrate the efficacy of the models on segmenting a wide range of real images. Through these models, we are able to capture a variety of perceptual phenomena: a unified treatment of various grouping, figure-ground and depth cues to produce popout, region segmentation and depth segregation in one step; and a unified framework for integrating bottom-up and top-down information to produce an object segmentation from spatial and object attention. We achieve these goals by empowering current spectral graph methods with a principled solution for multiclass spectral graph partitioning; an expanded repertoire of grouping cues to include similarity, dissimilarity and ordering relationships; a theory for integrating sparse grouping cues; and a model ...

