Results 1 - 10 of 32
Data Clustering: A Review
- ACM Computing Surveys, 1999
Cited by 912 (9 self)
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Discriminant Analysis by Gaussian Mixtures
- Journal of the Royal Statistical Society, Series B, 1996
Cited by 124 (9 self)
Fisher-Rao linear discriminant analysis (LDA) is a valuable tool for multigroup classification. LDA is equivalent to maximum likelihood classification assuming Gaussian distributions for each class. In this paper, we fit Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered. Low dimensional views are an important by-product of LDA; our new techniques inherit this feature. We are able to control the within-class spread of the subclass centers relative to the between-class spread. Our technique for fitting these models permits a natural blend with nonparametric versions of LDA.
Keywords: Classification, Pattern Recognition, Clustering, Nonparametric, Penalized.
1 Introduction
In the generic classification or discrimination problem, the outcome of interest G falls into J unordered classes, which for convenience we denote by the set $\mathcal{J} = \{1, 2, 3, \ldots, J\}$. We wish to build a rule for pred...
A Survey of Fuzzy Clustering Algorithms for Pattern Recognition
- 1998
Cited by 38 (2 self)
Clustering algorithms aim at modelling fuzzy (i.e., ambiguous) unlabeled patterns efficiently. Our goal is to propose a theoretical framework where clustering systems can be compared on the basis of their learning strategies. In the first part of this work, the following issues are reviewed: relative (probabilistic) and absolute (possibilistic) fuzzy membership functions and their relationships to the Bayes rule, batch and on-line learning, growing and pruning networks, modular network architectures, topologically perfect mapping, ecological nets and neuro-fuzziness. From this discussion an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed as a unifying framework in the comparison of clustering systems. Moreover, a set of functional attributes is selected for use as dictionary entries in our comparison. In the second part of this paper, five clustering algorithms taken from the literature are reviewed and compared on...
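The distinction between relative (probabilistic) and absolute (possibilistic) memberships mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means membership form and a Krishnapuram-Keller style possibilistic form; these are textbook formulas, not necessarily the exact ones used in the surveyed paper. The function names, the fuzzifier `m`, and the per-cluster scale `etas` are hypothetical names chosen for this example.

```python
def probabilistic_memberships(sq_dists, m=2.0):
    """Relative memberships of one pattern: they always sum to 1 across clusters.

    sq_dists: squared distances from the pattern to each cluster prototype.
    m: fuzzifier (> 1); an assumed parameter name.
    """
    zeros = [d == 0.0 for d in sq_dists]
    if any(zeros):
        # The pattern coincides with one or more prototypes: share membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    inv = [d ** (-1.0 / (m - 1.0)) for d in sq_dists]
    total = sum(inv)
    return [v / total for v in inv]


def possibilistic_memberships(sq_dists, etas, m=2.0):
    """Absolute memberships: each depends only on the distance to its own cluster.

    etas: per-cluster scale parameters (assumed given); values need not sum to 1.
    """
    return [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0)))
            for d, eta in zip(sq_dists, etas)]


# A pattern far from both prototypes (an outlier candidate).
far = [100.0, 100.0]
relative = probabilistic_memberships(far)
absolute = possibilistic_memberships(far, etas=[1.0, 1.0])
print(relative, absolute)
```

For the distant pattern, the relative memberships stay at one half each because they are forced to sum to one, while the absolute memberships are both close to zero. This is the property that makes possibilistic memberships useful for flagging outliers.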
Growing radial basis neural networks: Merging supervised and unsupervised learning with network growth techniques
- IEEE Transactions on Neural Networks, 1997
Cited by 36 (1 self)
This paper proposes a framework for constructing and training radial basis function (RBF) neural networks. The proposed growing radial basis function (GRBF) network begins with a small number of prototypes, which determine the locations of radial basis functions. In the process of training, the GRBF network grows by splitting one of the prototypes at each growing cycle. Two splitting criteria are proposed to determine which prototype to split in each growing cycle. The proposed hybrid learning scheme provides a framework for incorporating existing algorithms in the training of GRBF networks. These include unsupervised algorithms for clustering and learning vector quantization, as well as learning algorithms for training single-layer linear neural networks. A supervised learning scheme based on the minimization of the localized class-conditional variance is also proposed and tested. GRBF neural networks are evaluated and tested on a variety of data sets with very satisfactory results.
Index Terms: Class-conditional variance, network growing, radial basis neural network, radial basis function, splitting criterion,
The Enhanced LBG Algorithm
- 2001
Cited by 17 (1 self)
Clustering applications cover several fields such as audio and video data compression, pattern recognition, computer vision, medical image recognition, etc. In this paper we present a new clustering algorithm called Enhanced LBG (ELBG). It belongs to the hard and K-means vector quantization groups and derives directly from the simpler LBG. The basic idea we have developed is the concept of utility of a codeword, a powerful instrument to overcome one of the main drawbacks of clustering algorithms: generally, the results achieved are not good in the case of a bad choice of the initial codebook. We will present our experimental results showing that ELBG is able to find better codebooks than previous clustering techniques and the computational complexity is virtually the same as the simpler LBG.
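For orientation, the following is a minimal sketch of the plain LBG (generalized Lloyd) iteration that the abstract calls "the simpler LBG". It is not ELBG: the utility-based codeword shifting described by the authors is omitted. The function names, the squared-error distortion, and the stopping rule are assumptions made for this example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_distortion(points, codebook):
    """Average squared distance from each point to its nearest codeword."""
    return sum(min(squared_distance(p, c) for c in codebook)
               for p in points) / len(points)


def lbg(points, codebook, max_iters=100, tol=1e-12):
    """Plain LBG iteration (hypothetical helper). Returns (codebook, distortion)."""
    codebook = [tuple(c) for c in codebook]
    previous = float("inf")
    distortion = mean_distortion(points, codebook)
    for _ in range(max_iters):
        # Assignment step: put every point in the cell of its nearest codeword.
        cells = [[] for _ in codebook]
        for p in points:
            k = min(range(len(codebook)),
                    key=lambda i: squared_distance(p, codebook[i]))
            cells[k].append(p)
        # Update step: move each codeword to the centroid of its cell.
        # An empty cell keeps its old codeword (an assumption of this sketch).
        codebook = [
            tuple(sum(coord) / len(cell) for coord in zip(*cell)) if cell else c
            for cell, c in zip(cells, codebook)
        ]
        distortion = mean_distortion(points, codebook)
        if previous - distortion <= tol:
            break
        previous = distortion
    return codebook, distortion


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = [(0, 0), (10, 10)]
final_codebook, final_distortion = lbg(points, initial)
print(final_codebook, final_distortion)
```

Because each step can only lower or keep the distortion, the iteration stops at a local minimum that depends on the initial codebook. That dependence is the drawback the abstract says ELBG is designed to reduce.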
Constructive Feedforward ART Clustering Networks - Part II
- 2002
Cited by 16 (0 self)
Part I of this paper defines the class of constructive unsupervised on-line learning simplified adaptive resonance theory (SART) clustering networks. Proposed instances of class SART are the symmetric Fuzzy ART (S-Fuzzy ART) and the Gaussian ART (GART) network. In Part II of our work, a third network belonging to class SART, termed fully self-organizing SART (FOSART), is presented and discussed. FOSART is a constructive, soft-to-hard competitive, topology-preserving, minimum-distance-to-means clustering algorithm capable of: 1) generating processing units and lateral connections on an example-driven basis and 2) removing processing units and lateral connections on a minibatch basis. FOSART is compared with Fuzzy ART, S-Fuzzy ART, GART and other well-known clustering techniques (e.g., neural gas and self-organizing map) in several unsupervised learning tasks, such as vector quantization, perceptual grouping and 3-D surface reconstruction. These experiments prove that when compared with other unsupervised learning networks, FOSART provides an interesting balance between easy user interaction, performance accuracy, efficiency, robustness, and flexibility.
Scale-based Clustering using the Radial Basis Function Network
- IEEE Trans. Neural Networks, 1996
Cited by 16 (3 self)
This paper shows how scale-based clustering can be done using the Radial Basis Function (RBF) Network, with the RBF width as the scale parameter and a dummy target as the desired output. The technique suggests the "right" scale at which the given data set should be clustered, thereby providing a solution to the problem of determining the number of RBF units and the widths required to get a good network solution. The network compares favorably with other standard techniques on benchmark clustering examples. Properties that are required of non-Gaussian basis functions, if they are to serve in alternative clustering networks, are identified. The work on the whole points out an important role played by the width parameter in RBFN, when observed over several scales, and provides a fundamental link to the scale space theory developed in computational vision. The work described here is supported in part by the National Science Foundation under grant ECS-9307632 and in part by ONR Contract N...
Image compression with neural networks - A survey
- Signal Processing: Image Communication 14, 1999
Cited by 15 (2 self)
Apart from the existing technology on image compression represented by the series of JPEG, MPEG and H.26x standards, new technologies such as neural networks and genetic algorithms are being developed to explore the future of image coding. Successful applications of neural networks to vector quantization have now become well established, and other aspects of neural network involvement in this area are stepping up to play significant roles in assisting with those traditional technologies. This paper presents an extensive survey on the development of neural networks for image compression which covers three categories: direct image compression by neural networks; neural network implementation of existing techniques; and neural network based technology which provides improvement over traditional algorithms. © 1999 Elsevier Science B.V. All rights reserved.
Simplified ART: A new class of ART algorithms
- 1998
Cited by 5 (4 self)
The Simplified Adaptive Resonance Theory (SART) class of networks is proposed to handle problems encountered in Adaptive Resonance Theory 1 (ART 1)-based algorithms when detection of binary and analog patterns is performed. The basic idea of SART is to substitute ART 1-based "unidirectional" (asymmetric) activation and match functions with "bidirectional" (symmetric) function pairs. This substitution makes the class of SART algorithms potentially more robust and less time-consuming than ART 1-based systems. One SART algorithm, termed Fuzzy SART, is discussed. Fuzzy SART employs probabilistic and possibilistic fuzzy membership functions to combine soft competitive learning with outlier detection. Its soft competitive strategy relates Fuzzy SART to the well-known Self-Organizing Map and Neural Gas clustering algorithm. A new Normalized Vector Distance, which can be employed by Fuzzy SART, is also presented. Fuzzy SART performs better than ART 1-based Carpenter-Grossberg-Rosen Fuzzy ART ...
An Algorithm for Unsupervised Learning via Normal Mixture Models
- In ISIS: Information, Statistics and Induction in Science, 1996
Cited by 4 (2 self)
We consider the approach to unsupervised learning whereby a normal mixture model is fitted to the data by maximum likelihood. An algorithm called NMM is presented that enables the normal mixture model with either restricted or unrestricted component covariance matrices to be fitted to a given data set. The algorithm automatically handles the problem of the specification of initial values for the parameters in the iterative fitting of the model within the framework of the EM algorithm. The algorithm also has the provision to carry out a test for the number of components on the basis of the likelihood ratio statistic.
Keywords: Mixture models, Maximum likelihood, EM algorithm, Likelihood ratio test.
Area of Interest: Concept Formation and Classification.
1 Introduction
In this paper we consider the development of an algorithm for the fitting of a normal mixture model in the absence of data on entities that have been classified with respect to the components of the mixture. This is usual...
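As a rough illustration of maximum likelihood fitting of a normal mixture by EM, here is a minimal univariate sketch. It is not the NMM algorithm: the automatic choice of starting values and the likelihood ratio test described in the abstract are not reproduced, and the caller must supply initial parameters. The function names, the variance floor, and the stopping rule are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_normal_mixture(data, means, variances, weights, n_iters=200, tol=1e-8):
    """EM for a univariate normal mixture (hypothetical helper).

    Returns (weights, means, variances, log_likelihood), where the log-likelihood
    is the value computed in the last E-step.
    """
    n = len(data)
    means, variances, weights = list(means), list(variances), list(weights)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(n_iters):
        # E-step: posterior probability of each component for each observation.
        resp = []
        ll = 0.0
        for x in data:
            joint = [w * normal_pdf(x, mu, v)
                     for w, mu, v in zip(weights, means, variances)]
            total = sum(joint)
            ll += math.log(total)
            resp.append([j / total for j in joint])
        # M-step: weighted updates of mixing proportions, means and variances.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid degenerate components (assumption)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weights, means, variances, ll


data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
fit_weights, fit_means, fit_vars, fit_ll = em_normal_mixture(
    data, means=[0.5, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(fit_weights, fit_means, fit_vars)
```

The sketch makes the sensitivity to starting values visible: a different choice of initial `means` can lead EM to a different local maximum of the likelihood, which is the problem the abstract says NMM handles automatically.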

