## Parametric and Non-parametric Unsupervised Cluster Analysis (1996)

Venue: | Pattern Recognition |

Citations: | 55 - 6 self |

### BibTeX

@ARTICLE{Roberts96parametricand,

author = {Stephen J. Roberts},

title = {Parametric and Non-parametric Unsupervised Cluster Analysis},

journal = {Pattern Recognition},

year = {1996},

volume = {30},

pages = {261--272}

}

### Abstract

Much work has been published on methods for assessing the probable number of clusters or structures within unknown data sets. This paper aims to look in more detail at two methods: a broad parametric method, based around the assumption of Gaussian clusters, and a non-parametric method which utilises scale-space filtering to extract robust structures within a data set. It is shown that, whilst both methods are capable of determining cluster validity for data sets in which clusters tend towards a multivariate Gaussian distribution, the parametric method inevitably fails for clusters which have a non-Gaussian structure, whilst the scale-space method is more robust.

Key words: Cluster analysis, maximum likelihood methods, scale-space filtering, probability density estimation.

1 Introduction

Most scientific disciplines generate experimental data from an observed system about which we may have little understanding of the data generating function. The notion that com...
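As a rough illustration of the scale-space idea mentioned in the abstract, the sketch below smooths a one-dimensional data set with Gaussian (Parzen-window) kernels of increasing width `s` and counts the local maxima of the resulting density estimate. It is not the paper's algorithm: the function names (`parzen_density`, `count_modes`, `modes_over_scales`), the synthetic data, the grid search and the chosen scales are all assumptions made for this example.

```python
import numpy as np

def parzen_density(grid, data, s):
    """Gaussian-kernel (Parzen-window) density estimate evaluated on `grid`."""
    diffs = (grid[:, None] - data[None, :]) / s
    kernels = np.exp(-0.5 * diffs**2) / (s * np.sqrt(2.0 * np.pi))
    # Equal weighting of every kernel, normalised so the estimate integrates to 1.
    return kernels.mean(axis=1)

def count_modes(density):
    """Count strict interior local maxima of a sampled 1-D density."""
    interior = density[1:-1]
    is_peak = (interior > density[:-2]) & (interior > density[2:])
    return int(is_peak.sum())

def modes_over_scales(data, scales, grid_size=512):
    """Number of density modes found at each kernel width in `scales`."""
    pad = 3.0 * max(scales)  # hypothetical padding so the tails are covered
    grid = np.linspace(data.min() - pad, data.max() + pad, grid_size)
    return [count_modes(parzen_density(grid, data, s)) for s in scales]

# Synthetic example (an assumption, not data from the paper):
# three well-separated Gaussian clusters on a line.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(c, 0.5, 300) for c in (-6.0, 0.0, 6.0)])
scales = [1.0, 1.5, 2.0, 8.0]
mode_counts = modes_over_scales(data, scales)
print(dict(zip(scales, mode_counts)))
```

For clusters this well separated, one would expect the mode count to stay at the number of clusters over a range of small scales and to fall to a single mode once `s` is large compared with the cluster spacing. A count that persists across scales is the kind of robust structure the abstract refers to.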

### Citations

9193 | Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird, et al. - 1977 |

3867 | Fuzzy sets - Zadeh - 1965 |
Citation Context: ...uster with a large number of members more rather than recognising one with a small number [1]. The use of 'fuzzy' cluster methods (whereby a datum's membership may be distributed over many clusters) [6] may be incorporated into the genre of partitional methods with relative ease and has proved popular [7; 8; 9; 10; 11; 12]. Such an extension to the modelling process is certainly more representative o...

2365 | Algorithms for Clustering Data - Jain, Dubes - 1988 |

1713 | Clustering Algorithms - Hartigan - 1975 |
Citation Context: ...methods must often be tried from the analysis 'toolbox' before data structure may be inferred. Excellent reviews of many methods may be found in, for example, Jain and Dubes [2], Jain [1], Hartigan [3] and Everitt [4]. (¹ up to 1982.) On a broad, descriptive level cluster analysis algorithms can be broken into two distinct phases.² Firstly, a model fitting phase, whereby some partition hypothesis o...

1477 | Pattern Recognition with Fuzzy Objective Function Algorithms - Bezdek - 1981 |
Citation Context: ...of 'fuzzy' cluster methods (whereby a datum's membership may be distributed over many clusters) [6] may be incorporated into the genre of partitional methods with relative ease and has proved popular [7; 8; 9; 10; 11; 12]. Such an extension to the modelling process is certainly more representative of 'real' data. There are still, however, difficulties encountered. The 'optimal' number of clusters must be estimated, the...

560 | Cluster Analysis - Everitt - 1993 |
Citation Context: ...en be tried from the analysis 'toolbox' before data structure may be inferred. Excellent reviews of many methods may be found in, for example, Jain and Dubes [2], Jain [1], Hartigan [3] and Everitt [4]. (¹ up to 1982.) On a broad, descriptive level cluster analysis algorithms can be broken into two distinct phases.² Firstly, a model fitting phase, whereby some partition hypothesis of complexity K, ...

504 | Unsupervised texture segmentation using Gabor filters - Jain, Farrokhnia - 1991 |
Citation Context: ...nts within the image is unknown (in the majority of cases) a priori. To this end, much effort has been directed at the application of unsupervised clustering methods to features extracted from images [22; 23; 24; 25; 26]. It is unreasonable to expect, however, for feature-space data sets, constructed from the image, to have a simple hyper-ellipsoidal cluster structure, and hence the use of algorithms such as K-means...

452 | A Nonlinear Mapping for Data Structure Analysis - Sammon - 1969 |
Citation Context: ...paper on a simple data set consisting of five Gaussian clusters each of 1000 data. Each x is 5-dimensional, but for ease of visualisation the data is projected to a 2-D space using the Sammon mapping [19]. Figure (3a) shows the data set used in this experiment. Plot (b) shows the variation of the likelihood density parameter, as defined in Equation (15), for both the ML (o) and K-means algorithms (+)...

282 | Digital Image Processing - Pratt - 1978 |
Citation Context: ... image segments. The second test image is the (well-known) 'house' image, configured here as a 128 × 128 greyscale image and shown in Figure (9a). Two of the Laws microstructure texture measures [27; 29; 30] were obtained for each pixel within the image calculated via application of the following 3 × 3 masks. $L_1 = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$, $L_3 = \begin{bmatrix} -1 & 2 & -1 \\ -2 & 4 & -2 \\ -1 & \cdots \end{bmatrix}$ ...

266 | Unsupervised Optimal Fuzzy Clustering - Geva - 1989 |
Citation Context: ...of 'fuzzy' cluster methods (whereby a datum's membership may be distributed over many clusters) [6] may be incorporated into the genre of partitional methods with relative ease and has proved popular [7; 8; 9; 10; 11; 12]. Such an extension to the modelling process is certainly more representative of 'real' data. There are still, however, difficulties encountered. The 'optimal' number of clusters must be estimated, the...

254 | On the Estimation of a Probability Density Function and - Parzen - 1962 |
Citation Context: ...nctions (kernels), $f$, sited at each $x_i \in X$. $$\hat{p}(x) = \sum_{i=1}^{N} w^{(i)} f^{(i)}(x) \quad (19)$$ where the superscript $\cdot^{(i)}$ implies the basis function is sited at $x_i$. If we take the Parzen-windows approach [17], whereby the weighting of each basis function is independent of its position, we may write Equation (19) as $$\hat{p}(x) = w \sum_{i=1}^{N} f^{(i)}(x) \quad (20)$$ whence $w$ becomes a normalising factor ensuring that $\hat{p}(x)$ in...

156 | Fingerprints Theorems For Zero Crossings - Yuille, Poggio - 1985 |
Citation Context: ...ons may be equated to solutions of Equation (26) by letting $$h(x; s) = w \sum_{i=1}^{N} \phi_s^{(i)}(x) \quad (31)$$ Parameterising the evolution of a curve in scale-space by an arbitrary parameter, $t$ say, we may write [18] $$\frac{dh}{dt} = \frac{\partial h}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial h}{\partial s} \frac{\partial s}{\partial t} = \sum_{n=1}^{d} \frac{\partial h}{\partial x_n} \frac{\partial x_n}{\partial t} + \frac{\partial h}{\partial s} \frac{\partial s}{\partial t} \quad (32)$$ At a point of merging or splitting (as in Figure (1)) so $\frac{dh}{dt} = 0$ and setting $t = x_l$ (the $l$-th component of $x$) ...

145 | Textured image segmentation - Laws - 1980 |
Citation Context: ... image segments. The second test image is the (well-known) 'house' image, configured here as a 128 × 128 greyscale image and shown in Figure (9a). Two of the Laws microstructure texture measures [27; 29; 30] were obtained for each pixel within the image calculated via application of the following 3 × 3 masks. $L_1 = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$, $L_3 = \begin{bmatrix} -1 & 2 & -1 \\ -2 & 4 & -2 \\ -1 & \cdots \end{bmatrix}$ ...

110 | A review of recent texture segmentation and feature extraction techniques - Reed, Buf, et al. - 1993 |
Citation Context: ...nts within the image is unknown (in the majority of cases) a priori. To this end, much effort has been directed at the application of unsupervised clustering methods to features extracted from images [22; 23; 24; 25; 26]. It is unreasonable to expect, however, for feature-space data sets, constructed from the image, to have a simple hyper-ellipsoidal cluster structure, and hence the use of algorithms such as K-means...

91 | A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain - Hall, Bensaid, et al. - 1992 |
Citation Context: ...nts within the image is unknown (in the majority of cases) a priori. To this end, much effort has been directed at the application of unsupervised clustering methods to features extracted from images [22; 23; 24; 25; 26]. It is unreasonable to expect, however, for feature-space data sets, constructed from the image, to have a simple hyper-ellipsoidal cluster structure, and hence the use of algorithms such as K-means...

63 | Image Segmentation by Clustering - Coleman - 1979 |

55 | A probabilistic resource allocating network for novelty detection - Roberts, Tarassenko - 1994 |
Citation Context: ...traint that $$\sum_{k=1}^{K} p(k) = 1 \quad (7)$$ If we let the free parameters of each component, $k$, of the GMM be specified by a parameter vector $\theta_k$ then combining Equations (5,6) we obtain, for the $k$-th component [14; 15] $$\sum_{i=1}^{N} p(k \mid x_i) \frac{\partial}{\partial \theta_k} \log p(x_i \mid k; \theta_k) = 0 \quad (8)$$ For a GMM, each kernel is a multivariate Gaussian whose free parameters are completely specified by its mean, $\mu_k$, and covariance matrix, $\Sigma$...

54 | A Neural Network Approach to Statistical Pattern Classification by "Semiparametric" Estimation of Probability Density Functions - Traven - 1991 |
Citation Context: ...traint that $$\sum_{k=1}^{K} p(k) = 1 \quad (7)$$ If we let the free parameters of each component, $k$, of the GMM be specified by a parameter vector $\theta_k$ then combining Equations (5,6) we obtain, for the $k$-th component [14; 15] $$\sum_{i=1}^{N} p(k \mid x_i) \frac{\partial}{\partial \theta_k} \log p(x_i \mid k; \theta_k) = 0 \quad (8)$$ For a GMM, each kernel is a multivariate Gaussian whose free parameters are completely specified by its mean, $\mu_k$, and covariance matrix, $\Sigma$...

38 | Computer and Robot Vision Volume 1 - Haralick, Shapiro - 1992 |
Citation Context: ...e top left hand corner of the image in Figure (7a). Two texture measures are evaluated from a 7 × 7 sliding mask applied to the image. Firstly correlation estimated from the co-occurrence matrix [27] and secondly the first grey-scale moment of the grey-scale run length matrix (GSRLM) as proposed in [28]. Full details of both these texture measures may be found in ...

37 | Efficient Implementation of the Fuzzy c-mean Clustering Algorithms - Cannon, Daveand, et al. - 1986 |
Citation Context: ...of 'fuzzy' cluster methods (whereby a datum's membership may be distributed over many clusters) [6] may be incorporated into the genre of partitional methods with relative ease and has proved popular [7; 8; 9; 10; 11; 12]. Such an extension to the modelling process is certainly more representative of 'real' data. There are still, however, difficulties encountered. The 'optimal' number of clusters must be estimated, the...

35 | Validity studies in clustering methodologies - Dubes, Jain - 1976 |
Citation Context: ...alidity of data models of differing complexity (number of partitions) have been proposed. Most, however, rely (implicitly or explicitly) upon estimates of within- and between-cluster scatter matrices [1; 5; 2]. The major problem with this approach, of course, is that if data do not conform to the assumptions made by the technique then the latter may impose structure on the data and not disclose the 'true'...

34 | A quad-tree approach to image segmentation which combines statistical and spatial information, Pattern Recognition - Spann, Wilson - 1985 |

19 | Cluster Validity for the Fuzzy c-Means Clustering Algorithm - Windham - 1982 |

19 | A new approach to clustering - Wilson, Spann - 1990 |
Citation Context: ...ed to the well-known K-means algorithm as well. The second method may be seen as falling within the hierarchical clustering genre or as a method of scale-space (multiresolution) parameter estimation [13]. Results from both methods are compared on test data and the scale-space method on examples from image and signal processing. 2 Maximum Likelihood & K-means Algorithms. 2.1 Theory. We consider the cas...

17 | Self-organized formation of topographically correct feature maps - Kohonen - 1982 |
Citation Context: ...(a), decay of ��(s) with s -- (b) masking of the 'house' image using a posteriori probabilities for four partitions -- (c) to (f) respectively. manifold using, for example, Kohonen's topographic map [32] or Sammon's non-linear mapping [19]. It is noted that more efficient algorithms than a simple grid search exist for evaluating the zeroes of multi-dimensional functions [33] and their use is an area...

5 | Comments on the Performance of Maximum Entropy Algorithms - Andersen - 1978 |
Citation Context: ...bility and muscle tremor [20]. In a separate study, the time-domain VMG signal was parameterised over 0.1-second segments using an 8th-order AR model (parameters estimated using the Burg algorithm [21]). Figure (6a) shows a plot of the first two partial correlation (reflection) coefficients for data accumulated from one subject over eight muscle-force levels. We see that there is a clustering of t...

2 | Classification, Pattern Recognition and Reduction of Dimensionality -- editors Krishnaiah and Kanal, volume 2, chapter 2 - Jain - 1982 |
Citation Context: ...ute this -- indeed the fact that data structure is multifarious will inevitably prompt the development of a large number of clustering approaches. Commenting on a survey of clustering techniques Jain [1] states that some 40 books alone had been published on the subject.¹ It is important that this breadth of approach is acknowledged, as the problem is complex and many methods must often be tried from...

2 | Unsupervised Learning Algorithm for Fuzzy Clustering - Urahama - 1993 |

2 | New results in fuzzy clustering based on the concept of indistinguishability relation - Mantaras, R, et al. - 1988 |

2 | The analysis of natural textures using run length features - Loh, Leu, et al. - 1988 |
Citation Context: ...sliding mask applied to the image. Firstly correlation estimated from the co-occurrence matrix [27] and secondly the first grey-scale moment of the grey-scale run length matrix (GSRLM) as proposed in [28]. Full details of both these texture measures may be found in ... Figure 5 (panels (a) to (d); axis labels K, rho(K) and s): Data set of three non-Gaussian clusters -- (...

1 | Analysis of the Vibromyogram in the Assessment of Brain Injured Patients - Outten - 1995 |
Citation Context: ...(VMG) is a non-invasive measurement of muscle sounds. It contains information regarding muscular activity as a function of force and is of importance in the assessment of disability and muscle tremor [20]. In a separate study, the time-domain VMG signal was parameterised over 0.1-second segments using an 8th-order AR model (parameters estimated using the Burg algorithm [21]). Figure (6a) shows a p...