Results 1–10 of 159
Clustering of the Self-Organizing Map
, 2000
Abstract

Cited by 269 (1 self)
The self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining. It projects the input space onto prototypes of a low-dimensional regular grid that can be effectively utilized to visualize and explore properties of the data. When the number of SOM units is large, similar units need to be grouped, i.e., clustered, to facilitate quantitative analysis of the map and the data. In this paper, different approaches to clustering of the SOM are considered. In particular, the use of hierarchical agglomerative clustering and partitive clustering using k-means is investigated. The two-stage procedure, in which the SOM first produces the prototypes that are then clustered in the second stage, is found to perform well when compared with direct clustering of the data and to reduce the computation time.
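The two-stage procedure is easy to sketch in plain Python. The toy below (all sizes, rates, and the 1-D data are invented for illustration; a real SOM also shrinks its neighbourhood over time) trains a tiny 1-D SOM and then runs k-means on its prototypes rather than on the raw data:

```python
import random

random.seed(0)

# Toy 1-D data: two well-separated groups (invented for illustration)
data = ([random.gauss(0.0, 0.3) for _ in range(100)]
        + [random.gauss(5.0, 0.3) for _ in range(100)])

# Stage 1: train a small 1-D SOM (10 units on a line)
units = [random.uniform(-1.0, 6.0) for _ in range(10)]
for t in range(2000):
    x = random.choice(data)
    lr = 0.5 * (1.0 - t / 2000.0)          # decaying learning rate
    bmu = min(range(10), key=lambda i: abs(units[i] - x))
    for i in range(10):                    # update BMU and its grid neighbours
        if abs(i - bmu) <= 1:
            units[i] += lr * (x - units[i])

# Stage 2: cluster the 10 prototypes (not the 200 raw points) with k-means, k=2
centers = [min(units), max(units)]
for _ in range(20):
    groups = [[], []]
    for u in units:
        groups[0 if abs(u - centers[0]) <= abs(u - centers[1]) else 1].append(u)
    centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
```

Because the second stage sees only 10 prototypes instead of 200 points, it is cheap regardless of the original data size, which is the computational point the abstract makes.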
Data Exploration Using Self-Organizing Maps
 ACTA POLYTECHNICA SCANDINAVICA: MATHEMATICS, COMPUTING AND MANAGEMENT IN ENGINEERING SERIES NO. 82
, 1997
Abstract

Cited by 115 (4 self)
Finding structures in vast multidimensional data sets, be they measurement data, statistics, or textual documents, is difficult and time-consuming. Interesting, novel relations between the data items may be hidden in the data. The self-organizing map (SOM) algorithm of Kohonen can be used to aid the exploration: the structures in the data sets can be illustrated on special map displays. In this work, the methodology of using SOMs for exploratory data analysis or data mining is reviewed and developed further. The properties of the maps are compared with the properties of related methods intended for visualizing high-dimensional multivariate data sets. In a set of case studies the SOM algorithm is applied to analyzing electroencephalograms, to illustrating structures of the standard of living in the world, and to organizing full-text document collections. Measures are proposed for evaluating the quality of different types of maps in representing a given data set, and for measuring the robu...
Tradeoff Between Source and Channel Coding
 IEEE TRANS. INFORM. THEORY
, 1997
Abstract

Cited by 80 (5 self)
A fundamental problem in the transmission of analog information across a noisy discrete channel is the choice of channel code rate that optimally allocates the available transmission rate between lossy source coding and block channel coding. We establish tight bounds on the channel code rate that minimizes the average distortion of a vector quantizer cascaded with a channel coder and a binary-symmetric channel. Analytic expressions are derived in two cases of interest: small bit-error probability and arbitrary source vector dimension; arbitrary bit-error probability and large source vector dimension. We demonstrate that the optimal channel code rate is often substantially smaller than the channel capacity, and obtain a noisy-channel version of the Zador high-resolution distortion formula.
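The qualitative tradeoff, where too little redundancy lets channel errors dominate and too much starves the source coder, can be reproduced with a deliberately crude model. Both terms below are invented stand-ins, not the paper's bounds:

```python
import math

# Illustrative toy model only, NOT the paper's analysis: fix a total
# transmission rate R_total and sweep the channel code rate r in (0, 1].
R_total = 8.0                                # bits per source sample (invented)

def bit_error(r):
    # Hypothetical bit-error probability: redundancy (1 - r) suppresses
    # errors exponentially; the constant 8.0 is arbitrary.
    return 0.5 * math.exp(-8.0 * (1.0 - r))

def distortion(r):
    source = 2.0 ** (-2.0 * r * R_total)     # high-resolution quantizer term
    channel = 4.0 * bit_error(r)             # channel-noise term (toy weight)
    return source + channel

rates = [i / 100.0 for i in range(1, 101)]
best = min(rates, key=distortion)            # distortion-minimizing code rate
```

Even in this crude model, the minimizing code rate comes out well below the capacity-achieving choice r = 1, echoing the paper's conclusion.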
Vector Quantization with Complexity Costs
, 1993
Abstract

Cited by 63 (20 self)
Vector quantization is a data compression method where a set of data points is encoded by a reduced set of reference vectors, the codebook. We discuss a vector quantization strategy which jointly optimizes distortion errors and the codebook complexity, thereby determining the size of the codebook. A maximum entropy estimation of the cost function yields an optimal number of reference vectors, their positions, and their assignment probabilities. The dependence of the codebook density on the data density for different complexity functions is investigated in the limit of asymptotic quantization levels. How different complexity measures influence the efficiency of vector quantizers is studied for the task of image compression, i.e., we quantize the wavelet coefficients of gray-level images and measure the reconstruction error. Our approach establishes a unifying framework for different quantization methods like K-means clustering and its fuzzy version, entropy-constrained vector quantizati...
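A minimal sketch of the complexity-cost idea, assuming a Lloyd-style iteration whose assignment cost is distortion plus a multiplier times the codeword's code length (-log2 p): with a large enough multiplier, rarely used codewords price themselves out, so the penalty effectively chooses the codebook size. All data and constants here are invented:

```python
import math
import random

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(500)]

def ecvq(data, codebook, lam, iters=30):
    """Lloyd-style VQ whose assignment cost adds lam * (code length)."""
    k = len(codebook)
    p = [1.0 / k] * k                       # codeword usage probabilities
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for x in data:
            i = min(range(k), key=lambda j: (x - codebook[j]) ** 2
                    - lam * math.log2(max(p[j], 1e-12)))
            cells[i].append(x)
        codebook = [sum(c) / len(c) if c else codebook[j]
                    for j, c in enumerate(cells)]
        p = [len(c) / len(data) for c in cells]
    return sum(1 for q in p if q > 0)       # codewords actually used

used_plain = ecvq(data, [-1.5, -0.5, 0.5, 1.5], lam=0.0)   # plain k-means
used_pen = ecvq(data, [-1.5, -0.5, 0.5, 1.5], lam=5.0)     # complexity cost on
```

With lam = 0 this is ordinary k-means and all four codewords survive; with a stiff complexity cost the codebook effectively shrinks, which is the "complexity determines codebook size" behaviour the abstract describes.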
A Wavelet-Based Analysis of Fractal Image Compression
 IEEE Trans. Image Processing
, 1997
Abstract

Cited by 60 (2 self)
Why does fractal image compression work? What is the implicit image model underlying fractal block coding? How can we characterize the types of images for which fractal block coders will work well? These are the central issues we address. We introduce a new wavelet-based framework for analyzing block-based fractal compression schemes. Within this framework we are able to draw upon insights from the well-established transform coder paradigm in order to address the issue of why fractal block coders work. We show that fractal block coders of the form introduced by Jacquin [1] are a Haar wavelet subtree quantization scheme. We examine a generalization of this scheme to smooth wavelets with additional vanishing moments. The performance of our generalized coder is comparable to the best results in the literature for a Jacquin-style coding scheme. Our wavelet framework gives new insight into the convergence properties of fractal block coders, and leads us to develop an unconditionally convergen...
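A toy 1-D coder in the style of Jacquin makes the convergence point concrete: because every stored scale factor is clamped below one in magnitude, iterating the decoder from two different starting signals contracts to the same fixed point. This is only an illustrative sketch; the block sizes, the clamp value, and the signal are all invented:

```python
# Toy 1-D Jacquin-style fractal block coder (illustrative sketch only).
signal = [1.0, 1.0, 2.0, 3.0, 5.0, 5.0, 4.0, 2.0,
          1.0, 0.0, 0.0, 1.0, 3.0, 6.0, 6.0, 5.0]
R, D = 4, 8                                  # range / domain block lengths

def downsample(block):                       # average pairs: length 8 -> 4
    return [(block[2 * i] + block[2 * i + 1]) / 2
            for i in range(len(block) // 2)]

def fit(dom, rng):
    """Least-squares scale s and offset o mapping dom onto rng, s clamped."""
    n = len(rng)
    md, mr = sum(dom) / n, sum(rng) / n
    var = sum((d - md) ** 2 for d in dom)
    s = (sum((d - md) * (r - mr) for d, r in zip(dom, rng)) / var) if var else 0.0
    s = max(-0.9, min(0.9, s))               # |s| < 1 keeps the map contractive
    return s, mr - s * md

# Encode: store, per range block, the best (domain start, scale, offset).
code = []
for rs in range(0, len(signal), R):
    rng = signal[rs:rs + R]
    best = None
    for ds in range(len(signal) - D + 1):
        dom = downsample(signal[ds:ds + D])
        s, o = fit(dom, rng)
        err = sum((s * d + o - r) ** 2 for d, r in zip(dom, rng))
        if best is None or err < best[0]:
            best = (err, ds, s, o)
    code.append(best[1:])

# Decode: iterate the stored map from ANY start; contractivity (|s| <= 0.9)
# forces convergence to one fixed point, the decoded signal.
def decode(code, start, iters=60):
    rec = list(start)
    for _ in range(iters):
        rec = [s * d + o
               for ds, s, o in code
               for d in downsample(rec[ds:ds + D])]
    return rec

rec_a = decode(code, [0.0] * 16)
rec_b = decode(code, [10.0] * 16)
```

The two decodes agree to within the contraction bound 0.9^60 times the initial gap, which is the unconditional-convergence behaviour the analysis formalizes.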
Fast Approximate Spectral Clustering
, 2009
Abstract

Cited by 58 (1 self)
Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-scale problems due to its computational complexity of O(n^3), with n the number of data points. We extend the range of spectral clustering by developing a general framework for fast approximate spectral clustering in which a distortion-minimizing local transformation is first applied to the data. This framework is based on a theoretical analysis that provides a statistical characterization of the effect of local distortion on the misclustering rate. We develop two concrete instances of our general framework, one based on local k-means clustering (KASP) and one based on random projection trees (RASP). Extensive experiments show that these algorithms can achieve significant speedups with little degradation in clustering accuracy. Specifically, our algorithms outperform k-means by a large margin in terms of accuracy, and run several times faster than approximate spectral clustering based on the Nyström method, with comparable accuracy and a significantly smaller memory footprint. Remarkably, our algorithms make it possible for a single machine to spectral-cluster data sets with a million observations within several minutes.
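The KASP pipeline (reduce with local k-means, spectral-cluster only the representatives, then let each point inherit its representative's label) can be sketched in pure Python. The spectral step below uses a shifted power iteration for the Fiedler vector in place of a library eigensolver, and every size and constant is invented for the toy:

```python
import math
import random

random.seed(2)

# Toy data: two separated 1-D groups (invented for illustration)
pts = ([random.gauss(0.0, 0.4) for _ in range(100)]
       + [random.gauss(5.0, 0.4) for _ in range(100)])

# Step 1 ("local k-means" reduction): 200 points -> 4 representatives
lo, hi = min(pts), max(pts)
reps = [lo + i * (hi - lo) / 3 for i in range(4)]
for _ in range(25):
    cells = [[] for _ in reps]
    for x in pts:
        cells[min(range(4), key=lambda i: (x - reps[i]) ** 2)].append(x)
    reps = [sum(c) / len(c) if c else reps[i] for i, c in enumerate(cells)]

# Step 2: spectral clustering, but only on the 4 representatives
W = [[math.exp(-(a - b) ** 2 / 2.0) for b in reps] for a in reps]  # affinity
deg = [sum(row) for row in W]
c = 2.0 * max(deg) + 1.0          # shift so c*I - L has a positive spectrum
v = [random.random() for _ in reps]
for _ in range(500):              # power iteration for the Fiedler vector
    mu = sum(v) / len(v)
    v = [x - mu for x in v]       # deflate the constant eigenvector of L
    v = [c * v[i] - deg[i] * v[i] + sum(W[i][j] * v[j] for j in range(4))
         for i in range(4)]
    nrm = math.sqrt(sum(x * x for x in v)) or 1.0
    v = [x / nrm for x in v]
rep_label = [1 if x > 0 else 0 for x in v]

# Step 3: each original point inherits its representative's cluster label
labels = [rep_label[min(range(4), key=lambda i: (x - reps[i]) ** 2)]
          for x in pts]
```

The expensive eigen-computation touches only the 4 representatives, never the 200 points; that is the source of the speedup the paper quantifies.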
A quantization algorithm for solving multidimensional Optimal Stopping problems
 Bernoulli
, 2001
Abstract

Cited by 56 (4 self)
A new grid method for computing the Snell envelope of a function of an R^d-valued Markov chain (X_k)_{0≤k≤n} is proposed. (This problem is typically nonlinear and cannot be solved by the standard Monte Carlo method.) Every X_k is replaced by a "quantized approximation" X̂_k taking its values in a grid Γ_k of size N_k. The n grids and their transition probability matrices make up a discrete tree on which a pseudo-Snell envelope is devised by mimicking the regular dynamic programming formula. Using the quantization theory of probability distributions, we show the existence of a set of optimal grids, given the total number N of elementary R^d-valued quantizers. A recursive stochastic algorithm, based on simulations of (X_k)_{0≤k≤n}, yields the optimal grids and their transition probability matrices. Some a priori error estimates based on the quantization errors are established. These results are applied to the computation of the Snell envelope of a diffusion (assuming that it can be directly simulated or using its Euler scheme). We show how this approach yields a discretization method for Reflected Backward Stochastic Differential Equations. Finally, some first numerical tests are carried out on a 2-dimensional American option pricing problem.
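The backward recursion on the grids can be sketched as follows. This toy uses a single fixed uniform grid at every step and a put-style payoff, rather than the optimal grids and diffusion of the paper; every constant is invented:

```python
import random

random.seed(3)

n_steps, n_paths = 5, 5000
K = 1.0                                     # strike of the toy put payoff

def payoff(x):
    return max(K - x, 0.0)

# Simulate a simple multiplicative random walk started at x0 = 1
sims = []
for _ in range(n_paths):
    x, path = 1.0, [1.0]
    for _ in range(n_steps):
        x *= 1.0 + random.gauss(0.0, 0.1)
        path.append(x)
    sims.append(path)

# One uniform grid of 11 points (a crude stand-in for the optimal grids)
grid = [0.5 + 0.1 * i for i in range(11)]

def q(x):                                   # index of the nearest grid point
    return min(range(11), key=lambda i: abs(grid[i] - x))

# Estimate transition matrices between consecutive grids by counting
P = [[[0.0] * 11 for _ in range(11)] for _ in range(n_steps)]
for path in sims:
    for k in range(n_steps):
        P[k][q(path[k])][q(path[k + 1])] += 1.0
for k in range(n_steps):
    for i in range(11):
        s = sum(P[k][i])
        if s:
            P[k][i] = [c / s for c in P[k][i]]

# Backward dynamic programming: the pseudo-Snell envelope on the grids,
# V_k = max(payoff, E[V_{k+1} | current grid point])
V = [payoff(x) for x in grid]
for k in range(n_steps - 1, -1, -1):
    V = [max(payoff(grid[i]), sum(P[k][i][j] * V[j] for j in range(11)))
         for i in range(11)]
v0 = V[q(1.0)]                              # value at the starting point
```

Because the maximum with the immediate payoff is taken at every step, the computed envelope dominates the payoff everywhere, and the at-the-money start picks up a strictly positive continuation value.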
Theoretic aspects of the SOM algorithm
 in: Proceedings of Workshop on Self-Organising Maps (WSOM'97)
, 1997
Wavelet-based image coding: An overview
 Applied and Computational Control, Signals, and Circuits
, 1998
Abstract

Cited by 49 (3 self)
This paper presents an overview of wavelet-based image coding. We develop the basics of image coding with a discussion of vector quantization. We motivate the use of transform coding in practical settings, and describe the properties of various decorrelating transforms. We motivate the use of the wavelet transform in coding using rate-distortion considerations as well as approximation-theoretic considerations. Finally, we give an overview of current coders in the literature.
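The basics the overview builds on (a wavelet transform, scalar quantization of the coefficients, reconstruction) fit in a short sketch. This one uses the 1-D Haar transform and a uniform quantizer, with an invented signal and step size:

```python
def haar(sig):
    """Full 1-D Haar analysis; len(sig) must be a power of two."""
    coeffs = []
    while len(sig) > 1:
        avg = [(sig[2 * i] + sig[2 * i + 1]) / 2 for i in range(len(sig) // 2)]
        dif = [(sig[2 * i] - sig[2 * i + 1]) / 2 for i in range(len(sig) // 2)]
        sig, coeffs = avg, dif + coeffs
    return sig + coeffs            # [overall mean, details coarse -> fine]

def ihaar(c):
    """Inverse Haar synthesis: even = avg + dif, odd = avg - dif."""
    sig, k = c[:1], 1
    while k < len(c):
        d = c[k:2 * k]
        sig = [val for a, dd in zip(sig, d) for val in (a + dd, a - dd)]
        k *= 2
    return sig

def quantize(c, step=0.25):        # uniform scalar quantizer (toy bit budget)
    return [round(x / step) * step for x in c]

sig = [0.3, 0.8, 1.9, 2.2, 2.2, 1.7, 0.9, 0.2]
exact = ihaar(haar(sig))                    # transform alone is lossless
coded = ihaar(quantize(haar(sig)))          # quantization is the lossy step
```

The transform itself is invertible; all the loss comes from the quantizer, and for a length-8 signal each sample combines one mean and three detail coefficients, so the reconstruction error is bounded by four half-steps.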