Controlling the Magnification Factor of Self-Organizing Feature Maps
, 1995
Cited by 34 (7 self)
The magnification exponents ρ occurring in adaptive map formation algorithms like Kohonen's self-organizing feature map deviate from the information-theoretically optimal value ρ = 1 as well as from the values which optimize, e.g., the mean square distortion error (ρ = 1/3 for one-dimensional maps). At the same time, models for categorical perception such as the "perceptual magnet" effect which are based on topographic maps require negative magnification exponents ρ < 0. We present an extension of the self-organizing feature map algorithm which utilizes adaptive local learning step sizes to actually control the magnification properties of the map. By changing a single parameter, maps with optimal information transfer, with various minimal reconstruction errors, or with an inverted magnification can be generated. Analytic results on this new algorithm are complemented by numerical simulations. 1. Introduction The representation of information in topographic maps is a common property of...
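For orientation, the magnification exponent in the abstract above can be read through the conventional power-law relation between stimulus density and codebook density. This is a sketch under that assumption. The symbols $P$, $p$, $w$ and $\rho$ are notation chosen here, and the three listed cases simply restate the values quoted in the abstract.

```latex
% Assumed notation: p(w) is the stimulus density at input location w,
% P(w) is the density of map weight vectors there, rho is the magnification exponent.
P(w) \propto p(w)^{\rho}
% rho = 1    : optimal information transfer
% rho = 1/3  : minimal mean square distortion error for one-dimensional maps
% rho < 0    : inverted magnification, as required by "perceptual magnet" models
```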
Neural Maps and Topographic Vector Quantization
, 1999
Cited by 19 (4 self)
Neural maps combine the representation of data by codebook vectors, like a vector quantizer, with the property of topography, like a continuous function. While the quantization error is simple to compute and to compare between different maps, topography of a map is difficult to define and to quantify. Yet, topography of a neural map is an advantageous property, e.g. in the presence of noise in a transmission channel, in data visualization, and in numerous other applications. In this paper we review some conceptual aspects of definitions of topography, and some recently proposed measures to quantify topography. We apply the measures first to neural maps trained on synthetic data sets, and check the measures for properties like reproducibility, scalability, systematic dependence of the value of the measure on the topology of the map, etc. We then test the measures on maps generated for four real-world data sets: a chaotic time series, speech data, and two sets of image data. The measures ...
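The remark that the quantization error is simple to compute can be illustrated with a minimal sketch. It assumes the common definition (mean squared Euclidean distance from each sample to its nearest codebook vector). The function name and the toy data are hypothetical and are not taken from the paper.

```python
def quantization_error(data, codebook):
    """Mean squared Euclidean distance from each sample to its nearest codebook vector.

    Assumed definition for illustration; the paper may normalize differently.
    """
    total = 0.0
    for x in data:
        # Squared distance to the closest codebook vector for this sample.
        total += min(sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in codebook)
    return total / len(data)


# Hypothetical toy data: three 2-D samples and a two-vector codebook.
data = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
codebook = [(0.5, 0.0), (10.0, 10.0)]
print(quantization_error(data, codebook))
```

Nothing in this computation depends on how the codebook vectors are arranged on the map, which is why topography needs separate measures.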
The Enhanced LBG Algorithm
, 2001
Cited by 17 (1 self)
Clustering applications cover several fields such as audio and video data compression, pattern recognition, computer vision, medical image recognition, etc. In this paper we present a new clustering algorithm called Enhanced LBG (ELBG). It belongs to the hard and K-means vector quantization groups and derives directly from the simpler LBG. The basic idea we have developed is the concept of utility of a codeword, a powerful instrument to overcome one of the main drawbacks of clustering algorithms: generally, the results achieved are not good in the case of a bad choice of the initial codebook. We will present our experimental results showing that ELBG is able to find better codebooks than previous clustering techniques and that its computational complexity is virtually the same as that of the simpler LBG.
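As background for the abstract above, the following is a minimal sketch of the plain LBG (generalized Lloyd) iteration that ELBG is said to derive from. It is not ELBG: the codeword-utility mechanism is not reproduced. All function names, the synthetic data, and the deliberately poor initial codebook are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def distortion(data, codebook):
    """Mean squared distance from each sample to its nearest codeword."""
    return sum(min(sq_dist(x, w) for w in codebook) for x in data) / len(data)


def lloyd_step(data, codebook):
    """One iteration: assign samples to nearest codewords, then move each codeword to its cell centroid."""
    cells = [[] for _ in codebook]
    for x in data:
        nearest = min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))
        cells[nearest].append(x)
    updated = []
    for w, cell in zip(codebook, cells):
        if cell:
            updated.append(tuple(sum(x[d] for x in cell) / len(cell) for d in range(len(w))))
        else:
            updated.append(w)  # empty cell: the codeword is left where it is
    return updated


def lbg(data, codebook, iterations=20):
    """Run a fixed number of Lloyd steps and record the distortion after each one."""
    history = [distortion(data, codebook)]
    for _ in range(iterations):
        codebook = lloyd_step(data, codebook)
        history.append(distortion(data, codebook))
    return codebook, history


# Hypothetical data: two Gaussian clusters; both initial codewords come from the first one.
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
data += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(100)]
codebook, history = lbg(data, [data[0], data[1]])
print(history[0], history[-1])
```

Each step can only lower or keep the distortion, so the result is a local optimum that depends on the starting codebook. That dependence is the drawback the abstract says ELBG targets.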
Initialization of Adaptive Parameters in Density Networks
- 3RD CONF. ON NEURAL NETWORKS, KULE
, 1997
Cited by 13 (12 self)
Initialization of adaptive parameters in neural networks is of crucial importance to the speed of convergence of the learning procedure. Methods of initialization for density networks are reviewed and two new methods, based on decision trees and dendrograms, are presented. These two methods were applied in the Feature Space Mapping framework to artificial and real-world datasets. Results show the superiority of the dendrogram-based method including rotation.
Characterizing Computer Systems' Workloads
, 2002
Cited by 6 (5 self)
The performance of any system cannot be determined without knowing the workload, that is, the set of requests presented to the system. Workload characterization is the process by which we produce models that are capable of describing and reproducing the behavior of a workload. Such models are imperative to any performance-related studies such as capacity planning, workload balancing, performance prediction and system tuning. In this paper, we survey workload characterization techniques used for several types of computer systems. We identify significant issues and concerns encountered during the characterization process and propose an augmented methodology for workload characterization as a framework.
The Impact of Workload Clustering on Transaction Routing
Cited by 6 (0 self)
The qualitative and quantitative description of the workload of a system is very important for capacity planning and performance management. In large-scale transaction processing systems, dynamic workload control algorithms are applied to optimize system performance. Such algorithms can benefit from the results of workload clustering algorithms that partition the workload into classes consisting of units of work exhibiting similar characteristics. This paper presents CLUE, a clustering environment for OLTP workload characterization. CLUE provides a library of clustering algorithms that classify transactions into classes, according to their database reference patterns. This paper introduces HALC, a new batch-mode heuristic clustering algorithm, designed to cope with the large volume of input data that is typical for real-life applications. Next, an on-the-fly clustering algorithm based on neural networks is described. This algorithm can be used in an on-line fashion in systems whose characteristics change through time. This paper provides an evaluation of the performance of HALC and the on-the-fly algorithms in terms of execution times and statistical metrics related to the quality of the clusters that they compute, for both synthetic and real-life workload traces. Finally, this paper quantifies the impact of workload clustering on the performance of three dynamic transaction routing algorithms for Shared-Nothing transaction processing systems.
Fully Automatic Clustering System
Cited by 3 (0 self)
In this paper the Fully Automatic Clustering System (FACS) is presented. It is a technique for clustering and vector quantization whose objective is the automatic calculation of the codebook of the right dimension, the desired error (or target) being fixed. At each iteration, FACS tries to improve the setting of the existing codewords and, if necessary, some elements are removed from or added to the codebook. In order to save on the number of computations per iteration, greedy techniques are adopted. It has been demonstrated, from a heuristic point of view, that the number of the codewords determined by FACS is very low and that the algorithm quickly converges towards the final solution.
Optimal Magnification Factors in Self-Organizing Feature Maps
- In Proc. ICANN'95
, 1995
Cited by 2 (1 self)
Introduction. Kohonen's self-organizing feature maps (SOFMs) [8] usually exhibit a selective magnification of often stimulated regions of their input space. This amounts to a larger transmission of information about the stimulus ensemble than in maps with a constant resolution. Such a selective magnification is not only observed in biological maps, but is also often regarded as a desirable design objective in technical contexts. For at least three reasons, the magnification properties of SOFMs deserve further investigation: 1. An analysis by Ritter and Schulten [10] demonstrated that the SOFM algorithm does not yield a maximum entropy map (i.e. does not transmit the maximum amount of information). 2. As a related argument we observe that which magnification properties are to be regarded as optimal depends on the error criterion one applies. For example, a minimal worst case error is achieved by maps with all receptive fields (or Voronoi polygons) being of equal extension, i.
A Modified K-Means Clustering with a Density-Sensitive Distance Metric
Cited by 2 (2 self)
K-Means clustering is by far the most widely used method for discovering clusters in data. It performs well on data with compact super-sphere distributions, but tends to fail on data organized in more complex and unknown shapes. In this paper, we analyze in detail the characteristic property of data clustering and propose a novel dissimilarity measure, named density-sensitive distance metric, which can describe the distribution characteristic of data clustering. By using this dissimilarity measure, a density-sensitive K-Means clustering algorithm is given which, unlike the original K-Means algorithm, is able to identify complex non-convex clusters. The experimental results on both artificial data sets and real-world problems assess the validity of the algorithm. Keywords: K-Means clustering, distance metric, dissimilarity measure.
A Fast and Stable Incremental Clustering Algorithm
- in 2010 Seventh International Conference on Information Technology. IEEE
"... Abstract — Clustering is a pivotal building block in many data mining applications and in machine learning in general. Most clustering algorithms in the literature pertain to off-line (or batch) processing, in which the clustering process repeatedly sweeps through a set of data samples in an attempt ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Clustering is a pivotal building block in many data mining applications and in machine learning in general. Most clustering algorithms in the literature pertain to off-line (or batch) processing, in which the clustering process repeatedly sweeps through a set of data samples in an attempt to capture its underlying structure in a compact and efficient way. However, many recent applications require that the clustering algorithm be online, or incremental, in the sense that there is no a priori set of samples to process but rather samples are provided one iteration at a time. Accordingly, the clustering algorithm is expected to gradually improve its prototype (or centroid) constructs. Several problems emerge in this context, particularly relating to the stability of the process and its speed of convergence. In this paper, we present a fast and stable incremental clustering algorithm, which is computationally modest and imposes minimal memory requirements. Simulation results clearly demonstrate the advantages of the proposed framework in a variety of practical scenarios.
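The online setting described above, where samples arrive one at a time and prototypes improve gradually, can be sketched with a textbook sequential k-means update. This is not the algorithm proposed in the paper. The class name, the seeding rule (the first k samples become prototypes) and the 1/count step size are assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class OnlineKMeans:
    """Sequential k-means: one pass over the stream, no samples stored."""

    def __init__(self, k):
        self.k = k
        self.prototypes = []  # current centroid estimates
        self.counts = []      # number of samples absorbed by each prototype

    def update(self, x):
        """Absorb one sample and return the index of the prototype it was assigned to."""
        x = tuple(float(v) for v in x)
        if len(self.prototypes) < self.k:
            # Assumed seeding rule: the first k samples become the prototypes.
            self.prototypes.append(x)
            self.counts.append(1)
            return len(self.prototypes) - 1
        j = min(range(self.k), key=lambda i: sq_dist(x, self.prototypes[i]))
        self.counts[j] += 1
        rate = 1.0 / self.counts[j]  # running-mean step size
        self.prototypes[j] = tuple(w + rate * (xi - w) for w, xi in zip(self.prototypes[j], x))
        return j


# Hypothetical stream of 2-D samples.
model = OnlineKMeans(k=2)
for sample in [(0, 0), (10, 10), (0, 2), (10, 12)]:
    model.update(sample)
print(model.prototypes)
```

Memory use is limited to the k prototypes and their counts. With the 1/count step size, each prototype is the running mean of the samples assigned to it, so early assignments weigh heavily on the outcome. That sensitivity is one face of the stability and convergence-speed issues the abstract mentions.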

