Results 1 - 3 of 3
Self Organization of a Massive Document Collection
- IEEE Transactions on Neural Networks
Cited by 183 (14 self)
This article describes the implementation of a system that is able to organize vast document collections according to textual similarities. It is based on the Self-Organizing Map (SOM) algorithm. As the feature vectors for the documents we use statistical representations of their vocabularies. The main goal in our work has been to scale up the SOM algorithm to be able to deal with large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As the feature vectors we used 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
Keywords: Data mining, exploratory data analysis, knowledge discovery, large databases, parallel implementation, random projection, Self-Organizing Map (SOM), textual documents.
I. Introduction
A. From simple searches to browsing of self-organized data collections
Locating documents on the basis of keywords and simple search expressions is a c...
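The random-projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dense Gaussian projection matrix, the toy corpus, the per-word weights, and the function name `random_projection` are all assumptions made for illustration. Only the output dimension of 500 comes from the abstract.

```python
import numpy as np

def random_projection(histograms, out_dim=500, seed=0):
    """Project (n_docs, vocab_size) weighted word histograms to out_dim dimensions.

    Assumption: a dense Gaussian random matrix with unit-length columns.
    The abstract does not say how the projection matrix was built.
    """
    rng = np.random.default_rng(seed)
    vocab_size = histograms.shape[1]
    R = rng.standard_normal((vocab_size, out_dim))
    R /= np.linalg.norm(R, axis=0, keepdims=True)  # normalize each column
    return histograms @ R

# Toy data (hypothetical): 20 documents over a 5000-word vocabulary.
rng = np.random.default_rng(1)
counts = rng.poisson(0.05, size=(20, 5000)).astype(float)
weights = rng.uniform(0.5, 2.0, size=5000)  # hypothetical per-word weights
features = random_projection(counts * weights, out_dim=500)
```

Each row of `features` is a 500-dimensional document vector that could then be fed to a SOM in place of the full histogram.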
Energy Functions for Self-Organizing Maps
- 1999
Cited by 34 (1 self)
This paper is about the last issue. After people started to realize that there is no energy function for the Kohonen learning rule (in the continuous case), many attempts have been made to change the algorithm such that an energy can be defined, without drastically changing its properties. Here we will review a simple suggestion, which has been proposed 2 and generalized in several different contexts. The advantage over some other attempts is its simplicity: we only need to redefine the determination of the winning ("best matching") unit. The energy function and corresponding learning algorithm are introduced in Section 2. We give two proofs that there is indeed a proper energy function. The first one, in Section 3, is based on explicit computation of derivatives. The second one, in Section 4, follows from a limiting case of a more general (free) energy function derived in a probabilistic setting. The energy formalism allows for a direct interpretation of disordered configurations in terms of local minima, two examples of which are treated in Section 5.
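The idea of redefining the winning unit can be sketched as follows. The abstract does not give the formula, so this sketch assumes the commonly cited form in which the winner minimizes a neighborhood-weighted local error rather than the plain distance to a single weight vector. The 1-D Gaussian neighborhood and all function names are illustrative assumptions.

```python
import numpy as np

def neighborhood(n_units, sigma=1.0):
    """Gaussian neighborhood matrix H[i, j] on a 1-D chain of units (assumed layout)."""
    idx = np.arange(n_units)
    d2 = (idx[:, None] - idx[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kohonen_winner(x, W):
    """Standard rule: the unit whose weight vector is closest to x."""
    return int(np.argmin(np.sum((W - x) ** 2, axis=1)))

def energy_winner(x, W, H):
    """Assumed modified rule: the unit i minimizing sum_j H[i, j] * ||x - w_j||^2."""
    sq = np.sum((W - x) ** 2, axis=1)
    return int(np.argmin(H @ sq))

def energy(X, W, H):
    """Average over samples of half the winner's local error (assumed energy form)."""
    sq = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)  # (n_samples, n_units)
    local = sq @ H.T  # local[n, i] = sum_j H[i, j] * sq[n, j]
    return 0.5 * local.min(axis=1).mean()
```

With `H` equal to the identity matrix the two winner rules coincide, which is one quick way to check the sketch; with a wider neighborhood they can pick different units for the same input.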
Self-Organizing Maps, Vector Quantization, and Mixture Modeling
- IEEE Transactions on Neural Networks, 2001
Cited by 20 (0 self)
Self-organizing maps are popular algorithms for unsupervised learning and data visualization. Exploiting the link between vector quantization and mixture modeling, we derive EM algorithms for self-organizing maps with and without missing values. We compare self-organizing maps with the elastic-net approach and explain why the former is better suited for the visualization of high-dimensional data. Several extensions and improvements are discussed. As an illustration we apply a self-organizing map based on a multinomial distribution to market basket analysis.
I. Introduction
Self-organizing maps are popular tools for clustering and visualization of high-dimensional data [1], [2]. The well-known Kohonen learning algorithm can be interpreted as a variant of vector quantization with additional lateral interactions [3], [4]. The addition of lateral interaction between units introduces a sense of topology, such that neighboring units represent inputs that are close together in input space [...

