Results 1–4 of 4
Self Organization of a Massive Document Collection
 IEEE Transactions on Neural Networks
Abstract

Cited by 207 (14 self)
This article describes the implementation of a system that is able to organize vast document collections according to textual similarities. It is based on the Self-Organizing Map (SOM) algorithm. As the feature vectors for the documents we use statistical representations of their vocabularies. The main goal in our work has been to scale up the SOM algorithm to be able to deal with large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As the feature vectors we used 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.

Keywords: Data mining, exploratory data analysis, knowledge discovery, large databases, parallel implementation, random projection, Self-Organizing Map (SOM), textual documents.

I. Introduction

A. From simple searches to browsing of self-organized data collections

Locating documents on the basis of keywords and simple search expressions is a c...
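The random-projection step the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the vocabulary size and the example histogram below are hypothetical, and only the target dimensionality (500) comes from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 10_000   # hypothetical vocabulary size
target_dim = 500      # reduced dimensionality, as in the abstract

# Fixed random projection matrix with (approximately) unit-norm columns.
R = rng.standard_normal((target_dim, vocab_size))
R /= np.linalg.norm(R, axis=0, keepdims=True)

def project(histogram):
    """Map a weighted word histogram to a low-dimensional feature vector."""
    return R @ histogram

# One document: a sparse weighted word histogram (hypothetical weights).
h = np.zeros(vocab_size)
h[[12, 345, 6789]] = [3.0, 1.0, 2.0]
x = project(h)
```

Because random directions in high dimensions are nearly orthogonal, such projections approximately preserve pairwise similarities between the histograms while cutting the vector length from the vocabulary size down to 500.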
Energy Functions for Self-Organizing Maps
, 1999
Abstract

Cited by 41 (1 self)
This paper is about the last issue. After people started to realize that there is no energy function for the Kohonen learning rule (in the continuous case), many attempts have been made to change the algorithm such that an energy can be defined, without drastically changing its properties. Here we will review a simple suggestion, which has been proposed [2] and generalized in several different contexts. The advantage over some other attempts is its simplicity: we only need to redefine the determination of the winning ("best matching") unit. The energy function and corresponding learning algorithm are introduced in Section 2. We give two proofs that there is indeed a proper energy function. The first one, in Section 3, is based on explicit computation of derivatives. The second one, in Section 4, follows from a limiting case of a more general (free) energy function derived in a probabilistic setting. The energy formalism allows for a direct interpretation of disordered configurations in terms of local minima, two examples of which are treated in Section 5.
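The redefined winner determination the abstract refers to can be sketched as follows. This is a hedged illustration of the general idea (the 1-D grid and Gaussian neighborhood matrix below are assumptions, not the paper's setup): instead of the unit closest to the input, the winner is the unit s that minimizes the neighborhood-weighted sum of squared distances, sum_r H[s, r] * ||x - w_r||^2.

```python
import numpy as np

def winner_kohonen(x, W):
    """Standard rule: the unit whose weight vector is closest to the input."""
    return int(np.argmin(np.sum((W - x) ** 2, axis=1)))

def winner_energy(x, W, H):
    """Modified rule: minimize sum_r H[s, r] * ||x - w_r||^2 over units s.
    This small change is what makes a proper energy function possible."""
    d2 = np.sum((W - x) ** 2, axis=1)   # squared distances to every unit
    return int(np.argmin(H @ d2))       # neighborhood-weighted winner

# Hypothetical 1-D map with a Gaussian neighborhood on the grid.
n_units, dim, sigma = 10, 2, 1.0
pos = np.arange(n_units)
H = np.exp(-(pos[:, None] - pos[None, :]) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(1)
W = rng.standard_normal((n_units, dim))
x = rng.standard_normal(dim)
s_std = winner_kohonen(x, W)
s_mod = winner_energy(x, W, H)
```

The two rules often agree, but the modified one averages the distance over a unit's grid neighborhood before picking the minimum, which is exactly the redefinition the paper builds its proofs on.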
Self-Organizing Maps, Vector Quantization, and Mixture Modeling
 IEEE Transactions on Neural Networks
, 2001
Abstract

Cited by 25 (0 self)
Self-organizing maps are popular algorithms for unsupervised learning and data visualization. Exploiting the link between vector quantization and mixture modeling, we derive EM algorithms for self-organizing maps with and without missing values. We compare self-organizing maps with the elastic-net approach and explain why the former is better suited for the visualization of high-dimensional data. Several extensions and improvements are discussed. As an illustration we apply a self-organizing map based on a multinomial distribution to market basket analysis.

I. Introduction

Self-organizing maps are popular tools for clustering and visualization of high-dimensional data [1], [2]. The well-known Kohonen learning algorithm can be interpreted as a variant of vector quantization with additional lateral interactions [3], [4]. The addition of lateral interaction between units introduces a sense of topology, such that neighboring units represent inputs that are close together in input space [...
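The interpretation of Kohonen learning as vector quantization with lateral interactions can be sketched as a single online update step. This is a generic illustration of that view, not the EM algorithms the paper derives; the 3x3 grid, learning rate, and neighborhood width below are assumptions.

```python
import numpy as np

def som_step(x, W, coords, eta=0.1, sigma=1.0):
    """One online Kohonen update: find the best-matching unit (the vector-
    quantization step), then pull every unit toward the input, weighted by
    a Gaussian of its grid distance to the winner (the lateral interaction)."""
    bmu = np.argmin(np.sum((W - x) ** 2, axis=1))
    grid_d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
    h = np.exp(-grid_d2 / (2 * sigma ** 2))   # neighborhood weights
    return W + eta * h[:, None] * (x - W)

# Hypothetical 3x3 map embedded in a 2-D input space.
coords = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
rng = np.random.default_rng(2)
W = rng.standard_normal((9, 2))
x = np.array([0.5, -0.5])
W_new = som_step(x, W, coords)
```

With sigma -> 0 the neighborhood weight concentrates on the winner and the update reduces to plain online vector quantization; the Gaussian spread over the grid is what gives neighboring units similar weight vectors, i.e. the topology the introduction describes.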