Results 1 - 4 of 4
Growing a Hypercubical Output Space in a Self-Organizing Feature Map
- IEEE Transactions on Neural Networks, 1995
- Cited by 47 (10 self)
Neural maps project data given in a (possibly high-dimensional) input space onto a neuron position in a (usually low-dimensional) output space grid. An important property of this projection is the preservation of neighborhoods: neighboring neurons in output space respond to neighboring data points in input space. To achieve this preservation in an optimal way during learning, the topology of the output space has to roughly match the effective structure of the data in the input space. Here we present a growth algorithm, called the GSOM, which enhances a widespread map self-organization process, Kohonen's Self-Organizing Feature Map (SOFM), by adapting the output space grid during learning. During the procedure the output space structure is restricted to a general hypercubical shape, with the overall dimensionality of the grid and its extensions along the different directions being subject to the adaptation. This constraint distinguishes the present algorithm from other, less or ...
Visualizing High-Dimensional Structure with the Incremental Grid Growing Neural Network
- In Proc. Int'l Conference on Machine Learning, Lake Tahoe, NV, 1995
- Cited by 23 (1 self)
Understanding high-dimensional real-world data usually requires learning the structure of the data space. The structure may contain high-dimensional clusters that are related in complex ways. Methods such as merge clustering and self-organizing maps are designed to aid the visualization and interpretation of such data. However, these methods often fail to capture critical structural properties of the input. Although self-organizing maps capture high-dimensional topology, they do not represent cluster boundaries or discontinuities. Merge clustering extracts clusters, but it does not capture local or global topology. This paper proposes an algorithm that combines the topology-preserving characteristics of self-organizing maps with a flexible, adaptive structure that learns the cluster boundaries in the data.
1 INTRODUCTION
Real-world data is often very high-dimensional, and often has a structure that is difficult both to recognize and describe. For instance, human blood can be tested fo...
Prototype based Machine Learning for Clinical Proteomics
- 2006
- Cited by 3 (2 self)
Clinical proteomics opens the way towards new insights into many diseases at a level of detail not available before. One of the most promising measurement techniques supporting this approach is mass-spectrometry-based clinical proteomics. The analysis of the high-dimensional data obtained from mass spectrometry calls for sophisticated, problem-adequate preprocessing and data analysis approaches. Ideally, automatic analysis tools provide insight into their behavior and the ability to extract further information relevant for an understanding of the clinical data or for applications such as biomarker discovery. Prototype-based algorithms constitute efficient, intuitive and powerful machine learning methods which are very well suited to high-dimensional data and which allow good insight into their behavior by means of prototypical data locations. They have already been applied successfully to various problems in bioinformatics. The goal of this thesis is to extend prototype-based methods in such a way that they become suitable machine learning tools for typical problems in clinical proteomics. To achieve better adapted classification borders, tailored to the specific data distributions
Overtraining and Model Selection With the Self-Organizing Map
- 1999
We discuss the importance of finding the correct model complexity, or regularization level, in the self-organizing map (SOM) algorithm. The complexity of the SOM is determined mainly by the width of the final neighborhood, which is usually chosen ad hoc or set to zero for optimal quantization error. However, if the SOM is used for visualizing the joint probability distribution of the data, then care must be taken not to overfit the model to the data sample, as with any statistical model. We propose a heuristic criterion for model selection in the SOM, and demonstrate by simulations that the criterion can be used for selecting the neighborhood that suppresses overfitting.
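The abstract above does not state the proposed criterion, so the following is only a minimal sketch of the trade-off it describes, not the authors' method. It trains a one-dimensional SOM on scalar data and measures quantization error for a narrow and a wide final neighborhood. The function names (`train_som`, `quantization_error`), the exponential decay schedules, the near-zero width 0.1 standing in for "zero", and all parameter values are assumptions made for illustration.

```python
import math
import random


def train_som(data, n_units, sigma_final, epochs=30, sigma_start=5.0,
              lr_start=0.5, lr_final=0.01, seed=0):
    """Train a 1-D SOM on scalar data and return its prototypes.

    The neighborhood width decays exponentially from sigma_start to
    sigma_final (an assumed schedule, chosen for simplicity).
    """
    rng = random.Random(seed)
    protos = [rng.random() for _ in range(n_units)]
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = t / (steps - 1)
            sigma = sigma_start * (sigma_final / sigma_start) ** frac
            lr = lr_start * (lr_final / lr_start) ** frac
            # Best-matching unit: the prototype closest to the sample.
            winner = min(range(n_units), key=lambda i: abs(protos[i] - x))
            for i in range(n_units):
                # Gaussian neighborhood on the output grid around the winner.
                h = math.exp(-((i - winner) ** 2) / (2 * sigma ** 2))
                protos[i] += lr * h * (x - protos[i])
            t += 1
    return protos


def quantization_error(data, protos):
    """Mean distance from each sample to its nearest prototype."""
    return sum(min(abs(x - p) for p in protos) for x in data) / len(data)


if __name__ == "__main__":
    rng = random.Random(1)
    train = [rng.random() for _ in range(200)]
    held_out = [rng.random() for _ in range(200)]
    for sigma_final in (0.1, 1.0, 5.0):
        protos = train_som(train, n_units=10, sigma_final=sigma_final)
        print(sigma_final,
              quantization_error(train, protos),
              quantization_error(held_out, protos))
```

A narrow final neighborhood lets each prototype settle on its own share of the sample, which lowers the training quantization error. A wide one pulls all prototypes toward the data mean. Comparing errors on the training and held-out samples is one generic way to look for overfitting; it is not claimed to be the heuristic criterion of the paper.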

