Results 1–10 of 48
Generalized Relevance Learning Vector Quantization
 Neural Networks
, 2002
Abstract

Cited by 49 (20 self)
We propose a new scheme for enlarging generalized learning vector quantization (GLVQ) with weighting factors for the input dimensions. The factors allow an appropriate scaling of the input dimensions according to their relevance. They are adapted automatically during training according to the specific classification task, whereby training can be interpreted as stochastic gradient descent on an appropriate error function. This method leads to a more powerful classifier and to an adaptive metric with little extra cost compared to standard GLVQ. Moreover, the size of the weighting factors indicates the relevance of the input dimensions. This suggests a scheme for automatically pruning irrelevant input dimensions. The algorithm is verified on artificial data sets and the Iris data from the UCI repository.
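The adaptive metric at the heart of this scheme is a relevance-weighted squared Euclidean distance. As a minimal illustrative sketch (not the authors' code; the variable names and toy values are assumptions), the weighted distance and the normalization of the relevance factors might look like:

```python
import numpy as np

def weighted_sq_distance(x, w, lam):
    """Squared Euclidean distance with per-dimension relevance factors lam."""
    d = x - w
    return float(np.sum(lam * d * d))

# Toy example: the second dimension is weighted down as nearly irrelevant.
x = np.array([1.0, 5.0])
proto = np.array([0.0, 0.0])
lam = np.array([0.9, 0.1])
lam = lam / lam.sum()          # keep relevance factors normalized
dist = weighted_sq_distance(x, proto, lam)
```

With these toy values the large deviation in the down-weighted second dimension contributes little to the distance, which is the pruning signal the abstract alludes to.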
Supervised Neural Gas with General Similarity Measure
 Neural Processing Letters
, 2003
Abstract

Cited by 29 (20 self)
Prototype-based classification offers intuitive and sparse models with excellent generalization ability. However, these models usually crucially depend on the underlying Euclidean metric; moreover, online variants are likely to suffer from the problem of local optima. We here propose a generalization of learning vector quantization with three additional features: (I) it directly integrates neighborhood cooperation, hence is less affected by local optima; (II) the method can be combined with any differentiable similarity measure, whereby metric parameters such as relevance factors of the input dimensions can automatically be adapted according to the given data; (III) it obeys a gradient dynamics, hence shows very robust behavior, and the chosen objective is related to margin optimization.
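The margin-related objective mentioned in (III) is, in GLVQ-type methods, built from the distances to the closest prototype of the correct class and the closest prototype of a wrong class. A minimal sketch of this cost term (names are assumptions; any differentiable similarity measure could supply the two distances):

```python
def glvq_cost_term(d_plus, d_minus):
    """Relative distance difference for one sample.

    d_plus: distance to the closest prototype of the correct class.
    d_minus: distance to the closest prototype of a wrong class.
    Negative values mean correct classification; the magnitude is
    related to the (hypothesis) margin of the classifier.
    """
    return (d_plus - d_minus) / (d_plus + d_minus)

# Toy example: correct prototype is closer, so the term is negative.
mu = glvq_cost_term(1.0, 3.0)
```

Summing such terms (possibly through a sigmoid) over the training set gives the differentiable error function that stochastic gradient descent minimizes.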
A general framework for unsupervised processing of structured data
 NEUROCOMPUTING
, 2004
Neural Gas for Sequences
 Proceedings of the Workshop on Self-Organizing Networks (WSOM), pages 53–58, Kyushu Institute of Technology
, 2003
Abstract

Cited by 16 (3 self)
For unsupervised sequence processing, standard self-organizing maps can be naturally extended by using recurrent connections and explicit context representations. Models thereof are the temporal Kohonen map (TKM), the recursive SOM, the SOM for structured data (SOMSD), and the HSOM for sequences (HSOMS). Here, we discuss and compare the capabilities of exemplary approaches to store different types of sequences. We propose a new efficient model, the merge SOM (MSOM), which combines ideas of the TKM and SOMSD and which is particularly suited for processing sequences with dynamic multimodal densities.
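A small sketch of the merge idea described above (an illustration based on the abstract, not the authors' implementation; the mixing parameters alpha and beta and all names are assumptions):

```python
import numpy as np

def msom_winner(x, ctx, W, C, alpha=0.5):
    """Winner combines match to the current input (weights W)
    and match to the current merged context (context vectors C)."""
    d = (1 - alpha) * np.sum((W - x) ** 2, axis=1) \
        + alpha * np.sum((C - ctx) ** 2, axis=1)
    return int(np.argmin(d))

def merge_context(w_win, c_win, beta=0.5):
    """Next context: merge of the winner's weight and its context vector."""
    return (1 - beta) * w_win + beta * c_win

# Toy step: two 1-D neurons; the second neuron matches the input best.
W = np.array([[0.0], [1.0]])
C = np.zeros((2, 1))
win = msom_winner(np.array([0.9]), np.array([0.0]), W, C)
next_ctx = merge_context(W[win], C[win])
```

The recursion is that `next_ctx` becomes `ctx` for the following sequence element, so the map's response depends on the history of inputs.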
On the Generalization Ability of GRLVQ networks
 NEURAL PROCESSING LETTERS
Abstract

Cited by 16 (13 self)
We derive a generalization bound for prototype-based classifiers with an adaptive metric. The bound depends on the margin of the classifier and is independent of the dimensionality of the data. It holds for classifiers based on the Euclidean metric extended by adaptive relevance terms. In particular, the result holds for relevance learning vector quantization [3] and generalized relevance learning vector quantization [11].
Can Relevance Be Inferred from Eye Movements in Information Retrieval?
, 2003
Abstract

Cited by 14 (6 self)
We investigate whether it is possible to infer from implicit feedback what is relevant for a user in an information retrieval task. Eye movement signals are measured; they are very noisy but potentially contain rich hints about the current state and focus of attention of the user. In the experimental setting, relevance is controlled by giving the user a specific search task, and the modeling goal is to predict from eye movements which of the given titles are relevant. We extract a set of standard features from the signal, and explore the data with statistical information visualization methods including standard self-organizing maps (SOMs) and SOMs that learn metrics. Relevance of document titles to the processing task can be predicted with reasonable accuracy from only a few features, whereas prediction of relevance of specific words will require new features and methods.
Discriminative Clustering
, 2004
Abstract

Cited by 13 (4 self)
A distributional clustering model for continuous data is reviewed, and new methods for optimizing and regularizing it are introduced and compared. Based on samples of discrete-valued auxiliary data associated to samples of the continuous primary data, the continuous data space is partitioned into Voronoi regions that are maximally homogeneous in terms of the discrete data. Then only variation in the primary data associated to variation in the discrete data affects the clustering; the discrete data "supervises" the clustering. Because the whole continuous space is partitioned, new samples can be easily clustered by the continuous part of the data alone. In experiments, the approach is shown to produce more homogeneous clusters than alternative methods. Two regularization methods are demonstrated to further improve the results: an entropy-type penalty for unequal cluster sizes, and the inclusion of a model for the marginal density of the primary data. The latter is also interpretable as a special kind of joint distribution modeling with tunable emphasis for discrimination and the marginal density.
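The partitioning step can be illustrated by assigning points to Voronoi regions and scoring how homogeneous the auxiliary labels are within each region. A minimal sketch (all names, and the entropy-based homogeneity score, are assumptions chosen for illustration, not the paper's exact objective):

```python
import numpy as np

def assign_voronoi(X, centers):
    """Index of the nearest center for each row of X."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

def within_cluster_entropy(labels, assign, k, n_classes):
    """Average entropy of the auxiliary-label distribution per Voronoi
    region; lower means more homogeneous clusters."""
    total = 0.0
    for j in range(k):
        y = labels[assign == j]
        if len(y) == 0:
            continue
        p = np.bincount(y, minlength=n_classes) / len(y)
        p = p[p > 0]
        total += -(p * np.log(p)).sum() * len(y)
    return total / len(labels)

# Toy data: two well-separated groups with matching auxiliary labels.
X = np.array([[0.0], [0.1], [1.0], [1.1]])
centers = np.array([[0.0], [1.0]])
labels = np.array([0, 0, 1, 1])
a = assign_voronoi(X, centers)
score = within_cluster_entropy(labels, a, 2, 2)
```

In the discriminative clustering setting, the centers would be optimized to minimize such a heterogeneity measure rather than the usual within-cluster variance of the primary data.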
Discriminative components of data
 IEEE Transactions on Neural Networks
, 2005
Cited by 12 (5 self)
Improved learning of Riemannian metrics for exploratory analysis
, 2004
Abstract

Cited by 11 (4 self)
We have earlier introduced a principle for learning metrics, which shows how metric-based methods can be made to focus on discriminative properties of data. The main applications are in supervising unsupervised learning to model interesting variation in data, instead of modeling all variation as plain unsupervised learning does. The metrics are derived by approximations to an information-geometric formulation. In this paper, we review the theory, introduce better approximations to the distances, and show how to apply them in two different kinds of unsupervised methods: prototype-based and pairwise distance-based. The two examples are self-organizing maps and multidimensional scaling (Sammon's mapping).
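Locally, such a learning metric replaces the squared Euclidean distance by a quadratic form weighted by the Fisher information matrix, so that only class-relevant directions contribute. A minimal sketch of the local approximation (J is assumed to be given; better approximations for non-local distances are what the paper itself develops):

```python
import numpy as np

def local_fisher_distance(x1, x2, J):
    """Local squared distance under a learning (Fisher-type) metric:
    d^2(x, x + dx) is approximately dx^T J(x) dx,
    with J the Fisher information matrix at x."""
    dx = x2 - x1
    return float(dx @ J @ dx)

# Toy example: the first direction is class-relevant (large weight),
# the second is nearly irrelevant (small weight).
J = np.diag([4.0, 0.25])
d2 = local_fisher_distance(np.array([0.0, 0.0]), np.array([1.0, 2.0]), J)
```

With these toy weights, a unit step in the relevant direction costs more distance than a larger step in the irrelevant one, which is exactly the "focus on discriminative variation" the abstract describes.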
Informative Discriminant Analysis
 In: Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003). AAAI Press, Menlo Park, CA
, 2003
Abstract

Cited by 11 (7 self)
We introduce a probabilistic model that generalizes classical linear discriminant analysis and gives an interpretation for the components as informative or relevant components of data. The components maximize the predictability of the class distribution, which is asymptotically equivalent to (i) maximizing mutual information with the classes, and (ii) finding principal components in the so-called learning or Fisher metrics. The Fisher metric measures only distances that are relevant to the classes, that is, distances that cause changes in the class distribution. The components have applications in data exploration, visualization, and dimensionality reduction.
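The mutual-information criterion in (i) can be illustrated with a discrete joint distribution of classes and components. A minimal sketch (the tabular setting is an assumption for illustration; the paper works with continuous components):

```python
import numpy as np

def mutual_information(p_joint):
    """Mutual information I(C; Z) in nats from a joint table p(c, z)."""
    p_joint = p_joint / p_joint.sum()
    pc = p_joint.sum(axis=1, keepdims=True)   # marginal over classes
    pz = p_joint.sum(axis=0, keepdims=True)   # marginal over components
    mask = p_joint > 0
    return float((p_joint[mask]
                  * np.log(p_joint[mask] / (pc @ pz)[mask])).sum())

# Independent table: the component tells us nothing about the class.
mi_indep = mutual_information(np.array([[0.25, 0.25], [0.25, 0.25]]))
# Perfectly dependent table: the component determines the class.
mi_dep = mutual_information(np.array([[0.5, 0.0], [0.0, 0.5]]))
```

A component maximizing this quantity is maximally predictive of the class, which is the asymptotic characterization the abstract states.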