Results 1 - 10 of 56
Discriminant Analysis by Gaussian Mixtures
- Journal of the Royal Statistical Society, Series B, 1996
Cited by 124 (9 self)
Fisher-Rao linear discriminant analysis (LDA) is a valuable tool for multigroup classification. LDA is equivalent to maximum likelihood classification assuming Gaussian distributions for each class. In this paper, we fit Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered. Low dimensional views are an important by-product of LDA, and our new techniques inherit this feature. We are able to control the within-class spread of the subclass centers relative to the between-class spread. Our technique for fitting these models permits a natural blend with nonparametric versions of LDA.
Keywords: Classification, Pattern Recognition, Clustering, Nonparametric, Penalized.
1 Introduction
In the generic classification or discrimination problem, the outcome of interest G falls into J unordered classes, which for convenience we denote by the set $\mathcal{J} = \{1, 2, 3, \ldots, J\}$. We wish to build a rule for pred...
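The idea of fitting a Gaussian mixture to each class and classifying by the largest prior-weighted mixture density can be illustrated with a small sketch. It is not the fitting procedure of the paper: the one-dimensional setting, the shared variance, the function names and the toy model below are all assumptions made for illustration, and the mixture parameters are given by hand rather than estimated.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def class_scores(x, model, var):
    """Return the prior-weighted mixture density of x for every class.

    model maps a class label to (class_prior, [(mixing_weight, center), ...]).
    All subclass components share the same variance `var` (an assumption).
    """
    scores = {}
    for label, (prior, components) in model.items():
        density = sum(w * gaussian_pdf(x, center, var) for w, center in components)
        scores[label] = prior * density
    return scores

def classify(x, model, var=1.0):
    """Assign x to the class with the largest prior-weighted mixture density."""
    scores = class_scores(x, model, var)
    return max(scores, key=scores.get)

# Hypothetical toy model: class "a" is clustered around -3 and +3,
# class "b" is a single cluster around 0.
toy_model = {
    "a": (0.5, [(0.5, -3.0), (0.5, 3.0)]),
    "b": (0.5, [(1.0, 0.0)]),
}
```

With this toy model, a single Gaussian per class would place the mean of class "a" at 0, on top of class "b"; the two subclass centers are what let points near -3 and +3 be assigned to "a".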
Data Exploration Using Self-Organizing Maps
- Acta Polytechnica Scandinavica, Mathematics, Computing and Management in Engineering Series No. 82, 1997
Cited by 93 (4 self)
Finding structures in vast multidimensional data sets, be they measurement data, statistics, or textual documents, is difficult and time-consuming. Interesting, novel relations between the data items may be hidden in the data. The self-organizing map (SOM) algorithm of Kohonen can be used to aid the exploration: the structures in the data sets can be illustrated on special map displays. In this work, the methodology of using SOMs for exploratory data analysis or data mining is reviewed and developed further. The properties of the maps are compared with the properties of related methods intended for visualizing high-dimensional multivariate data sets. In a set of case studies the SOM algorithm is applied to analyzing electroencephalograms, to illustrating structures of the standard of living in the world, and to organizing full-text document collections. Measures are proposed for evaluating the quality of different types of maps in representing a given data set, and for measuring the robu...
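For readers unfamiliar with the SOM algorithm mentioned above, a single online training step can be sketched as follows. This is a generic textbook-style update, not code from the cited work: the one-dimensional grid, the Gaussian neighbourhood function, the fixed learning rate and all names are assumptions chosen to keep the example short.

```python
import math

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to the input x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))

def som_step(weights, x, learning_rate, radius):
    """One online update of a 1-D map, modifying `weights` in place.

    The winning unit and its grid neighbours are pulled toward x; the pull
    decays with grid distance from the winner (Gaussian neighbourhood).
    Returns the index of the winning unit.
    """
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
        weights[i] = [wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

# Hypothetical map with three units laid out on a line.
units = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
winner = som_step(units, [2.5, 2.5], learning_rate=0.5, radius=1.0)
```

Because neighbouring units are updated together, units that are close on the grid end up with similar weight vectors, which is what makes the resulting map displays usable for visual exploration.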
Merging and Splitting Eigenspace Models
- 2000
Cited by 50 (0 self)
We present new deterministic methods that, given two eigenspace models, each representing a set of n-dimensional observations, will: (1) merge the models to yield a representation of the union of the sets; (2) split one model from another to represent the difference between the sets; as this is done, we accurately keep track of the mean.
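The bookkeeping behind merging can be illustrated at the level of means and covariances. The sketch below is not the authors' method, which works on eigenvectors and eigenvalues; it only shows the standard identity for combining the count, mean and covariance of two sets without revisiting the raw observations. The function names and the divide-by-N covariance convention are assumptions.

```python
def mean_and_cov(points):
    """Mean vector and biased (divide-by-N) covariance matrix of a list of points."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
            for j in range(dim)] for i in range(dim)]
    return mean, cov

def merge_stats(n1, mean1, cov1, n2, mean2, cov2):
    """Combine counts, means and covariances of two sets without the raw data."""
    n = n1 + n2
    dim = len(mean1)
    # The merged mean is the count-weighted average of the two means.
    mean = [(n1 * mean1[i] + n2 * mean2[i]) / n for i in range(dim)]
    # The merged covariance adds a rank-one term for the shift between the means.
    diff = [mean1[i] - mean2[i] for i in range(dim)]
    cov = [[(n1 * cov1[i][j] + n2 * cov2[i][j]) / n
            + (n1 * n2 / n ** 2) * diff[i] * diff[j]
            for j in range(dim)] for i in range(dim)]
    return n, mean, cov
```

The rank-one term is why the mean has to be tracked explicitly: two sets with identical covariances but different means still produce a merged covariance that differs from either input.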
Topography And Ocular Dominance: A Model Exploring Positive Correlations
- 1993
Cited by 36 (4 self)
The map from eye to brain in vertebrates is topographic, i.e. neighbouring points in the eye map to neighbouring points in the brain. In addition, when two eyes innervate the same target structure, the two sets of fibres segregate to form ocular dominance stripes. Experimental evidence from the frog and goldfish suggests that these two phenomena may be subserved by the same mechanisms. We present a computational model that addresses the formation of both topography and ocular dominance. The model is based on a form of competitive learning with subtractive enforcement of a weight normalization rule. Inputs to the model are distributed patterns of activity presented simultaneously in both eyes. An important aspect of this model is that ocular dominance segregation can occur when the two eyes are positively correlated, whereas previous models have tended to assume zero or negative correlations between the eyes. This allows investigation of the dependence of the pattern of stripes on the d...
Clustering methods for the analysis of DNA microarray data
- 1999
Cited by 33 (0 self)
It is now possible to simultaneously measure the expression of thousands of genes during cellular differentiation and response, through the use of DNA microarrays. A major statistical task is to understand the structure in the data that arise from this technology. In this paper we review various methods of clustering, and illustrate how they can be used to arrange both the genes and cell lines from a set of DNA microarray experiments. The methods discussed are global clustering techniques including hierarchical, K-means, and block clustering, and tree-structured vector quantization. Finally, we propose a new method for identifying structure in subsets of both genes and cell lines that are potentially obscured by the global clustering approaches.
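Of the global clustering techniques listed, K-means is the simplest to sketch. The version below is the standard Lloyd iteration, not code from the paper; the function names, the explicit initial centers and the toy "expression profiles" are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centers, max_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(len(centers)),
                          key=lambda k: squared_distance(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Hypothetical two-condition expression profiles for six genes.
profiles = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, labels = kmeans(profiles, [profiles[0], profiles[3]])
```

Rows here stand for genes; clustering the cell lines instead would amount to running the same routine on the transposed data.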
Learning Prototype Models for Tangent Distance
- Advances in Neural Information Processing Systems 7, 1995
Cited by 28 (1 self)
Simard, LeCun & Denker (1993) showed that the performance of near-neighbor classification schemes for handwritten character recognition can be improved by incorporating invariance to specific transformations in the underlying distance metric, the so-called tangent distance. The resulting classifier, however, can be prohibitively slow and memory intensive due to the large number of prototypes that need to be stored and used in the distance comparisons. In this paper we develop rich models for representing large subsets of the prototypes. These models are either used singly per class, or as basic building blocks in conjunction with the K-means clustering algorithm.
* After September 1, 1994: Statistics Department, Sequoia Hall, Stanford University, CA 94305. Email: trevor@playfair.stanford.edu
1 INTRODUCTION
Local algorithms such as K-nearest neighbor (NN) perform well in pattern recognition, even though they often assume the simplest distance on the pattern space. It has re...
A Connectionist View on Document Classification
- In Proc. Australasian Database Conf., 1995
Cited by 26 (22 self)
Properly structured software libraries are crucial for the success of software reuse. Specifically, the structure of the software library ought to reflect the functional similarity of the stored software components in order to facilitate the retrieval process. We propose the application of artificial neural network technology to achieve such a structured library. In more detail, we rely on full-text indexing of the software manual in order to obtain the software representation. This software representation is further used as the input data during the training process of an artificial neural network adhering to the unsupervised learning paradigm. The distinctive feature of this very model is to make the semantic relationship between the stored software components geographically explicit. Thus, the actual user of the software library gets a notion of the semantic relationship between the components in terms of their geographical closeness.
1 Introduction
Software reuse is concerned with...
Speech Recognition using Neural Networks
- 1995
Cited by 21 (0 self)
This thesis examines how artificial neural networks can benefit a large vocabulary, speaker independent, continuous speech recognition system. Currently, most speech recognition systems are based on hidden Markov models (HMMs), a statistical framework that supports both acoustic and temporal modeling. Despite their state-of-the-art performance, HMMs make a number of suboptimal modeling assumptions that limit their potential effectiveness. Neural networks avoid many of these assumptions, while they can also learn complex functions, generalize effectively, tolerate noise, and support parallelism. While neural networks can readily be applied to acoustic modeling, it is not yet clear how they can be used for temporal modeling. Therefore, we explore a class of systems called NN-HMM hybrids, in which neural networks perform acoustic modeling, and HMMs perform temporal modeling. We argue that an NN-HMM hybrid has several theoretical advantages over a pure HMM system, including better acoustic ...
How hallucinations may arise from brain mechanisms of learning, attention, and volition
- Journal of the International Neuropsychological Society, 1999
"... Invited article for the ..."
Learning the Semantic Similarity of Reusable Software Components
- 1994
Cited by 15 (7 self)
Properly structured software libraries are crucial for the success of software reuse. Specifically, the structure of the software library ought to reflect the functional similarity of the stored software components in order to facilitate the retrieval process. We propose the application of artificial neural network technology to achieve such a structured library. In more detail, we utilize an artificial neural network adhering to the unsupervised learning paradigm. The distinctive feature of this very model is to make the semantic relationship between the stored software components geographically explicit. Thus, the actual user of the software library gets a notion of the semantic relationship between the components in terms of their geographical closeness.
1. Introduction
Software reuse is concerned with the technological and organizational background of using already existing software components to build new applications. Software reuse is supposed to increase both the productivity ...

