Results 1–10 of 81
Hierarchical mixtures of experts and the EM algorithm
Neural Computation, 1994
Abstract

Cited by 764 (20 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIMs). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
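The EM procedure described in this abstract can be sketched in its simplest form. The snippet below fits a one-level (flat) mixture of two linear experts with a shared noise variance and input-independent mixing coefficients; the paper's actual architecture is hierarchical and uses GLIM gating networks, so the data, dimensions, and fixed gate here are illustrative simplifications, not the authors' algorithm.

```python
import numpy as np

# One-level mixture of two linear experts fit by EM.  The paper's
# model is hierarchical with GLIM gating networks; this flat version
# with input-independent mixing coefficients is only a sketch.
rng = np.random.default_rng(0)
N, K = 200, 2
x = rng.uniform(-1, 1, N)
# data drawn from two latent linear regimes
y = np.where(x < 0, 2.0 * x, -1.0 * x) + 0.05 * rng.standard_normal(N)

w = rng.standard_normal(K)        # expert slopes
b = rng.standard_normal(K)        # expert intercepts
pi = np.full(K, 1.0 / K)          # mixing coefficients
sigma2 = 1.0                      # shared noise variance
X = np.stack([x, np.ones(N)], axis=1)

for _ in range(50):
    # E-step: posterior responsibility of each expert for each point
    resid = y[:, None] - (x[:, None] * w + b)            # (N, K)
    logp = -0.5 * resid**2 / sigma2 + np.log(pi)
    logp -= logp.max(axis=1, keepdims=True)
    r = np.exp(logp)
    r /= r.sum(axis=1, keepdims=True)

    # M-step: weighted least squares per expert, then noise update
    for k in range(K):
        Xw = X * r[:, k:k + 1]
        beta = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        w[k], b[k] = beta
    pi = r.mean(axis=0)
    resid = y[:, None] - (x[:, None] * w + b)
    sigma2 = float((r * resid**2).sum() / N)
```

With well-separated regimes the responsibilities typically harden and the slopes approach the generating values, though EM is only guaranteed to increase the likelihood, not to reach the global optimum.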
Hidden Markov models in computational biology: applications to protein modeling
JOURNAL OF MOLECULAR BIOLOGY, 1994
Abstract

Cited by 562 (38 self)
Hidden Markov Models (HMMs) are applied to the problems of statistical modeling, database searching and multiple sequence alignment of protein families and protein domains. These methods are demonstrated on the globin family, the protein kinase catalytic domain, and the EF-hand calcium binding motif. In each case the parameters of an HMM are estimated from a training set of unaligned sequences. After the HMM is built, it is used to obtain a multiple alignment of all the training sequences. It is also used to search the SWISS-PROT 22 database for other sequences that are members of the given protein family, or contain the given domain. The HMM produces multiple alignments of good quality that agree closely with the alignments produced by programs that incorporate three-dimensional structural information. When employed in discrimination tests (by examining how closely the sequences in a database fit the globin, kinase and EF-hand HMMs), the HMM is able to distinguish members of these families from nonmembers with a high degree of accuracy. Both the HMM and PROFILESEARCH (a technique used to search for relationships between a protein sequence and multiply aligned sequences) perform better in these tests than PROSITE (a dictionary of sites and patterns in proteins). The HMM appears to have a slight advantage over PROFILESEARCH in terms of lower rates of false
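The discrimination step this abstract mentions (scoring how well a sequence fits an HMM) reduces to computing the sequence likelihood with the forward algorithm. The toy model below is a generic two-state HMM over a four-letter alphabet with invented parameters, not a profile HMM with the match/insert/delete topology used in the paper.

```python
import numpy as np

# Scoring sequences under a toy HMM with the forward algorithm.
# The transition (A), emission (E) and start probabilities are
# invented; real profile HMMs have a match/insert/delete topology.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])             # state-transition matrix
E = np.array([[0.7, 0.1, 0.1, 0.1],    # emissions from state 0
              [0.1, 0.1, 0.1, 0.7]])   # emissions from state 1
start = np.array([0.5, 0.5])

def log_likelihood(obs):
    """Forward algorithm (probability domain; fine for short sequences)."""
    alpha = start * E[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * E[:, o]
    return float(np.log(alpha.sum()))

family_like = [0, 0, 0, 3, 3, 3]       # consistent with the model
random_seq = [1, 2, 1, 2, 1, 2]        # rarely emitted by either state
```

In a database search one would compare such log-likelihoods (suitably length-normalized) against a threshold to decide family membership.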
A Unifying Review of Linear Gaussian Models
1999
Abstract

Cited by 277 (18 self)
Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.
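The single basic generative model the review unifies can be written x = Cz + v with Gaussian latent state and noise. The sketch below draws one observation from the static (factor-analysis) case and computes the standard Gaussian posterior mean of the latent state; the dimensions and parameter values are arbitrary illustrations.

```python
import numpy as np

# The basic linear Gaussian generative model x = C z + v, with
# z ~ N(0, I) and v ~ N(0, R).  Sizes and values are illustrative.
rng = np.random.default_rng(1)
k, p = 2, 5                        # latent and observed dimensions
C = rng.standard_normal((p, k))    # loading (generative) matrix
R = 0.1 * np.eye(p)                # observation-noise covariance

# draw one observation from the model
z = rng.standard_normal(k)
x = C @ z + rng.multivariate_normal(np.zeros(p), R)

# inference: E[z | x] = beta x with beta = C^T (C C^T + R)^{-1},
# the usual Gaussian conditioning formula (the factor-analysis E-step)
beta = C.T @ np.linalg.inv(C @ C.T + R)
z_hat = beta @ x
```

Swapping in dynamics on z recovers the Kalman filter, and discretizing z recovers mixture and hidden Markov models, which is the unification the paper develops.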
Dirichlet Mixtures: A Method for Improving Detection of Weak but Significant Protein Sequence Homology
1996
Abstract

Cited by 136 (22 self)
This paper presents the mathematical foundations of Dirichlet mixtures, which have been used to improve database search results for homologous sequences, when a variable number of sequences from a protein family or domain are known. We present a method for condensing the information in a protein database into a mixture of Dirichlet densities. These mixtures are designed to be combined with observed amino acid frequencies, to form estimates of expected amino acid probabilities at each position in a profile, hidden Markov model, or other statistical model. These estimates give a statistical model greater generalization capacity, such that remotely related family members can be more reliably recognized by the model. Dirichlet mixtures have been shown to outperform substitution matrices and other methods for computing these expected amino acid distributions in database search, resulting in fewer false positives and false negatives for the families tested. This paper corrects a previously p...
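The combination step described in this abstract (blending observed counts with a Dirichlet mixture prior to get expected amino-acid probabilities) can be sketched over a toy four-letter alphabet. The two mixture components and their weights below are invented; the paper estimates many 20-dimensional components from real protein databases.

```python
import math
import numpy as np

# Dirichlet-mixture pseudocount estimation over a toy 4-letter
# alphabet.  Both components and the mixture weights are invented.
alphas = np.array([[10.0, 1.0, 1.0, 1.0],    # component favoring letter 0
                   [1.0, 1.0, 1.0, 10.0]])   # component favoring letter 3
q = np.array([0.5, 0.5])                     # mixture coefficients

def lbeta(a):
    """Log of the multivariate Beta function."""
    return sum(math.lgamma(v) for v in a) - math.lgamma(sum(a))

def expected_probs(counts):
    """Posterior mean letter distribution given observed column counts."""
    counts = np.asarray(counts, float)
    # log P(counts | component j) under the Dirichlet-multinomial
    # (the multinomial coefficient is constant across j and cancels)
    logm = np.array([lbeta(counts + a) - lbeta(a) for a in alphas])
    logm += np.log(q)
    post = np.exp(logm - logm.max())
    post /= post.sum()                       # P(component j | counts)
    # mix the per-component posterior mean distributions
    means = (counts + alphas) / (counts.sum() + alphas.sum(axis=1))[:, None]
    return post @ means
```

Each component contributes its pseudocounts in proportion to its posterior probability given the observed counts, which is what lets sparse columns generalize beyond the letters actually seen.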
Conjunctive Representations in Learning and Memory: Principles of Cortical and Hippocampal Function
PSYCHOLOGICAL REVIEW, 2001
Abstract

Cited by 116 (12 self)
We present a theoretical framework for understanding the roles of the hippocampus and neocortex in learning and memory. This framework incorporates a theme found in many theories of hippocampal function: that the hippocampus is responsible for developing conjunctive representations binding together stimulus elements into a unitary representation that can later be recalled from partial input cues. This idea appears problematic, however, because it is contradicted by the fact that hippocampally lesioned rats can learn nonlinear discrimination problems that require conjunctive representations. Our framework accommodates this finding by establishing a principled division of labor between the cortex and hippocampus, where the cortex is responsible for slow learning that integrates over multiple experiences to extract generalities, while the hippocampus performs rapid learning of the arbitrary contents of individual experiences. This framework shows that nonlinear discrimination problems are not good tests of hippocampal function, and suggests that tasks involving rapid, incidental conjunctive learning are better. We implement this framework in a computational neural network model, and show that it can account for a wide range of data in animal learning, thus validating our theoretical ideas, and providing a number of insights and predictions about these learning phenomena.
Modeling hippocampal and neocortical contributions to recognition memory: A complementary-learning-systems approach
PSYCHOLOGICAL REVIEW, 2003
Abstract

Cited by 114 (17 self)
We present a computational neural network model of recognition memory based on the biological structures of the hippocampus and medial temporal lobe cortex (MTLC), which perform complementary learning functions. The hippocampal component of the model contributes to recognition by recalling specific studied details. MTLC cannot support recall, but it is possible to extract a scalar familiarity signal from MTLC that tracks how well the test item matches studied items. We present simulations that establish key qualitative differences in the operating characteristics of the hippocampal recall and MTLC familiarity signals, and we identify several manipulations (e.g., target-lure similarity, interference) that differentially affect the two signals. We also use the model to address the stochastic relationship between recall and familiarity (i.e., are they independent?), and the effects of partial vs. complete hippocampal
Novelty Detection and Neural Network Validation
1994
Abstract

Cited by 92 (2 self)
One of the key factors limiting the use of neural networks in many industrial applications has been the difficulty of demonstrating that a trained network will continue to generate reliable outputs once it is in routine use. An important potential source of errors arises from novel input data, that is, input data which differ significantly from the data used to train the network. In this paper we investigate the relationship between the degree of novelty of input data and the corresponding reliability of the outputs from the network. We describe a quantitative procedure for assessing novelty, and we demonstrate its performance using an application involving the monitoring of oil flow in multi-phase pipelines.
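A quantitative novelty measure of the kind this abstract describes can be sketched by estimating the density of the training inputs and flagging test inputs that fall in low-density regions. The kernel density estimator, bandwidth, and threshold below are illustrative choices, not the procedure from the paper.

```python
import numpy as np

# Input-novelty scoring via a simple 2-D Gaussian kernel density
# estimate over the training inputs.  Training distribution,
# bandwidth h, and the density threshold are all illustrative.
rng = np.random.default_rng(2)
train = rng.standard_normal((500, 2))          # training inputs

def kde_density(x, data, h=0.3):
    """Average Gaussian kernel response at point x."""
    d = data - x                               # (N, 2) offsets
    return np.mean(np.exp(-0.5 * (d**2).sum(axis=1) / h**2)) / (2 * np.pi * h**2)

threshold = 0.01

def is_novel(x):
    """Flag inputs lying in regions the training data never covered."""
    return kde_density(np.asarray(x, float), train) < threshold
```

Outputs for inputs flagged as novel would then be treated as unreliable, since the network is extrapolating outside its training distribution.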
Clustering Based on Conditional Distributions in an Auxiliary Space
Neural Computation, 2001
Abstract

Cited by 80 (22 self)
We study the problem of learning groups or categories that are local
Object indexing using an iconic sparse distributed memory
1995
Abstract

Cited by 62 (9 self)
A general-purpose object indexing technique is described that combines the virtues of principal component analysis with the favorable matching properties of high-dimensional spaces to achieve high precision recognition. An object is represented by a set of high-dimensional iconic feature vectors comprised of the responses of derivative-of-Gaussian filters at a range of orientations and scales. Since these filters can be shown to form the eigenvectors of arbitrary images containing both natural and man-made structures, they are well-suited for indexing in disparate domains. The indexing algorithm uses an active vision system in conjunction with a modified form of Kanerva's sparse distributed memory which facilitates interpolation between views and provides a convenient platform for learning the association between an object's appearance and its identity. The robustness of the indexing method was experimentally confirmed by subjecting the method to a range of viewing conditions and the accuracy was verified using a well-known model database containing a number of complex 3D objects under varying pose.
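Kanerva's sparse distributed memory, which this indexing algorithm builds on, can be sketched in its classic binary form: a fixed set of random hard addresses, Hamming-ball activation, counter-based writes, and majority-vote reads. The paper uses a modified, real-valued iconic variant; all sizes and the activation radius here are toy choices.

```python
import numpy as np

# Classic binary Kanerva sparse distributed memory.  The paper's
# version is modified for real-valued iconic feature vectors; the
# dimensions and activation radius here are toy choices.
rng = np.random.default_rng(3)
n_bits, n_hard = 64, 2000
hard = rng.integers(0, 2, (n_hard, n_bits))    # random hard addresses
counters = np.zeros((n_hard, n_bits), int)     # one counter row per address
radius = 24                                    # Hamming activation radius

def active(addr):
    """Hard addresses within the Hamming ball around addr."""
    return (hard != addr).sum(axis=1) <= radius

def write(addr, word):
    # increment counters for 1-bits, decrement for 0-bits
    counters[active(addr)] += 2 * word - 1

def read(addr):
    # majority vote over the counters of all active addresses
    s = counters[active(addr)].sum(axis=0)
    return (s > 0).astype(int)

# autoassociative storage: recall a stored pattern from a noisy cue
word = rng.integers(0, 2, n_bits)
write(word, word)
noisy = word.copy()
noisy[:5] ^= 1                                 # corrupt 5 address bits
recalled = read(noisy)
```

The overlap between the Hamming balls of the clean and corrupted addresses is what gives the memory its tolerance to noisy cues, the property exploited here for interpolation between object views.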