Results 1 - 10 of 39
Finding structure in time
COGNITIVE SCIENCE, 1990
Cited by 1313 (17 self)
Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic/semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands; indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly context-dependent while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type/token distinction.
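As an illustration of the recurrence this abstract describes (hidden unit patterns fed back to themselves), here is a minimal sketch in plain Python. The function names, the tanh activation, and the fixed weights are assumptions made for the example; they are not taken from the paper, and no training is shown.

```python
import math

def srn_step(x, context, w_in, w_ctx, bias):
    """One step of a simple recurrent network: the new hidden pattern
    depends on the current input and on the previous hidden pattern."""
    hidden = []
    for j in range(len(bias)):
        total = bias[j]
        total += sum(w_in[j][i] * x[i] for i in range(len(x)))
        total += sum(w_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(math.tanh(total))
    return hidden

def run_sequence(inputs, w_in, w_ctx, bias):
    """Feed a sequence one item at a time, copying the hidden units back
    as the context for the next step."""
    context = [0.0] * len(bias)
    states = []
    for x in inputs:
        context = srn_step(x, context, w_in, w_ctx, bias)
        states.append(context)
    return states

# Hypothetical fixed weights (1 input unit, 2 hidden units).
W_IN = [[1.0], [-1.0]]
W_CTX = [[0.5, -0.5], [0.3, 0.8]]
BIAS = [0.0, 0.1]

# Both sequences end with the same input but have different histories.
states_a = run_sequence([[0.0], [1.0]], W_IN, W_CTX, BIAS)
states_b = run_sequence([[1.0], [1.0]], W_IN, W_CTX, BIAS)
```

The final hidden patterns in `states_a` and `states_b` differ even though the last input is identical, which is the sense in which the internal representation reflects prior internal states.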
Connectionist Learning Procedures
ARTIFICIAL INTELLIGENCE, 1989
Cited by 290 (6 self)
A major goal of research on networks of neuron-like processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way that internal units which are not part of the input or output come to represent important features of the task domain. Several interesting gradient-descent procedures have recently been discovered. Each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network. The strength is then adjusted in the direction that decreases the error. These relatively simple, gradient-descent learning procedures work well for small tasks and the new challenge is to find ways of improving their convergence rate and their generalization abilities so that they can be applied to larger, more realistic tasks.
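The gradient-descent step summarized in this abstract (compute the derivative of a global error with respect to each connection strength, then adjust the strength in the direction that decreases the error) can be sketched for a single linear unit. The squared-error measure, the learning rate, and the toy samples below are assumptions chosen for illustration, not values from the article.

```python
def squared_error(weights, samples):
    """Global error: half the summed squared difference between output and target."""
    total = 0.0
    for inputs, target in samples:
        output = sum(w * x for w, x in zip(weights, inputs))
        total += 0.5 * (output - target) ** 2
    return total

def gradient(weights, samples):
    """Derivative of the global error with respect to each connection strength."""
    grads = [0.0] * len(weights)
    for inputs, target in samples:
        output = sum(w * x for w, x in zip(weights, inputs))
        for i, x in enumerate(inputs):
            grads[i] += (output - target) * x
    return grads

def descend(weights, samples, rate=0.1, steps=200):
    """Repeatedly move each weight against its error derivative."""
    for _ in range(steps):
        grads = gradient(weights, samples)
        weights = [w - rate * g for w, g in zip(weights, grads)]
    return weights

# Hypothetical training set, fitted exactly by the weights [2.0, -1.0].
SAMPLES = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
start = [0.0, 0.0]
learned = descend(start, SAMPLES)
```

On this small problem the procedure converges quickly; the abstract's point is that convergence rate and generalization become the hard part on larger tasks.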
Neural Net Architectures for Temporal Sequence Processing
1994
Cited by 103 (0 self)
I present a general taxonomy of neural net architectures for processing time-varying patterns. This taxonomy subsumes many existing architectures in the literature, and points to several promising architectures that have yet to be examined. Any architecture that processes time-varying patterns requires two conceptually distinct components: a short-term memory that holds on to relevant past events and an associator that uses the short-term memory to classify or predict. My taxonomy is based on a characterization of short-term memory models along the dimensions of form, content, and adaptability. Experiments on predicting future values of a financial time series (US dollar-Swiss franc exchange rates) are presented using several alternative memory models. The results of these experiments serve as a baseline against which more sophisticated architectures can be compared. Neural networks have proven to be a promising alternative to traditional techniques for nonlinear temporal prediction t...
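To make the two-component split concrete, the sketch below pairs two simple short-term memories with a linear associator. The specific memory forms (a fixed window of recent values and an exponentially decaying trace), the function names, and the weights are assumptions for illustration; the abstract itself does not specify them.

```python
def delay_line(series, depth):
    """Memory state at each step: the last `depth` values, newest first,
    zero-padded before the start of the series."""
    states = []
    for t in range(len(series)):
        window = [series[t - k] if t - k >= 0 else 0.0 for k in range(depth)]
        states.append(window)
    return states

def exponential_trace(series, decay):
    """Memory state at each step: a running average that fades old values."""
    states = []
    trace = 0.0
    for value in series:
        trace = decay * trace + (1.0 - decay) * value
        states.append(trace)
    return states

def predict_next(memory_state, weights):
    """Associator: a linear readout of the memory state."""
    return sum(w * m for w, m in zip(weights, memory_state))

SERIES = [1.0, 2.0, 3.0, 4.0]
windows = delay_line(SERIES, depth=2)
traces = exponential_trace(SERIES, decay=0.5)

# Hypothetical readout weights that extrapolate a straight line.
forecast = predict_next(windows[-1], [2.0, -1.0])
```

Swapping `delay_line` for `exponential_trace` changes what the associator can see about the past without changing the associator itself, which is the separation the taxonomy relies on.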
On The Problem Of Local Minima In Backpropagation
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992
Cited by 60 (16 self)
Supervised Learning in Multi-Layered Neural Networks (MLNs) has been recently proposed through the well-known Backpropagation algorithm. This is a gradient method which can get stuck in local minima, as simple examples can show. In this paper, some conditions on the network architecture and the learning environment are proposed which ensure the convergence of the Backpropagation algorithm. It is proven in particular that the convergence holds if the classes are linearly-separable. In this case, the experience gained in several experiments shows that MLNs exceed perceptrons in generalization to new examples.
Index Terms: Multi-Layered Networks, learning environment, Backpropagation, pattern recognition, linearly-separable classes.
I. Introduction
Supervised learning in Multi-Layered Networks can be accomplished thanks to Backpropagation (BP) ([19, 25, 31]). Its application to several different subjects [25], and, particularly, to pattern recognition ([3, 6, 8, 20, 27, 29]), has bee...
Learning as Extraction of Low-Dimensional Representations
Mechanisms of Perceptual Learning, 1996
Cited by 23 (7 self)
Psychophysical findings accumulated over the past several decades indicate that perceptual tasks such as similarity judgment tend to be performed on a low-dimensional representation of the sensory data. Low dimensionality is especially important for learning, as the number of examples required for attaining a given level of performance grows exponentially with the dimensionality of the underlying representation space. In this chapter, we argue that, whereas many perceptual problems are tractable precisely because their intrinsic dimensionality is low, the raw dimensionality of the sensory data is normally high, and must be reduced by a nontrivial computational process, which, in itself, may involve learning. Following a survey of computational techniques for dimensionality reduction, we show that it is possible to learn a low-dimensional representation that captures the intrinsic low-dimensional nature of certain classes of visual objects, thereby facilitating further learning of tasks...
Unsupervised Neural Network Learning Procedures . . .
1996
Cited by 21 (1 self)
In this article, we review unsupervised neural network learning procedures which can be applied to the task of preprocessing raw data to extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems.
Speech Recognition using Neural Networks
1995
Cited by 21 (0 self)
This thesis examines how artificial neural networks can benefit a large-vocabulary, speaker-independent, continuous speech recognition system. Currently, most speech recognition systems are based on hidden Markov models (HMMs), a statistical framework that supports both acoustic and temporal modeling. Despite their state-of-the-art performance, HMMs make a number of suboptimal modeling assumptions that limit their potential effectiveness. Neural networks avoid many of these assumptions, while they can also learn complex functions, generalize effectively, tolerate noise, and support parallelism. While neural networks can readily be applied to acoustic modeling, it is not yet clear how they can be used for temporal modeling. Therefore, we explore a class of systems called NN-HMM hybrids, in which neural networks perform acoustic modeling, and HMMs perform temporal modeling. We argue that an NN-HMM hybrid has several theoretical advantages over a pure HMM system, including better acoustic ...
Unified Integration of Explicit Knowledge and Learning by Example in Recurrent Networks
IEEE Transactions on Knowledge and Data Engineering, 1992
Cited by 14 (1 self)
We propose a novel unified approach for integrating explicit knowledge and learning by example in recurrent networks. The explicit knowledge is represented by automaton rules, which are directly injected into the connections of a network. This can be accomplished by using a technique based on linear programming, instead of learning from random initial weights. Learning is conceived as a refinement process and is mainly responsible for managing uncertain information. We present preliminary results for problems of automatic speech recognition.
Index Terms: Recurrent neural networks, learning automata, automatic speech recognition.
I. Introduction
The resurgence of interest in connectionist models has led several researchers to investigate their application to the building of "intelligent systems". Unlike symbolic models proposed in artificial intelligence, learning plays a central role in connectionist models. Many successful applications have mainly concerned perceptual tasks (see e....
Concept-Learning In The Absence Of Counter-Examples: An Autoassociation-Based Approach To Classification
1999
Cited by 14 (4 self)
The overwhelming majority of research currently pursued within the framework of concept-learning concentrates on discrimination-based learning, an inductive learning paradigm that relies on both examples and counter-examples of the concept. This emphasis, however, can present a practical problem: there are real-world engineering problems for which counter-examples are both scarce and difficult to gather. For these problems, recognition-based learning systems are much more appropriate because they do not use counter-examples in the concept-learning phase. The purpose of this dissertation is to analyze a connectionist recognition-based learning system (autoassociation-based classification) and answer the following questions:
• What features of the autoassociator make it ca...

