Results 1  10
of
367
Finding structure in time
 COGNITIVE SCIENCE
, 1990
"... Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a pro ..."
Abstract

Cited by 1533 (21 self)
 Add to MetaCart
Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic/semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands; indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly contextdependent while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type/token distinction.
Solving multiclass learning problems via errorcorrecting output codes
 Journal of Artificial Intelligence Research
, 1995
"... Multiclass learning problems involve nding a de nition for an unknown function f(x) whose range is a discrete set containing k>2values (i.e., k \classes"). The de nition is acquired by studying collections of training examples of the form hx i;f(x i)i. Existing approaches to multiclass learning ..."
Abstract

Cited by 564 (9 self)
 Add to MetaCart
Multiclass learning problems involve nding a de nition for an unknown function f(x) whose range is a discrete set containing k>2values (i.e., k \classes"). The de nition is acquired by studying collections of training examples of the form hx i;f(x i)i. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decisiontree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which errorcorrecting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of over tting avoidance techniques such as decisiontree pruning. Finally,we show thatlike the other methodsthe errorcorrecting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that errorcorrecting output codes provide a generalpurpose method for improving the performance of inductive learning programs on multiclass problems. 1.
Understanding Normal and Impaired Word Reading: Computational Principles in QuasiRegular Domains
 PSYCHOLOGICAL REVIEW
, 1996
"... We develop a connectionist approach to processing in quasiregular domains, as exemplified by English word reading. A consideration of the shortcomings of a previous implementation (Seidenberg & McClelland, 1989, Psych. Rev.) in reading nonwords leads to the development of orthographic and phonologi ..."
Abstract

Cited by 362 (86 self)
 Add to MetaCart
We develop a connectionist approach to processing in quasiregular domains, as exemplified by English word reading. A consideration of the shortcomings of a previous implementation (Seidenberg & McClelland, 1989, Psych. Rev.) in reading nonwords leads to the development of orthographic and phonological representations that capture better the relevant structure among the written and spoken forms of words. In a number of simulation experiments, networks using the new representations learn to read both regular and exception words, including lowfrequency exception words, and yet are still able to read pronounceable nonwords as well as skilled readers. A mathematical analysis of the effects of word frequency and spellingsound consistency in a related but simpler system serves to clarify the close relationship of these factors in influencing naming latencies. These insights are verified in subsequent simulations, including an attractor network that reproduces the naming latency data directly in its time to settle on a response. Further analyses of the network's ability to reproduce data on impaired reading in surface dyslexia support a view of the reading system that incorporates a graded divisionoflabor between semantic and phonological processes. Such a view is consistent with the more general Seidenberg and McClelland framework and has some similarities withbut also important differences fromthe standard dualroute account.
Recursive Distributed Representations
 Artificial Intelligence
, 1990
"... A longstanding difficulty for connectionist modeling has been how to represent variablesized recursive data structures, such as trees and lists, in fixedwidth patterns. This paper presents a connectionist architecture which automatically develops compact distributed representations for such compo ..."
Abstract

Cited by 339 (9 self)
 Add to MetaCart
A longstanding difficulty for connectionist modeling has been how to represent variablesized recursive data structures, such as trees and lists, in fixedwidth patterns. This paper presents a connectionist architecture which automatically develops compact distributed representations for such compositional structures, as well as efficient accessing mechanisms for them. Patterns which stand for the internal nodes of fixedvalence trees are devised through the recursive use of backpropagation on threelayer autoassociative encoder networks. The resulting representations are novel, in that they combine apparently immiscible aspects of features, pointers, and symbol structures. They form a bridge between the data structures necessary for highlevel cognitive tasks and the associative, pattern recognition machinery provided by neural networks. 2 J. B. Pollack 1. Introduction One of the major stumbling blocks in the application of Connectionism to higherlevel cognitive tasks, such as Na...
Connectionist Learning Procedures
 ARTIFICIAL INTELLIGENCE
, 1989
"... A major goal of research on networks of neuronlike processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way ..."
Abstract

Cited by 339 (6 self)
 Add to MetaCart
A major goal of research on networks of neuronlike processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way that internal units which are not part of the input or output come to represent important features of the task domain. Several interesting gradientdescent procedures have recently been discovered. Each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network. The strength is then adjusted in the direction that decreases the error. These relatively simple, gradientdescent learning procedures work well for small tasks and the new challenge is to find ways of improving their convergence rate and their generalization abilities so that they can be applied to larger, more realistic tasks.
Distributed representations, simple recurrent networks, and grammatical structure
 Machine Learning
, 1991
"... Abstract. In this paper three problems for a connectionist account of language are considered: 1. What is the nature of linguistic representations? 2. How can complex structural relationships such as constituent structure be represented? 3. How can the apparently openended nature of language be acc ..."
Abstract

Cited by 314 (16 self)
 Add to MetaCart
Abstract. In this paper three problems for a connectionist account of language are considered: 1. What is the nature of linguistic representations? 2. How can complex structural relationships such as constituent structure be represented? 3. How can the apparently openended nature of language be accommodated by a fixedresource system? Using a prediction task, a simple recurrent network (SRN) is trained on multiclausal sentences which contain multiplyembedded relative clauses. Principal component analysis of the hidden unit activation patterns reveals that the network solves the task by developing complex distributed representations which encode the relevant grammatical relations and hierarchical constituent structure. Differences between the SRN state representations and the more traditional pushdown store are discussed in the final section.
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
"... We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Ba ..."
Abstract

Cited by 309 (31 self)
 Add to MetaCart
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
An Evolutionary Algorithm that Constructs Recurrent Neural Networks
 IEEE Transactions on Neural Networks
, 1994
"... Standard methods for inducing both the structure and weight values of recurrent neural networks fit an assumed class of architectures to every task. This simplification is necessary because the interactions between network structure and function are not well understood. Evolutionary computation, whi ..."
Abstract

Cited by 218 (15 self)
 Add to MetaCart
Standard methods for inducing both the structure and weight values of recurrent neural networks fit an assumed class of architectures to every task. This simplification is necessary because the interactions between network structure and function are not well understood. Evolutionary computation, which includes genetic algorithms and evolutionary programming, is a populationbased search method that has shown promise in such complex tasks. This paper argues that genetic algorithms are inappropriate for network acquisition and describes an evolutionary program, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks. This algorithm's empirical acquisition method allows for the emergence of complex behaviors and topologies that are potentially excluded by the artificial architectural constraints imposed in standard network induction methods. To Appear in: IEEE Transactions on Neural Networks January The Ohio State University January 17, 1996 1 ...
Training a 3node neural network is NPComplete
 In Proceedings of the 1988 Workshop on Computational Learning Theory
, 1988
"... rivest~theory.lcs.mit.edu ..."
The Extraction of Refined Rules from KnowledgeBased Neural Networks
 Machine Learning
, 1993
"... Neural networks, despite their empiricallyproven abilities, have been little used for the refinement of existing knowledge because this task requires a threestep process. First, knowledge in some form must be inserted into a neural network. Second, the network must be refined. Third, knowledge mus ..."
Abstract

Cited by 198 (4 self)
 Add to MetaCart
Neural networks, despite their empiricallyproven abilities, have been little used for the refinement of existing knowledge because this task requires a threestep process. First, knowledge in some form must be inserted into a neural network. Second, the network must be refined. Third, knowledge must be extracted from the network. We have previously described a method for the first step of this process. Standard neural learning techniques can accomplish the second step. In this paper, we propose and empirically evaluate a method for the final, and possibly most difficult, step. This method efficiently extracts symbolic rules from trained neural networks. The four major results of empirical tests of this method are that the extracted rules: (1) closely reproduce (and can even exceed) the accuracy of the network from which they are extracted; (2) are superior to the rules produced by methods that directly refine symbolic rules; (3) are superior to those produced by previous techniques fo...