Results 1 - 10 of 100
Optimality Theory: Constraint interaction in Generative Grammar, 1993
Cited by 789 (23 self)
ROA Version, 8/2002. Essentially identical to the Tech Report, with new pagination (but the same footnote and example numbering); correction of typos, oversights & outright errors; improved typography; and occasional small-scale clarificatory rewordings. Citation should include reference to this version.
A distributed, developmental model of word recognition and naming - Psychological Review, 1989
Cited by 302 (35 self)
A parallel distributed processing model of visual word recognition and pronunciation is described. The model consists of sets of orthographic and phonological units and an interlevel of hidden units. Weights on connections between units were modified during a training phase using the back-propagation learning algorithm. The model simulates many aspects of human performance, including (a) differences between words in terms of processing difficulty, (b) pronunciation of novel items, (c) differences between readers in terms of word recognition skill, (d) transitions from beginning to skilled reading, and (e) differences in performance on lexical decision and naming tasks. The model's behavior early in the learning phase corresponds to that of children acquiring word recognition skills. Training with a smaller number of hidden units produces output characteristic of many dyslexic readers. Naming is simulated without pronunciation rules, and lexical decisions are simulated without accessing word-level representations. The performance of the model is largely determined by three factors: the nature of the input, a significant fragment of written English; the learning rule, which encodes the implicit structure of the orthography in the weights on connections; and the architecture of the system, which influences the scope of what can be learned. The recognition and pronunciation of words is one of the cen-
Recursive Distributed Representations - Artificial Intelligence, 1990
Cited by 299 (9 self)
A long-standing difficulty for connectionist modeling has been how to represent variable-sized recursive data structures, such as trees and lists, in fixed-width patterns. This paper presents a connectionist architecture which automatically develops compact distributed representations for such compositional structures, as well as efficient accessing mechanisms for them. Patterns which stand for the internal nodes of fixed-valence trees are devised through the recursive use of back-propagation on three-layer autoassociative encoder networks. The resulting representations are novel, in that they combine apparently immiscible aspects of features, pointers, and symbol structures. They form a bridge between the data structures necessary for high-level cognitive tasks and the associative, pattern recognition machinery provided by neural networks. 2 J. B. Pollack 1. Introduction One of the major stumbling blocks in the application of Connectionism to higher-level cognitive tasks, such as Na...
Natural Language Processing with Modular PDP Networks and Distributed Lexicon - Cognitive Science, 1991
Cited by 77 (13 self)
An approach to connectionist natural language processing is proposed, which is based on hierarchically organized modular Parallel Distributed Processing (PDP) networks and a central lexicon of distributed input/output representations. The modules communicate using these representations, which are global and publicly available in the system. The representations are developed automatically by all networks while they are learning their processing tasks. The resulting representations reflect the regularities in the subtasks, which facilitates robust processing in the face of noise and damage, supports improved generalization, and provides expectations about possible contexts. The lexicon can be extended by cloning new instances of the items, that is, by generating a number of items with known processing properties and distinct identities. This technique combinatorially increases the processing power of the system. The recurrent FGREP module, together with a central lexicon, is used as a ba...
Learning Semantic Grammars with Constructive Inductive Logic Programming - In Proceedings of the Eleventh National Conference on Artificial Intelligence, 1993
Cited by 63 (13 self)
Automating the construction of semantic grammars is a difficult and interesting problem for machine learning. This paper shows how the semantic-grammar acquisition problem can be viewed as the learning of search-control heuristics in a logic program. Appropriate control rules are learned using a new first-order induction algorithm that automatically invents useful syntactic and semantic categories. Empirical results show that the learned parsers generalize well to novel sentences and out-perform previous approaches based on connectionist techniques. Introduction Designing computer systems to "understand" natural language input is a difficult task. The laboriously hand-crafted computational grammars supporting natural language applications are often inefficient, incomplete and ambiguous. The difficulty in constructing adequate grammars is an example of the "knowledge acquisition bottleneck" which has motivated much research in machine learning. While numerous researchers have studied ...
Script Recognition with Hierarchical Feature Maps - Connection Science, 1990
Cited by 59 (8 self)
The hierarchical feature map system recognizes an input story as an instance of a particular script by classifying it at three levels: scripts, tracks and role bindings. The recognition taxonomy, i.e. the breakdown of each script into the tracks and roles, is extracted automatically and independently for each script from examples of script instantiations in an unsupervised self-organizing process. The process resembles human learning in that the differentiation of the most frequently encountered scripts becomes gradually the most detailed. The resulting structure is a hierarchical pyramid of feature maps. The hierarchy visualizes the taxonomy and the maps lay out the topology of each level. The number of input lines and the self-organization time are considerably reduced compared to the ordinary single-level feature mapping. The system can recognize incomplete stories and recover the missing events. The taxonomy also serves as memory organization for script-based episodic memory. The map...
Distributed Representations and Nested Compositional Structure, 1994
Cited by 54 (11 self)
Distributed representations are attractive for a number of reasons. They offer the possibility of representing concepts in a continuous space, they degrade gracefully with noise, and they can be processed in a parallel network of simple processing elements. However, the problem of representing nested structure in distributed representations has been for some time a prominent concern of both proponents and critics of connectionism [Fodor and Pylyshyn 1988; Smolensky 1990; Hinton 1990]. The lack of connectionist representations for complex structure has held back progress in tackling higher-level cognitive tasks such as language understanding and reasoning. In this thesis I review connectionist representations and propose a method for the distributed representation of nested structure, which I call "Holographic Reduced Representations" (HRRs). HRRs provide an implementation of Hinton's [1990] "reduced descriptions". HRRs use circular convolution to associate atomic items, which are rep...
Word Space - Advances in Neural Information Processing Systems 5, 1993
Cited by 53 (0 self)
Representations for semantic information about words are necessary for many applications of neural networks in natural language processing. This paper describes an efficient, corpus-based method for inducing distributed semantic representations for a large number of words (50,000) from lexical co-occurrence statistics by means of a large-scale linear regression. The representations are successfully applied to word sense disambiguation using a nearest neighbor method.
Subsymbolic case-role analysis of sentences with embedded clauses - Cognitive Science, 1996
Cited by 48 (6 self)
A distributed neural network model called SPEC for processing sentences with recursive relative clauses is described. The model is based on separating the tasks of segmenting the input word sequence into clauses, forming the case-role representations, and keeping track of the recursive embeddings into different modules. The system needs to be trained only with the basic sentence constructs, and it generalizes not only to new instances of familiar relative clause structures, but to novel structures as well. SPEC exhibits plausible memory degradation as the depth of the center embeddings increases, its memory is primed by earlier constituents, and its performance is aided by semantic constraints between the constituents. The ability to process structure is largely due to a central executive network that monitors and controls the execution of the entire system. This way, in contrast to earlier subsymbolic systems, parsing is modeled as a controlled high-level process rather than one based on automatic reflex responses.

