Doing without schema hierarchies: A recurrent connectionist approach to normal and impaired routine sequential action
- Psychological Review
, 2004
Cited by 33 (8 self)
In everyday tasks, selecting actions in the proper sequence requires a continuously updated representation of temporal context. Many existing models address this problem by positing a hierarchy of processing units, mirroring the roughly hierarchical structure of naturalistic tasks themselves. Such an approach has led to a number of difficulties, including a reliance on overly rigid sequencing mechanisms, an inability to account for context sensitivity in behavior, and a failure to address learning. We consider here an alternative framework, according to which the representation of temporal context is facilitated by recurrent connections within a network mapping from environmental inputs to actions. Applying this approach to a specific, and in many ways prototypical, everyday task (coffee-making), we examine its ability to account for several central characteristics of normal and impaired human performance. The model we consider learns to deal flexibly with a complex set of sequencing constraints, encoding contextual information at multiple time-scales within a single, distributed internal representation. Mildly degrading this context representation leads
A Connectionist Model of Sentence Comprehension and Production. Unpublished
, 2002
Cited by 30 (3 self)
The most predominant language processing theories have, for some time, been based largely on structured knowledge and relatively simple rules. These symbolic models intentionally segregate syntactic information processing from statistical information as well as semantic, pragmatic, and discourse influences, thereby minimizing the importance of these potential constraints in learning and processing language. While such models have the advantage of being relatively simple and explicit, they are inadequate to account for learning and validated ambiguity resolution phenomena. In recent years, interactive constraint-based theories of sentence processing have gained increasing support, as a growing body of empirical evidence demonstrates early influences of various factors on comprehension performance. Connectionist networks are one form of model that naturally reflect many properties of constraint-based theories, and thus provide a form in which those theories may be instantiated. Unfortunately, most of the connectionist language models implemented until now have involved severe limitations, restricting the phenomena they could address. Comprehension and production models have, by and large, been limited to simple sentences with small vocabularies (cf. St. John & McClelland, 1990). Most models that have addressed the problem of complex, multi-clausal sentence processing have been prediction networks (cf. Elman, 1991; Christiansen & Chater, 1999a). Although a useful component of a language processing system, prediction does not get at the heart of language: the interface between syntax and semantics.
Becoming Syntactic
Cited by 24 (1 self)
Psycholinguistic research has shown that the influence of abstract syntactic knowledge on performance is shaped by particular sentences that have been experienced. To explore this idea, the authors applied a connectionist model of sentence production to the development and use of abstract syntax. The model makes use of (a) error-based learning to acquire and adapt sequencing mechanisms and (b) meaning–form mappings to derive syntactic representations. The model is able to account for most of what is known about structural priming in adult speakers, as well as key findings in preferential looking and elicited production studies of language acquisition. The model suggests how abstract knowledge and concrete experience are balanced in the development and use of syntax.
Architectural Bias in Recurrent Neural Networks - Fractal Analysis
- IEEE Transactions on Neural Networks
, 2003
Cited by 23 (5 self)
We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e., even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tino, 2002; Tino, Cernansky & Benuskova, 2002; Tino, Cernansky & Benuskova, 2002a). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this paper we further extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram -- a scenario that has been frequently considered in the past, e.g., when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box-counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be explored to build Markovian predictive models, but also the activations form fractal clusters the dimension of which can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
Context-Free and Context-Sensitive Dynamics in Recurrent Neural Networks
, 2000
Cited by 23 (5 self)
Continuous-valued recurrent neural networks can learn mechanisms for processing context-free languages. The dynamics of such networks is usually based on damped oscillation around fixed points in state space and requires that the dynamical components are arranged in certain ways. It is shown that qualitatively similar dynamics with similar constraints hold for a^n b^n c^n, a context-sensitive language. The additional difficulty with a^n b^n c^n, compared with the context-free language a^n b^n, consists of "counting up" and "counting down" letters simultaneously. The network solution is to oscillate in two principal dimensions, one for counting up and one for counting down. This study focuses on the dynamics employed by the Sequential Cascaded Network, in contrast with the Simple Recurrent Network, and the use of Backpropagation Through Time. The solutions found generalize well beyond the training data; however, learning is not reliable. The contribution of this ...
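The "counting up" and "counting down" requirement in the abstract above can be illustrated with a small symbolic recognizer. This is only a hand-written sketch with two explicit integer counters, not the Sequential Cascaded Network or any learned dynamics from the paper; the function name accepts_anbncn and the variable names are assumptions made for illustration. The point it shows is that while the b's are being read, one counter must go down (matching the a's) at the same time as another goes up (to be matched later by the c's).

```python
def accepts_anbncn(s: str) -> bool:
    """Return True if s has the form a^n b^n c^n for some n >= 1.

    Hypothetical illustration: two integer counters stand in for the
    two "principal dimensions" mentioned in the abstract.
    """
    up = 0       # number of a's not yet matched by a b
    down = 0     # number of b's not yet matched by a c
    phase = "a"  # which block of letters is currently being read

    for ch in s:
        if ch == "a" and phase == "a":
            up += 1
        elif ch == "b" and phase in ("a", "b") and up > 0:
            phase = "b"
            up -= 1      # count down the a's ...
            down += 1    # ... while simultaneously counting up the b's
        elif ch == "c" and phase in ("b", "c") and up == 0 and down > 0:
            phase = "c"
            down -= 1    # count down the b's
        else:
            return False  # wrong letter order or a count mismatch

    return phase == "c" and up == 0 and down == 0
```

For a^n b^n alone a single counter suffices; the second counter is what the context-sensitive language adds.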
Curriculum Learning
Cited by 23 (3 self)
Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones. Here, we formalize such training strategies in the context of machine learning, and call them “curriculum learning”. In the context of recent research studying the difficulty of training in the presence of non-convex training criteria (for deep deterministic and stochastic neural networks), we explore curriculum learning in various set-ups. The experiments show that significant improvements in generalization can be achieved. We hypothesize that curriculum learning has both an effect on the speed of convergence of the training process to a minimum and, in the case of non-convex criteria, on the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).
Symbolically speaking: a connectionist model of sentence production
- Cognitive Science
, 2002
Cited by 20 (2 self)
The ability to combine words into novel sentences has been used to argue that humans have symbolic language production abilities. Critiques of connectionist models of language often center on the inability of these models to generalize symbolically (Fodor & Pylyshyn, 1988; Marcus, 1998). To address these issues, a connectionist model of sentence production was developed. The model had variables (role-concept bindings) that were inspired by spatial representations (Landau & Jackendoff, 1993). In order to take advantage of these variables, a novel dual-pathway architecture with event semantics is proposed and shown to be better at symbolic generalization than several variants. This architecture has one pathway for mapping message content to words and a separate pathway that enforces sequencing constraints. Analysis of the model’s hidden units demonstrated that the model learned different types of information in each pathway, and that the model’s compositional behavior arose from the combination of these two pathways. The model’s ability to balance symbolic and statistical behavior in syntax acquisition and to model aphasic double dissociations provided independent support for the dual-pathway architecture.
From Baby Steps to Leapfrog: How “Less is More” in unsupervised dependency parsing
- In NAACL-HLT
Cited by 19 (5 self)
We present three approaches for unsupervised grammar induction that are sensitive to data complexity and apply them to Klein and Manning’s Dependency Model with Valence. The first, Baby Steps, bootstraps itself via iterated learning of increasingly longer sentences and requires no initialization. This method substantially exceeds Klein and Manning’s published scores and achieves 39.4% accuracy on Section 23 (all sentences) of the Wall Street Journal corpus. The second, Less is More, uses a low-complexity subset of the available data: sentences up to length 15. Focusing on fewer but simpler examples trades off quantity against ambiguity; it attains 44.1% accuracy, using the standard linguistically informed prior and batch training, beating state-of-the-art. Leapfrog, our third heuristic, combines Less is More with Baby Steps by mixing their models of shorter sentences, then rapidly ramping up exposure to the full training set, driving up accuracy to 45.0%. These trends generalize to the Brown corpus; awareness of data complexity may improve other parsing models and unsupervised algorithms.
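The Baby Steps idea described in the abstract above, iterated learning on increasingly longer sentences with each stage starting from the previous stage's model, can be sketched as a simple training schedule. This is a minimal sketch under stated assumptions, not the authors' implementation: the function baby_steps, the train callback, and its (subset, previous_model) signature are hypothetical names introduced here, and the actual learner (the Dependency Model with Valence) is left abstract.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Model = TypeVar("Model")
Sentence = Sequence[str]


def baby_steps(
    corpus: List[Sentence],
    train: Callable[[List[Sentence], Optional[Model]], Model],
    max_len: int,
) -> Optional[Model]:
    """Train on sentences of length <= 1, then <= 2, ... up to max_len.

    `train` is a hypothetical callback: it receives the current subset
    and the model from the previous stage (None at the first stage)
    and returns an updated model.
    """
    model: Optional[Model] = None
    for cutoff in range(1, max_len + 1):
        # Each stage sees every sentence no longer than the cutoff.
        subset = [s for s in corpus if len(s) <= cutoff]
        if not subset:
            continue  # nothing short enough yet; move to the next cutoff
        # Warm-start from the previous stage instead of re-initializing.
        model = train(subset, model)
    return model
```

Under the same assumptions, the Less is More variant from the abstract would correspond to a single call to train on the subset with length at most 15, without the staged warm starts.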
How phonological structures can be culturally selected for learnability
- Adaptive Behavior
, 2005
Connectionist models of development
, 2003
Cited by 9 (3 self)
How have connectionist models informed the study of development? This paper considers three contributions from specific models. First, connectionist models have proven useful for exploring nonlinear dynamics and emergent properties, and their role in nonlinear developmental trajectories, critical periods and developmental disorders. Second, connectionist models have informed the study of the representations that lead to behavioral dissociations. Third, connectionist models have provided insight into neural mechanisms, and why different brain regions are specialized for different functions. Connectionist and dynamic systems approaches to development have differed, with connectionist approaches focused on learning processes and representations in cognitive tasks, and dynamic systems approaches focused on mathematical characterizations of physical elements of the system and their interactions with the environment. The two approaches also share much in common, such as their emphasis on continuous, nonlinear processes and their broad application to a range of behaviors.

