Learning words from sights and sounds: a computational model
, 2002
Cited by 182 (29 self)
This paper presents an implemented computational model of word acquisition which learns directly from raw multimodal sensory input. Set in an information theoretic framework, the model acquires a lexicon by finding and statistically modeling consistent cross-modal structure. The model has been implemented in a system using novel speech processing, computer vision, and machine learning algorithms. In evaluations the model successfully performed speech segmentation, word discovery and visual categorization from spontaneous infant-directed speech paired with video images of single objects. These results demonstrate the possibility of using state-of-the-art techniques from sensory pattern recognition and machine learning to implement cognitive models which can process raw sensor data without the need for human transcription or labeling.
Semiotic Schemas: A Framework for Grounding Language in Action and Perception
, 2005
Cited by 58 (10 self)
A theoretical framework for grounding language is introduced that provides a computational path from sensing and motor action to words and speech acts. The approach combines concepts from semiotics and schema theory to develop a holistic approach to linguistic meaning. Schemas serve as structured beliefs that are grounded in an agent’s physical environment through a causal-predictive cycle of action and perception. Words and basic speech acts are interpreted in terms of grounded schemas. The framework reflects lessons learned from implementations of several language processing robots. It provides a basis for the analysis and design of situated, multimodal communication systems that straddle symbolic and non-symbolic realms.
The agent-based approach: A new direction for computational models of development
- Developmental Review
, 2001
Cited by 36 (7 self)
The agent-based approach emphasizes the importance of learning through organism-environment interaction. This approach is part of a recent trend in computational models of learning and development toward studying autonomous organisms that are embedded in virtual or real environments. In this paper we introduce the concepts of online and offline sampling and highlight the role of online sampling in agent-based models. After comparing the strengths of each approach for modeling particular developmental phenomena and research questions, we describe a recent agent-based model of infant causal perception. We conclude by discussing some of the present limitations of agent-based models and suggesting how these challenges may be addressed. © 2001 Academic Press
Computational models of learning and development are playing an increasingly critical role in child development research (Cassidy, 1990;
Using Speakers’ Referential Intentions to Model Early Cross-Situational Word Learning
- PSYCHOLOGICAL SCIENCE
, 2009
Cited by 17 (2 self)
Word learning is a “chicken and egg” problem. If a child could understand speakers’ utterances, it would be easy to learn the meanings of individual words, and once a child knows what many words mean, it is easy to infer speakers’ intended meanings. To the beginning learner, however, both individual word meanings and speakers’ intentions are unknown. We describe a computational model of word learning that solves these two inference problems in parallel, rather than relying exclusively on either the inferred meanings of utterances or cross-situational word-meaning associations. We tested our model using annotated corpus data and found that it inferred pairings between words and object concepts with higher precision than comparison models. Moreover, as the result of making probabilistic inferences about speakers’ intentions, our model explains a variety of behavioral phenomena described in the word-learning literature. These phenomena include mutual exclusivity, one-trial learning, cross-situational learning, the role of words in object individuation, and the use of inferred intentions to disambiguate reference.
The emergence of links between lexical acquisition and object categorization: A computational study
- Connection Science
, 2005
Cited by 12 (0 self)
Language is about symbols, and those symbols must be grounded in the physical world. Children learn to associate language with sensorimotor experiences during their development. In light of this, we first provide a computational account of how words are mapped to their perceptually grounded meanings. Moreover, the main part of this work proposes and implements a computational model of how word learning influences the formation of object categories to which those words refer. This model simulates the bi-directional relationship between word and object category learning: (1) object categorization provides mental representations of meanings that are mapped to words to form lexical items; (2) linguistic labels help object categorization by providing additional teaching signals; and (3) these two learning processes interplay with each other and form a developmental feedback loop. Compared with the method that performs these two tasks separately, our model shows promising improvements in both word-to-world mapping and perceptual categorization, suggesting a unified view of lexical and category learning in an integrative framework. Most importantly, this work provides a cognitively plausible explanation of the mechanistic nature of early word learning and object learning from co-occurring multisensory data.
Learning Nouns and Adjectives: A Connectionist Account
- Language and Cognitive Processes
, 1998
Cited by 9 (0 self)
Why do children learn nouns such as cup faster than dimensional adjectives such as big? Most explanations of this phenomenon rely on prior knowledge of the noun-adjective distinction or on the logical priority of nouns as the arguments of predicates. In this paper we examine an alternative account, one which relies instead on properties of the semantic categories to be learned and of the word learning task itself. We isolate four such properties: the relative size, the relative compactness, and the degree of overlap of the regions in representational space associated with the categories and the presence or absence of lexical dimensions (what color?) in the linguistic context of a word. In a set of five experiments, we trained a simple connectionist network to label input objects in particular linguistic contexts. The network learned categories resembling nouns with respect to the four properties faster than it learned categories resembling adjectives. Young children learn nouns more ...
The Emergence of Words
, 2001
Cited by 8 (0 self)
Children change in their word-learning abilities sometime during the second year of life. The nature of this behavioral change has been taken to suggest an underlying change in mechanism, from associative learning to a more purely symbolic form of learning. We present a simple associative computational model that accounts for these developmental shifts without any underlying change in mechanism. Thus, there may be no need to posit a qualitative mechanistic change in the word-learning of young children. More generally, words, as symbols, may emerge from associative beginnings.
Overview
Word-learning is likely to rely heavily on associative learning, such that the child comes to associate the sound "dog" with dogs, the sound "cat" with cats, and so on. However, children's word-learning abilities change significantly during the second year of life, and some have proposed that this behavioral change reflects an underlying mechanistic shift away from a purely associative base. I...
Recognising Embedded Words in Connected Speech: Context and Competition
- IN J. BULLINARIA, D. GLASSPOOL, & G. HOUGHTON (EDS), PROCEEDINGS OF THE FOURTH NEURAL COMPUTATION IN PSYCHOLOGY WORKSHOP
, 1997
Cited by 5 (4 self)
Onset-embedded words (e.g. cap in captain) present a problem for accounts of spoken word recognition since information coming after the offset of the embedded word may be required for identification. We demonstrate that training a simple recurrent network to activate a representation of all the words in a sequence allows the network to learn to recognise onset-embedded words without requiring a training set that is already lexically segmented. We discuss the relationship between our model and other accounts of lexical segmentation and word recognition, and compare the model's performance to psycholinguistic data on the recognition of onset-embedded words.
Spatial prepositions and vague quantifiers: Implementing the functional geometric framework
- IV. Reasoning, Action and Interaction (Lecture notes in Computer Science
, 2005
Cited by 5 (2 self)
There is much empirical evidence showing that factors other than the relative positions of objects in Euclidean space are important in the comprehension of a wide range of spatial prepositions in English and other languages. We first overview the functional geometric framework (Coventry & Garrod, 2004), which puts “what” and “where” information together to underpin the situation-specific meaning of spatial terms. We then outline an implementation of this framework. The computational model for the processing of visual scenes and the identification of the appropriate spatial preposition consists of three main modules: (1) Vision Processing, (2) Elman Network, (3) Dual-Route Network. Mirroring data from experiments with human participants, we show that the model is able both to predict what will happen to objects in a scene and to use these judgements to influence the appropriateness of over/under/above/below to describe where objects are located in the scene. Extensions of the model to other prepositions and quantifiers are discussed.
Sensorimotor cognition and natural language syntax
, 2010
Cited by 5 (3 self)
This book is about the interface between natural language and the sensorimotor system. It is obvious that there is an interface between language and sensorimotor cognition, because we can talk about what we see and do. The main proposal in the book is that the interface is more direct than is commonly assumed. To argue for this proposal I focus on a simple concrete episode, a man grabbing a cup, which can be reported in a simple transitive sentence (e.g. the English sentence The man grabbed a cup). In the first part of the book I present a detailed model of the sensorimotor processes involved in experiencing this episode, both as the agent bringing it about and as an observer watching it happen. The model draws on a large body of research in neuroscience and psychology. I also present a model of the syntactic structure of the associated transitive sentence, developed within the entirely separate discipline of theoretical linguistics. This latter model is a version of Chomsky’s ‘Minimalist’ syntactic theory, which assumes that a sentence reporting the episode has the same underlying syntactic structure (called ‘logical form’) regardless of which language it is in. My main proposal is that these two independently motivated models are in fact closely

