Learning words from sights and sounds: a computational model
2002
Cited by 182 (29 self)
This paper presents an implemented computational model of word acquisition which learns directly from raw multimodal sensory input. Set in an information theoretic framework, the model acquires a lexicon by finding and statistically modeling consistent cross-modal structure. The model has been implemented in a system using novel speech processing, computer vision, and machine learning algorithms. In evaluations the model successfully performed speech segmentation, word discovery and visual categorization from spontaneous infant-directed speech paired with video images of single objects. These results demonstrate the possibility of using state-of-the-art techniques from sensory pattern recognition and machine learning to implement cognitive models which can process raw sensor data without the need for human transcription or labeling.
An efficient, probabilistically sound algorithm for segmentation and word discovery
Machine Learning, 1999
Cited by 103 (2 self)
This paper presents a model-based, unsupervised algorithm for recovering word boundaries in a natural-language text from which they have been deleted. The algorithm is derived from a probability model of the source that generated the text. The fundamental structure of the model is specified abstractly so that the detailed component models of phonology, word-order, and word frequency can be replaced in a modular fashion. The model yields a language-independent, prior probability distribution on all possible sequences of all possible words over a given alphabet, based on the assumption that the input was generated by concatenating words from a fixed but unknown lexicon. The model is unusual in that it treats the generation of a complete corpus, regardless of length, as a single event in the probability space. Accordingly, the algorithm does not estimate a probability distribution on words; instead, it attempts to calculate the prior probabilities of various word sequences that could underlie the observed text. Experiments on phonemic transcripts of spontaneous speech by parents to young children suggest that our algorithm is more effective than other proposed algorithms, at least when utterance boundaries are given and the text includes a substantial number of short utterances.
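As a rough illustration of the boundary-recovery task described in this abstract, the sketch below finds the most probable segmentation of an unsegmented string by dynamic programming. It is not the paper's algorithm: it assumes the lexicon and its word probabilities are already known, whereas the paper treats the lexicon as unknown. The function name `segment`, the `max_len` parameter, and the toy lexicon are all hypothetical.

```python
import math

def segment(text, word_probs, max_len=10):
    """Return the most probable split of `text` into words from `word_probs`.

    Assumes words are generated independently (a unigram model) and that
    `word_probs` maps each known word to its probability.
    """
    n = len(text)
    # best[i] = (log-probability of best split of text[:i], start of its last word)
    best = [(-math.inf, -1)] * (n + 1)
    best[0] = (0.0, -1)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_probs and best[start][0] > -math.inf:
                score = best[start][0] + math.log(word_probs[word])
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[n][0] == -math.inf:
        return None  # the lexicon cannot cover the whole string
    # Walk the back-pointers from the end of the string to recover the words.
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]

# Toy lexicon with made-up probabilities, for illustration only.
lexicon = {"do": 0.2, "you": 0.2, "see": 0.15, "the": 0.25,
           "doggy": 0.1, "dog": 0.05, "gy": 0.05}
print(segment("doyouseethedoggy", lexicon))
```

With this toy lexicon the single word "doggy" outscores the split "dog" + "gy", because each extra word multiplies in another probability below one.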
Distributional Regularity and Phonotactic Constraints are Useful for Segmentation
Cognition, 1996
Cited by 81 (1 self)
In order to acquire a lexicon, young children must segment speech into words, even though most words are unfamiliar to them. This is a non-trivial task because speech lacks any acoustic analog of the blank spaces between printed words. Two sources of information that might be useful for this task are distributional regularity and phonotactic constraints. Informally, distributional regularity refers to the intuition that sound sequences that occur frequently and in a variety of contexts are better candidates for the lexicon than those that occur rarely or in few contexts. We express that intuition formally by a class of functions called DR functions. We then put forth three hypotheses: First, that children segment using DR functions. Second, that they exploit phonotactic constraints on the possible pronunciations of words in their language. Specifically, they exploit both the requirement that every word must have a vowel and the constraints that languages impose on word-ini...
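The vowel requirement mentioned in this abstract can be pictured as a filter over candidate segmentations. The sketch below is only an illustration under simplifying assumptions: it works on spelling rather than phonemic transcripts, treats the letters a, e, i, o, u as the vowels, and enumerates every split exhaustively, which is practical only for short strings. The names `segmentations` and `satisfies_vowel_constraint` are hypothetical, and no DR function is modeled here.

```python
from itertools import combinations

VOWELS = set("aeiou")  # assumption: orthographic stand-in for phonemic vowels

def segmentations(s):
    """Yield every way of splitting s into contiguous, non-empty pieces."""
    n = len(s)
    for k in range(n):
        # Choose k cut points among the n - 1 positions between characters.
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [s[a:b] for a, b in zip(bounds, bounds[1:])]

def satisfies_vowel_constraint(words):
    """True if every candidate word contains at least one vowel."""
    return all(any(ch in VOWELS for ch in w) for w in words)

candidates = list(segmentations("thedog"))
allowed = [seg for seg in candidates if satisfies_vowel_constraint(seg)]
print(len(candidates), len(allowed))
print(allowed)
```

For the six-letter string "thedog" there are 32 possible segmentations, and the vowel requirement alone leaves three of them, which shows how strongly even a single phonotactic constraint can prune the search space.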
Computation of conditional probability statistics by 8-month-old infants
Psychological Science, 1998
Cited by 62 (14 self)
A recent report demonstrated that 8-month-olds can segment a continuous stream of speech syllables, containing no acoustic or prosodic cues to word boundaries, into wordlike units after only 2 min of listening experience (Saffran, Aslin, & Newport, 1996). Thus, a powerful learning mechanism capable of extracting statistical information from fluent speech is available early in development. The present study extends these results by documenting the particular type of statistical computation—transitional (conditional) probability—used by infants to solve this word-segmentation task. An artificial language corpus, consisting of a continuous stream of trisyllabic nonsense words, was presented to 8-month-olds for 3 min. A postfamiliarization test compared the infants’ responses to words versus part-words (trisyllabic sequences spanning word boundaries). The corpus was constructed so that test words and part-words were matched in frequency, but differed in their transitional probabilities. Infants showed reliable
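The transitional (conditional) probability referred to in this abstract is the probability of syllable y given the preceding syllable x, estimated as count(xy) / count(x). A minimal sketch follows, assuming a toy stream built from three made-up trisyllabic nonsense words in a fixed order. These are not the study's stimuli, and unlike the study's corpus the toy stream does not match words and part-words in frequency.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next = y | current = x) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last one).
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

# Hypothetical nonsense words, concatenated with no pauses between them.
A = ["pa", "bi", "ku"]
B = ["ti", "bu", "do"]
C = ["go", "la", "tu"]
stream = [syl for word in [A, B, C, A, C, B] * 2 for syl in word]

tp = transitional_probabilities(stream)
print(tp[("pa", "bi")])  # within-word transition
print(tp[("ku", "ti")])  # transition across a word boundary
```

In this stream every within-word transition has probability 1.0, while a transition across a word boundary such as "ku" to "ti" has probability 0.5, so dips in transitional probability line up with the word boundaries.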
Learning to Segment Speech Using Multiple Cues: A Connectionist Model
Language and Cognitive Processes, 1998
The Role of Exposure to Isolated Words in Early Vocabulary Development
Cognition, 2001
Cited by 42 (0 self)
Fluent speech contains no known acoustic analog of the blank spaces between printed words. Early research presumed that word learning is driven primarily by exposure to isolated words. In the last decade there has been a shift to the view that exposure to isolated words is unreliable and plays little if any role in early word learning. This study revisits the role of isolated words. The results show (a) that isolated words are a reliable feature of speech to infants, (b) that they include a variety of word types, many of which are repeated in close temporal proximity, (c) that about three fourths of the words infants produce are words that mothers speak in isolation, and (d) that the frequency with which a child hears a word in isolation predicts whether that word will be learned better than the child's total frequency of exposure to that word. Thus, exposure to isolated words may significantly facilitate vocabulary development at its earliest stages.
A Probabilistic Constraints Approach To Language Acquisition And Processing
1989
Cited by 37 (3 self)
This article provides an overview of a probabilistic constraints framework for thinking about language acquisition and processing. The generative approach attempts to characterize knowledge of language (i.e., competence grammar) and then asks how this knowledge is acquired and used. Our approach is performance oriented: the goal is to explain how people comprehend and produce utterances and how children acquire this skill. Use of language involves exploiting multiple probabilistic constraints over various types of linguistic and nonlinguistic information. Acquisition is the process of accumulating this information, which begins in infancy. The constraint satisfaction processes that are central to language use are the same as the bootstrapping processes that provide entry to language for the child. Framing questions about acquisition in terms of models of adult performance unifies the two topics under a set of common principles and has important consequences for arguments concerning lan...
From First Contact to Close Encounters: A Developmentally Deep Perceptual System for a Humanoid Robot
2003
Cited by 35 (6 self)
This thesis presents a perceptual system for a humanoid robot that integrates abilities such as object localization and recognition with the deeper developmental machinery required to forge those competences out of raw physical experiences. It shows that a robotic platform can build up and maintain a system for object localization, segmentation, and recognition, starting from very little. What the robot starts with is a direct solution to achieving figure/ground separation: it simply `pokes around' in a region of visual ambiguity and watches what happens. If the arm passes through an area, that area is recognized as free space. If the arm collides with an object, causing it to move, the robot can use that motion to segment the object from the background. Once the robot can acquire reliable segmented views of objects, it learns from them, and from then on recognizes and segments those objects without further contact. Both low-level and high-level visual features can also be learned in this way, and examples are presented for both: orientation detection and affordance recognition, respectively.
The Role of Embodied Intention in Early Lexical Acquisition
In Proceedings of the Twenty-Fifth Cognitive Science Society Annual Meeting, 2003
Cited by 27 (12 self)
We examine the influence of inferring interlocutors' referential intentions from their body movements at the early stage of lexical acquisition. By testing human subjects and comparing their performances in different learning conditions, we find that those embodied intentions facilitate both word discovery and word-meaning association.
A Bayesian framework for word segmentation: Exploring the effects of context
In 46th Annual Meeting of the ACL, 2009
Cited by 26 (7 self)
Since the experiments of Saffran et al. (1996a), there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words – in particular, how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g. what’s that, do you, in the house) misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.

