A Bayesian framework for word segmentation: Exploring the effects of context
- In 46th Annual Meeting of the ACL
, 2009
Cited by 26 (7 self)
Since the experiments of Saffran et al. (1996a), there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words – in particular, how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g., what's that, do you, in the house) misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.
When cues collide: Use of stress and statistical cues to word boundaries by 7- to 9-month-old infants
- Developmental Psychology
, 2003
Cited by 19 (1 self)
Prior research suggests that stress cues are particularly important for English-hearing infants' detection of word boundaries. It is unclear, though, how infants learn to attend to stress as a cue to word segmentation. This series of experiments was designed to explore infants' attention to conflicting cues at different ages. Experiment 1 replicated previous findings: When stress and statistical cues indicated different word boundaries, 9-month-old infants used syllable stress as a cue to segmentation while ignoring statistical cues. However, in Experiment 2, 7-month-old infants attended more to statistical cues than to stress cues. These results raise the possibility that infants use their statistical learning abilities to locate words in speech and use those words to discover the regular pattern of stress cues in English. Infants at different ages may deploy different segmentation strategies as a function of their current linguistic experience. To achieve mastery of their native language, infants must identify and learn words. Identifying words in an unfamiliar language is no simple task. Unlike the white spaces that mark the boundaries between words in a written text, speakers do not consistently place silent pauses between words when speaking (e.g., Cole & Jakimik,
Reading acquisition, developmental dyslexia and skilled reading across languages: A psycholinguistic grain size theory
- Psychological Bulletin
, 2005
Cited by 9 (0 self)
The development of reading depends on phonological awareness across all languages so far studied. Languages vary in the consistency with which phonology is represented in orthography. This results in developmental differences in the grain size of lexical representations and accompanying differences in developmental reading strategies and the manifestation of dyslexia across orthographies. Differences in lexical representations and reading across languages leave developmental “footprints” in the adult lexicon. The lexical organization and processing strategies that are characteristic of skilled reading in different orthographies are affected by different developmental constraints in different writing systems. The authors develop a novel theoretical framework to explain these cross-language data, which they label a psycholinguistic grain size theory of reading and its development. Reading is the process of understanding speech written down. The goal is to gain access to meaning. To acquire reading, children must learn the code used by their culture for representing speech as a series of visual symbols. Learning to read is thus fundamentally a process of matching distinctive visual symbols to units of sound (phonology). In most languages, the relationship between symbol
Modeling Human Performance in Statistical Word Segmentation
Cited by 9 (4 self)
What mechanisms support the ability of human infants, adults, and other primates to identify words from fluent speech using distributional regularities? In order to better characterize this ability, we collected data from adults in an artificial language segmentation task similar to Saffran, Newport, and Aslin (1996) in which the length of sentences was systematically varied between groups of participants. We then compared the fit of a variety of computational models, including simple statistical models of transitional probability and mutual information, a clustering model based on mutual information by Swingley (2005), PARSER (Perruchet & Vinter, 1998), and a Bayesian model. We found that while all models were able to successfully complete the task, fit to the human data varied considerably, with the Bayesian model achieving the highest correlation with our results.
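The simplest family of models named in this abstract, those based on transitional probability, can be illustrated with a short sketch. The Python below is not the authors' implementation: the function names, the toy three-word syllable stream, and the rule of placing a boundary at every local minimum of the transitional probability are all assumptions made for illustration.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TP(b | a) = count(a followed by b) / count(a followed by anything)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable has no successor, so it is excluded from the denominator.
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment_at_tp_dips(syllables, tps):
    """Insert a word boundary wherever the TP is strictly lower than both neighbours."""
    if not syllables:
        return []
    scores = [tps[(a, b)] for a, b in zip(syllables, syllables[1:])]
    words, current = [], [syllables[0]]
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("inf")
        if score < left and score < right:
            # Local dip in predictability: close the current word here.
            words.append("".join(current))
            current = []
        current.append(syllables[i + 1])
    words.append("".join(current))
    return words


# Hypothetical toy language: three trisyllabic words in a fixed order.
lexicon = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["bi", "da", "ku"]}
stream = [syl for word in "ABCACBA" for syl in lexicon[word]]
segmented = segment_at_tp_dips(stream, transitional_probabilities(stream))
```

In this toy stream every within-word transition has probability 1.0 and every between-word transition has probability 0.5, so the dips fall exactly at the word edges. Real child-directed speech is far noisier, which is one reason the abstract compares several model families against human data.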
Unsupervised Lexical Learning as Inductive Inference
, 2000
Cited by 8 (4 self)
To learn a language, the learners must first learn its words, the essential building blocks for utterances. The difficulty in learning words lies in the unavailability of explicit word boundaries in speech input. The learners have to infer lexical items with some innately endowed learning mechanism(s) for regularity detection: regularities in the speech normally indicate word patterns. With respect to Zipf's least-effort principle and Chomsky's thoughts on the minimality of grammar for human language, we hypothesise a cognitive mechanism underlying language learning that seeks the least-effort representation for input data. Accordingly, lexical learning is to infer the minimal-cost representation for the input under the constraint of permissible representation for lexical items. The main theme of this thesis is to examine how far this learning mechanism can go in unsupervised lexical learning from real language data without any pre-defined (e.g., prosodic and phonotactic) cues, resting entirely on statistical induction of structural patterns for the most economical representation of the data. We first review
A computational model of language acquisition: focus on word discovery
- In Interspeech 2008
, 2008
Cited by 4 (2 self)
Young infants learn words by detecting patterns in the speech signal and by associating these patterns to stimuli presented by non-speech modalities (e.g., vision). In this paper, we model this behaviour by designing and testing a computational model of word discovery. The model is able to build word-like representations on the basis of multimodal input data. The discovery of words (and word-like entities) takes place within a communicative loop between two protagonists, a 'carer' and the 'learner'. Experiments carried out on three different European languages (Finnish, Swedish, and Dutch) show that a robust word representation can be learned using about 50 acoustic tokens (examples) of that word. The model is inspired by the memory structure that is assumed functional for human speech processing. Index Terms: language acquisition, unsupervised word detection, computational modelling
Statistics Learning and Universal Grammar: Modeling Word Segmentation
Cited by 4 (0 self)
This paper describes a computational model of word segmentation and presents simulation results on realistic acquisition data. In particular, we explore the capacity and limitations of statistical learning mechanisms that have recently gained prominence in cognitive psychology and linguistics.
How to handle pronunciation variation in ASR: by storing episodes in memory
- In: Proceedings of the workshop on Speech Recognition and Intrinsic Variation
, 2006
Cited by 3 (0 self)
Almost all current automatic speech recognition (ASR) systems use a similar paradigm [3, 51, 52], which will be referred to here briefly as the ‘invariant approach’. Despite intensive research, ASR performance is still at least an order of magnitude lower than that of human speech recognition (HSR). The difficulties encountered in improving ASR performance, in combination with the awareness that current ASR systems have some shortcomings, have led many to believe that a new paradigm for ASR is needed. In this paper a novel paradigm for ASR is presented. The invariant approach has also dominated (psycho-)linguistics. However, recent findings that indexical and detailed (sub-phonemic) information influence lexical access have started a debate in (psycho-)linguistics on how these findings could be incorporated in HSR theories and models. On the basis of these findings episodic theories have been proposed. Although the episodic speech recognition (ESR) model is mainly inspired by HSR research, it is also very interesting and promising for ASR, since it has the potential to resolve some shortcomings of the mainstream ASR approach.
Connectionist Modelling of Lexical Segmentation and Vocabulary Acquisition
Cited by 2 (1 self)
tening to an unfamiliar language we no longer experience sequences of discrete words, but rather hear a continuous stream of speech with boundaries separating individual sentences or utterances. Examination of the physical form of speech confirms the impression given by listening to foreign languages. Speech does not contain gaps or other unambiguous markers of word boundaries; there is no auditory analog of the spaces between words in printed text (Lehiste, 1960). Thus the perceptual experience of native speakers reflects language-specific knowledge of ways in which to divide speech into words. An important set of questions, therefore, concerns the sources of information that are used for segmentation and how infants learn to segment the speech stream in order to learn their first words. The continuous nature of speech might not be a problem for infants learning language if they were 'spoon-fed' with sin
Audio Speech Segmentation Without Language-Specific Knowledge
- Proceedings of the 28th Annual Meeting of the Cognitive Science Society
, 2006
Cited by 2 (0 self)
Speech segmentation is the problem of finding word boundaries in spoken language when the underlying vocabulary is still unknown. Here we show that a system with no phonemic knowledge can find word boundaries. The system first subdivides an utterance by recursively clustering similar parts of the signal together until the cepstral coefficient variance is low within each new segment. These segments are then used as inputs to a perceptron-like algorithm that finds repeated segments across utterances. With only a few sample utterances, and no previous linguistic knowledge, the system can find the words that were repeated across utterances and identify new utterances that contain those words. The findings show that the assumption of a phoneme classification module is not necessary for a “minimum description length” (Brent & Cartwright, 1996; de Marcken, 1996) explanation of word segmentation.
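The first stage described in this abstract, subdividing an utterance until the variance within each segment is low, can be sketched roughly as follows. This is an illustrative assumption and not the paper's procedure: it uses a top-down binary split where the abstract speaks of clustering, a single scalar per frame as a stand-in for a cepstral coefficient vector, and a hypothetical `max_variance` threshold and function name.

```python
from statistics import pvariance


def split_segments(frames, start=0, end=None, max_variance=0.5):
    """Return (start, end) index pairs whose frames have variance <= max_variance.

    `frames` is a list of floats; each float is an assumed scalar stand-in
    for one frame's cepstral coefficients.
    """
    if end is None:
        end = len(frames)
    segment = frames[start:end]
    if len(segment) < 2 or pvariance(segment) <= max_variance:
        return [(start, end)]

    def cost(cut):
        # Total squared deviation inside the two halves produced by this cut.
        left, right = frames[start:cut], frames[cut:end]
        return pvariance(left) * len(left) + pvariance(right) * len(right)

    # Pick the cut that leaves the two halves as homogeneous as possible,
    # then recurse on each half until every segment is below the threshold.
    best_cut = min(range(start + 1, end), key=cost)
    return (split_segments(frames, start, best_cut, max_variance)
            + split_segments(frames, best_cut, end, max_variance))


# Hypothetical feature track with three acoustically distinct stretches.
frames = [1.0, 1.1, 0.9, 5.0, 5.2, 4.9, 1.0, 1.2]
segments = split_segments(frames)
```

The resulting index pairs would then feed the second stage, the perceptron-like search for segments repeated across utterances, which is not sketched here because the abstract gives no detail on it.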

