Word Segmentation as General Chunking
Abstract
During language acquisition, children learn to segment speech into phonemes, syllables, morphemes, and words. We examine word segmentation specifically, and explore the possibility that children might have general-purpose chunking mechanisms to perform word segmentation. The Voting Experts (VE) and Bootstrapped Voting Experts (BVE) algorithms serve as computational models of this chunking ability. VE finds chunks by searching for a particular information-theoretic signature: low internal entropy and high boundary entropy. BVE adds to VE the ability to incorporate information about word boundaries previously found by the algorithm into future segmentations. We evaluate the general chunking model on phonemically encoded corpora of child-directed speech, and show that it is consistent with empirical results in the developmental literature. We argue that it offers a parsimonious alternative to special-purpose linguistic models.
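The information-theoretic signature mentioned in the abstract can be illustrated with a minimal sketch. It is not the VE algorithm itself: it only computes the two quantities on character n-grams and leaves out any voting or thresholding. The function names, the toy stream, and the choice to approximate internal entropy by the surprisal of an n-gram are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def internal_entropy(text, n):
    """Surprisal in bits, -log2 P(chunk), for every n-gram in text.

    Assumption: "internal entropy" is approximated by how unexpected the
    n-gram is among all n-grams of the same length (frequent = low).
    """
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: math.log2(total / c) for g, c in grams.items()}


def boundary_entropy(text, n):
    """Entropy in bits of the symbol that follows each n-gram in text."""
    followers = defaultdict(Counter)
    for i in range(len(text) - n):
        followers[text[i:i + n]][text[i + n]] += 1
    entropy = {}
    for gram, counts in followers.items():
        total = sum(counts.values())
        # sum of p * log2(1 / p) over the observed next symbols
        entropy[gram] = sum((c / total) * math.log2(total / c)
                            for c in counts.values())
    return entropy


# Hypothetical unsegmented stream built from "the cat the dog the pig".
stream = "thecatthedogthepig"
inside = boundary_entropy(stream, 2)["th"]   # "th" is always followed by "e"
edge = boundary_entropy(stream, 3)["the"]    # "the" is followed by c, d, or p
```

In this toy stream the word-internal position after "th" is fully predictable (0 bits), while the position after the complete chunk "the" is uncertain (log2(3) bits), and "the" also has lower surprisal than a boundary-straddling n-gram such as "hec". That is the pattern the abstract describes as marking a chunk.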
Zipfian word frequencies support statistical word segmentation
Abstract
Word frequencies in natural language follow a Zipfian distribution. Artificial language experiments that are meant to simulate language acquisition generally use uniform word frequency distributions, however. In the present study we examine whether a Zipfian frequency distribution influences adult learners' word segmentation performance. Using two experimental paradigms (a forced-choice task and an orthographic segmentation task), we show that human statistical learning abilities are robust enough to identify words from exposures with widely varying frequency distributions. Additionally, we report a facilitatory effect of Zipfian distributions on word segmentation performance in the orthographic segmentation task, both in segmenting trained material and in generalization to novel material. Zipfian distributions increase the chances for learners to apply their knowledge in processing a new speech stream.
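The contrast between Zipfian and uniform exposure can be sketched as follows. Under a Zipfian distribution the word at frequency rank r has probability proportional to 1/r^s. The exponent s = 1, the four-word lexicon, and the function names are assumptions for illustration; the abstract does not state the parameters or stimuli used in the study.

```python
import random


def zipf_probabilities(n_words, exponent=1.0):
    """Normalized probabilities with p(rank) proportional to 1 / rank**exponent."""
    weights = [1.0 / rank ** exponent for rank in range(1, n_words + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_tokens(lexicon, probabilities, n_tokens, seed=0):
    """Draw n_tokens words from lexicon according to probabilities."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.choices(lexicon, weights=probabilities, k=n_tokens)


# Hypothetical artificial-language lexicon, ordered by frequency rank.
lexicon = ["mipola", "zabetu", "kodira", "nusefi"]

zipfian = zipf_probabilities(len(lexicon))
uniform = [1.0 / len(lexicon)] * len(lexicon)

# Concatenating the sampled words gives an unsegmented exposure stream.
zipf_stream = "".join(sample_tokens(lexicon, zipfian, n_tokens=200))
flat_stream = "".join(sample_tokens(lexicon, uniform, n_tokens=200))
```

With exponent 1, the top-ranked word is twice as probable as the second and four times as probable as the fourth, so a few words recur often while the rest are rare; the uniform stream presents every word about equally often.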

