The Emergence of Linguistic Structure: An Overview of the Iterated Learning Model
- 2002
Cited by 64 (7 self)
Introduction. As language users, humans possess a culturally transmitted system of unparalleled complexity in the natural world. Linguistics has revealed over the past 40 years the degree to which the syntactic structure of language in particular is strikingly complex. Furthermore, as Pinker and Bloom point out in their agenda-setting paper Natural Language and Natural Selection, "grammar is a complex mechanism tailored to the transmission of propositional structures through a serial interface" (Pinker and Bloom, 1990, 707). These sorts of observations, along with influential arguments from linguistics and psychology about the innateness of language (see, e.g. Chomsky, 1986; Pinker, 1994), have led many authors to the conclusion that an explanation for the origin of syntax must invoke neo-Darwinian natural selection. "Evolutionary theory offers clear criteria for when a trait should be attributed to natural selection: complex design for some function, and the absence of alternative proc
The Role of Exposure to Isolated Words in Early Vocabulary Development
- Cognition, 2001
Cited by 42 (0 self)
Fluent speech contains no known acoustic analog of the blank spaces between printed words. Early research presumed that word learning is driven primarily by exposure to isolated words. In the last decade there has been a shift to the view that exposure to isolated words is unreliable and plays little if any role in early word learning. This study revisits the role of isolated words. The results show (a) that isolated words are a reliable feature of speech to infants, (b) that they include a variety of word types, many of which are repeated in close temporal proximity, (c) that about three fourths of the words infants produce are words that mothers speak in isolation, and (d) that the frequency with which a child hears a word in isolation predicts whether that word will be learned better than the child's total frequency of exposure to that word. Thus, exposure to isolated words may significantly facilitate vocabulary development at its earliest stages.
Choosing words in computer-generated weather forecasts
- Artificial Intelligence, 2005
Cited by 37 (15 self)
One of the main challenges in automatically generating textual weather forecasts is choosing appropriate English words to communicate numeric weather data. A corpus-based analysis of how humans write forecasts showed that there were major differences in how individual writers performed this task, that is, in how they translated data into words. These differences included both different preferences between potential near-synonyms that could be used to express information, and also differences in the meanings that individual writers associated with specific words. Because we thought these differences could confuse readers, we built our SumTime-Mousam weather-forecast generator to use consistent data-to-word rules, which avoided words which were only used by a few people, and words which were interpreted differently by different people. An evaluation by forecast users suggested that they preferred SumTime-Mousam’s texts to human-generated texts, in part because of better word choice; this may be the first time that an evaluation has shown that NLG texts are better than human-authored texts.
Key words: natural language processing, natural language generation, language and the word, information presentation, weather forecasts, lexical choice, idiolect
Preprint submitted to Elsevier Science, 2 June 2005
Learning Semantic Correspondences with Less Supervision
Cited by 25 (3 self)
A central problem in grounded language acquisition is learning the correspondences between a rich world state and a stream of text which references that world state. To deal with the high degree of ambiguity present in this setting, we present a generative model that simultaneously segments the text into utterances and maps each utterance to a meaning representation grounded in the world state. We show that our model generalizes across three domains of increasing difficulty: Robocup sportscasting, weather forecasts (a new domain), and NFL recaps.
Why It Is Hard to Label Our Concepts
- (TO APPEAR IN HALL & WAXMAN (EDS.), WEAVING A LEXICON. CAMBRIDGE, MA: MIT
, 2004
A Computational Theory of Vocabulary Acquisition
- Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language (Menlo Park, CA/Cambridge
, 1998
Cited by 22 (11 self)
As part of an interdisciplinary project to develop a computational cognitive model of a reader of narrative text, we are developing a computational theory of how natural-language-understanding systems can automatically acquire new vocabulary by determining from context the meaning of words that are unknown, misunderstood, or used in a new sense. "Context" includes surrounding text, grammatical information, and background knowledge, but no external sources. Our thesis is that the meaning of such a word can be determined from context, can be revised upon further encounters with the word, "converges" to a dictionary-like definition if enough context has been provided and there have been enough exposures to the word, and eventually "settles down" to a "steady state" that is always subject to revision upon further encounters with the word. The system is being implemented in the SNePS knowledge-representation and reasoning system. This essay is forthcoming as a chapter in Iwanska, Łucja, & S...
Formal grammar and information theory: Together again?
- Philosophical Transactions of the Royal Society, 2000
Cited by 22 (0 self)
In the last 40 years, research on models of spoken and written language has been split between two seemingly irreconcilable traditions: formal linguistics in the Chomsky tradition, and information theory in the Shannon tradition. Zellig Harris had advocated a close alliance between grammatical and information-theoretic principles in the analysis of natural language, and early formal-language theory provided another strong link between information theory and linguistics. Nevertheless, in most research on language and computation, grammatical and information-theoretic approaches had moved far apart. Today, after many years on the defensive, the information-theoretic approach has gained new strength and achieved practical successes in speech recognition, information retrieval, and, increasingly, in language analysis and machine translation. The exponential increase in the speed and storage capacity of computers is the proximate cause of these engineering successes, allowing the automatic estimation of the parameters of probabilistic models of language by counting occurrences of linguistic events in very large bodies of text and speech. However, I will argue that information-theoretic and computational ideas are also playing an increasing role in the scientific understanding of language, and will help bring together formal-linguistic and information-theoretic perspectives.
Automatic Construction of Semantic Lexicons for Learning Natural Language Interfaces
- 1999
Cited by 21 (2 self)
This paper describes a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of words paired with meaning representations. Wolfie is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by Wolfie are compared to those acquired by a similar system developed by Siskind (1996).
Content areas: Machine Learning and Discovery, Tasks or Problems, supervised learning; Natural Language Processing, Tasks or Problems, understanding
Introduction & Overview: The application of learning methods to natural-language processing (NLP) has drawn increasing attention in recent years. Using machine learning to help automate the ...
Acquiring word-meaning mappings for natural language interfaces
- Journal of Artificial Intelligence Research, 2003
Cited by 21 (7 self)
This paper focuses on a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. Wolfie is part of an integrated system that learns to parse sentences into representations such as logical database queries. Experimental results are presented demonstrating Wolfie’s ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by Wolfie is compared to that of lexicons acquired by a similar system developed by Siskind (1996), with results favorable to Wolfie. A second set of experiments demonstrates Wolfie’s ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods (Cohn, Atlas, & Ladner, 1994) attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.

