Acquiring word-meaning mappings for natural language interfaces
 Journal of Artificial Intelligence Research
, 2003
This paper focuses on a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. Wolfie is part of an integrated system that learns to parse representations such as logical database queries. Experimental results are presented demonstrating Wolfie’s ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by Wolfie is compared to that of lexicons acquired by a similar system developed by Siskind (1996), with results favorable to Wolfie. A second set of experiments demonstrates Wolfie’s ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods (Cohn, Atlas, & Ladner, 1994) attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.
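The selection step that active learning relies on can be illustrated with a minimal sketch. The abstract does not state Wolfie's actual selection criterion, so everything here is an assumption made for illustration: the function name select_for_annotation, the toy lexicon, and the use of lexicon coverage as a stand-in confidence score are all hypothetical.

```python
from typing import Callable, List, Sequence

def select_for_annotation(
    pool: Sequence[str],
    confidence: Callable[[str], float],
    batch_size: int,
) -> List[str]:
    """Pick the batch_size unannotated sentences the current model is least sure about."""
    # sorted() is stable, so ties keep their original pool order.
    ranked = sorted(pool, key=confidence)
    return list(ranked[:batch_size])

# Hypothetical toy lexicon: words the learner already has an entry for.
toy_lexicon = {"what", "is", "the", "capital", "of", "texas"}

def lexicon_coverage(sentence: str) -> float:
    """Assumed confidence proxy: fraction of words already in the lexicon."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    return sum(w in toy_lexicon for w in words) / len(words)

pool = [
    "what is the capital of texas",
    "which rivers traverse ohio",
    "what is the largest state",
]
# The sentence with the lowest coverage is sent to the annotator first.
chosen = select_for_annotation(pool, lexicon_coverage, batch_size=1)
```

In a full loop, the chosen sentences would be annotated, added to the training set, and the lexicon relearned before the next selection round.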
k-Valued Non-Associative Lambek Grammars Are Learnable From Function-Argument Structures
 ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE
, 2003
This paper is concerned with learning categorial grammars in the model of Gold. We show that rigid and k-valued non-associative Lambek grammars are learnable from function-argument structured sentences. In fact, function-argument structures are natural syntactical decompositions of sentences into subcomponents, with an indication of the head of each subcomponent. This result is
The Learning and Emergence of Mildly Context Sensitive Languages
This paper describes a framework for studies of the adaptive acquisition and evolution of language, with the following components: language learning begins by associating words with cognitively salient representations ("grounding"); the sentences of each language are determined by properties of lexical items, and so only these need to be transmitted by learning; the learnable languages allow multiple agreements, multiple crossing agreements, and reduplication, as mildly context sensitive and human languages do; infinitely many different languages are learnable; many of the learnable languages include infinitely many sentences; in each language, inferential processes can be defined over succinct representations of the derivations themselves; the languages can be extended by innovative responses to communicative demands. Preliminary analytic results and a robotic implementation are described.
Mathematics of language learning
, 2009
This paper surveys prominent mathematical approaches to language learning, with an emphasis on the common fundamental assumptions of various approaches. All approaches adopt some restrictive assumption about the nature of relevant causal influences, with much ongoing work directed to the problem of discovery and justification of these assumptions.
Learning Lambek grammars from proof frames
In addition to their limpid interface with semantics, the original categorial grammars introduced by Lambek 55 years ago enjoy another important property: learnability. After a short reminder on grammatical inference à la Gold, we provide an algorithm that learns rigid Lambek grammars with product from proof frames, which are name-free proof nets (a generalisation of functor-argument structures to those grammars); these grammars are already known to be unlearnable from strings, as shown by Foret and Le Nir. This result strictly encompasses our previous positive results on learning Lambek grammars without product. The result can be extended to k-valued versions of these grammars using k-unification although, as expected, algorithmic complexity becomes quite high. Our algorithm combines a proof net version of the principal type scheme algorithm of lambda calculus with the unification algorithm for syntactic categories, as first explored by Buszkowski and Penn. We thereafter provide a simple proof of the convergence of this algorithm, inspired by the one by Kanazawa. Proof frames may seem complex structures to learn from, but they look like dependency structures that can be found in annotated corpora, and, as we show at the end of the paper, when the product is not used, proof frames exactly correspond to natural deduction frames that extend the functor-argument structures that are commonly used for learning basic categorial grammars. We are sad to dedicate the present paper to Philippe Darondeau, with whom we started to study such questions in Rennes at the beginning of the millennium, and who passed away prematurely. We are glad to dedicate the present paper to Jim Lambek for his 90th birthday: he shows that research is an endless learning.
Learning Discrete Categorial Grammars from Structures
 Theoretical Informatics and Applications (Informatique Théorique et Applications)
We define the class of discrete classical categorial grammars, similar in spirit to the notion of reversible class of languages introduced by Angluin and Sakakibara. We show that the class of discrete classical categorial grammars is identifiable from positive structured examples. For this, we provide an original algorithm, which runs in quadratic time in the size of the examples. This work extends the previous results of Kanazawa. Indeed, in our work, several types can be associated to a word and the class is still identifiable in polynomial time. We illustrate the relevance of the class of discrete classical categorial grammars with linguistic examples. 1991 Mathematics Subject Classification. 68Q32, 68T50, 03B47. In 1988, I was a student in "Maîtrise de Mathématiques Discrètes" at Lyon University, and Serge Grigorieff was one of my professors. I was fascinated by the course on computability that he gave. I followed him when he moved to the university Paris 7 and I wrote a master's thesis on Kolmogorov complexity. Then I wrote a PhD thesis under his supervision. I got a lot out of our discussions and our work in his small office in Jussieu, in particular the ability to ask questions and to study relationships between information, complexity and computability. We wrote a paper together on Kolmogorov complexity several years later, and I studied computational linguistics and grammatical inference in the same spirit. This paper with Jérôme, who is my first PhD student, follows this line and is dedicated to him. I owe a lot to him and not only from a scientific point of view...
Rigid Lambek grammars are not learnable from strings
This paper is concerned with learning categorial grammars in Gold's model (Gold, 1967). Recently, learning algorithms in this model have been proposed for some particular classes of classical categorial grammars (Kanazawa, 1998).
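Learning algorithms for rigid classical categorial grammars typically rest on unifying the types assigned to each word across examples. The following is a minimal sketch of first-order unification over category terms, not the algorithm of any paper listed here. The representation is an assumption made for illustration: variables are strings starting with "?", atoms are other strings, and a complex category A/B or A\B is a tuple (slash, A, B).

```python
from typing import Dict, Optional, Tuple, Union

# A category is an atom or variable (str), or a tuple (slash, left, right).
Cat = Union[str, Tuple[str, "Cat", "Cat"]]
Subst = Dict[str, Cat]

def is_var(c: Cat) -> bool:
    return isinstance(c, str) and c.startswith("?")

def walk(c: Cat, s: Subst) -> Cat:
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(c) and c in s:
        c = s[c]
    return c

def occurs(v: str, c: Cat, s: Subst) -> bool:
    """Occurs check: does variable v appear inside category c under s?"""
    c = walk(c, s)
    if c == v:
        return True
    if isinstance(c, tuple):
        return occurs(v, c[1], s) or occurs(v, c[2], s)
    return False

def unify(a: Cat, b: Cat, s: Optional[Subst] = None) -> Optional[Subst]:
    """Return a substitution making a and b equal, or None if none exists."""
    s = dict(s or {})
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        if occurs(a, b, s):
            return None
        s[a] = b
        return s
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        s1 = unify(a[1], b[1], s)
        return None if s1 is None else unify(a[2], b[2], s1)
    return None  # atom clash or mismatched slashes

def resolve(c: Cat, s: Subst) -> Cat:
    """Apply substitution s throughout category c."""
    c = walk(c, s)
    if isinstance(c, tuple):
        return (c[0], resolve(c[1], s), resolve(c[2], s))
    return c

# Two types collected for the same word from different examples.
subst = unify(("\\", "?x1", "s"), ("\\", "np", "s"))
```

A rigid grammar assigns one type per word, so a learner of this kind would unify all types collected for a word and reject the hypothesis when unify returns None.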
On Intermediate Structures for Non-Associative Lambek Grammars and Learnability
, 2004
This paper is concerned with learning categorial grammars in the model of Gold. We show that rigid and k-valued non-associative Lambek (NL) grammars are not learnable from well-bracketed sentences. In contrast to