Novel Estimation Methods for Unsupervised Discovery of Latent Structure in Natural Language Text
, 2006
Cited by 20 (7 self)
This thesis is about estimating probabilistic models to uncover useful hidden structure in data; specifically, we address the problem of discovering syntactic structure in natural language text. We present three new parameter estimation techniques that generalize the standard approach, maximum likelihood estimation, in different ways. Contrastive estimation maximizes the conditional probability of the observed data given a “neighborhood” of implicit negative examples. Skewed deterministic annealing locally maximizes likelihood using a cautious parameter search strategy that starts with an easier optimization problem than likelihood, and iteratively moves to harder problems, culminating in likelihood. Structural annealing is similar, but starts with a heavy bias toward simple syntactic structures and gradually relaxes the bias. Our estimation methods do not make use of annotated examples. We consider their performance in both an unsupervised model selection setting, where models trained under different initialization and regularization settings are compared by evaluating the training objective on a small set of unseen, unannotated development data, and supervised model selection, where the most accurate model on the development set (now with annotations)
Identification in the limit of substitutable context-free languages
- ALT
, 2005
Cited by 11 (6 self)
This paper formalises the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for a learning algorithm from positive data only for a subclass of context-free grammars. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relationship of this class of languages to other common classes discussed in grammatical inference. We also discuss modifications to the algorithm that produce a reduction system rather than a context-free grammar, which will be much more compact. We discuss the relationship to Angluin’s notion of reversibility for regular languages.
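The abstract above does not spell out the learning algorithm, so the following is only a minimal sketch of the underlying idea of substitutability on a finite sample of positive data. It assumes, as an illustration rather than as the paper's own definition, that two word sequences count as substitutable when they occur in at least one shared context (l, r) in the sample. The function names and the whitespace tokenisation are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def contexts_by_substring(sample):
    """Map each non-empty word sequence u to the set of contexts (l, r)
    such that l + u + r is a sentence of the sample."""
    contexts = defaultdict(set)
    for sentence in sample:
        words = tuple(sentence.split())  # assumption: whitespace tokenisation
        n = len(words)
        for i in range(n):
            for j in range(i + 1, n + 1):
                contexts[words[i:j]].add((words[:i], words[j:]))
    return contexts


def substitutable_pairs(sample):
    """Return pairs of distinct word sequences that share at least one context."""
    contexts = contexts_by_substring(sample)
    pairs = set()
    for u, v in combinations(sorted(contexts), 2):
        if contexts[u] & contexts[v]:
            pairs.add((u, v))
    return pairs


sample = ["the cat sleeps", "the dog sleeps"]
pairs = substitutable_pairs(sample)
for u, v in sorted(pairs):
    print(" ".join(u), "<->", " ".join(v))
```

On this toy sample, "cat" and "dog" are paired because both occur between "the" and "sleeps", while "the" and "sleeps" are not paired with anything. Turning such pairs into a grammar or a reduction system is the part the paper itself addresses and is not attempted here.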
A Computational Model for Early Argument Structure Acquisition
Cited by 8 (3 self)
How children go about learning the general regularities that govern language, as well as keeping track of the exceptions to them, remains one of the challenging open questions in the cognitive science of language. Computational modeling is an important methodology in research aimed at addressing this issue. We must determine appropriate learning mechanisms that can grasp generalizations from examples of specific usages, and that exhibit patterns of behaviour over the course of learning similar to those in children. Early learning of verb argument structure is an area of language acquisition that provides an interesting testbed for such approaches due to the complexity of verb usages. A range of linguistic factors interact in determining the felicitous use of a verb in various constructions: associations between syntactic forms and properties of meaning that form the basis for a number of linguistic and psycholinguistic theories of language. We present a computational model for the representation, acquisition, and use of verbs and constructions. Our Bayesian framework is founded on a novel view of constructions as a probabilistic association between syntactic and semantic features. The computational experiments reported here demonstrate the feasibility of learning general constructions, and their exceptions, from individual usages of verbs. The behaviour of the model over the timecourse of acquisition mimics in relevant aspects the stages of learning exhibited by children. Our proposal thus sheds light on the possible mechanisms at work in forming linguistic generalizations and maintaining knowledge of exceptions.
Bridging Computational, Formal and Psycholinguistic Approaches to Language
- IN PROC. OF THE 26TH CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY
, 2004
Cited by 5 (4 self)
We compare our model of unsupervised learning of linguistic structures, ADIOS [1, 2, 3], to some recent work in computational linguistics and in grammar theory. Our approach resembles the Construction Grammar in its general philosophy (e.g., in its reliance on structural generalizations rather than on syntax projected by the lexicon, as in the current generative theories), and the Tree Adjoining Grammar in its computational characteristics (e.g., in its apparent affinity with Mildly Context Sensitive Languages). The representations learned by our algorithm are truly emergent from the (unannotated) corpus data, whereas those found in published works on cognitive and construction grammars and on TAGs are hand-tailored. Thus, our results complement and extend both the computational and the more linguistically oriented research into language acquisition.
Motif extraction and protein classification
- In Proc. Computational Systems Bioinformatics (CSB). 80–85
, 2005
Cited by 4 (0 self)
We introduce an unsupervised method for extracting meaningful motifs from biological sequence data. This de novo motif extraction (MEX) algorithm is data driven, finding motifs that are not necessarily over-represented in the entire corpus, yet display over-representation within local contexts. We apply our method to the problem of deriving functional classification of proteins from their sequence information. Applying MEX to amino-acid sequences of a group of enzymes, we obtain a set of motifs that serves as the basis for description of these proteins. This motif-space, derived from sequence data only, is then used as a basis for functional classification by an SVM classifier. Using the set of the oxidoreductase super-family, with about 7000 enzymes, we show that classification based on MEX motifs surpasses that of two other SVM-based methods: SVMProt, which relies on physical and chemical properties of the protein sequence of amino-acids, and SVM applied to a Smith-Waterman distance matrix. This demonstrates the effectiveness of our MEX algorithm, and the feasibility of sequence-to-function classification.
Keywords: motif extraction, enzyme classification
Learning Syntactic Constructions from Raw Corpora
- 29TH BOSTON UNIVERSITY CONFERENCE ON LANGUAGE DEVELOPMENT
, 2005
Cited by 2 (1 self)
... a lexicon populated by units of various sizes, as envisaged by Langacker (1987). Constructions may be specified completely, as in the case of simple morphemes or idioms such as take it to the bank, or partially, as in the expression what’s X doing Y?, where X and Y are slots that admit fillers of particular types (Kay and Fillmore, 1999). Constructions offer an intriguing alternative to traditional rule-based syntax by hinting at the extent to which the complexity of language can stem from a rich repertoire of stored, more or less entrenched (Harris, 1998) representations that address both syntactic and semantic issues, and encompass, in addition to general rules, “totally idiosyncratic forms and patterns of all intermediate degrees of generality” (Langacker, 1987, p. 46). Because constructions are by their very nature language-specific, the question of acquisition in Construction Grammar is especially poignant. We address this issue by offering an unsupervised algorithm that learns constructions from raw corpora.
Unsupervised language acquisition: syntax from plain corpus
, 2004
Cited by 2 (0 self)
We describe results of a novel algorithm for grammar induction from a large corpus. The ADIOS (Automatic DIstillation of Structure) algorithm searches for significant patterns, chosen according to context-dependent statistical criteria, and builds a hierarchy of such patterns according to a set of rules leading to structured generalization. The corpus is thus generalized into a context-free grammar (CFG), composed of patterns, equivalence classes and words of the initial lexicon. We have evaluated our method both on corpora generated by CFG and on natural language ones. The performance of ADIOS is judged by searching for both good recall (acceptance of correct novel sentences) and good precision (production of correct novel sentences). The results are very encouraging.
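The recall and precision measures named in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that both the target language and the learned grammar are represented as finite sets of sentences, so that "acceptance" and "correctness" reduce to set membership. The abstract does not say how these judgments were actually made, and all names and data below are hypothetical.

```python
def recall(novel_correct_sentences, learned_accepts):
    """Fraction of correct novel sentences that the learned grammar accepts."""
    accepted = sum(1 for s in novel_correct_sentences if learned_accepts(s))
    return accepted / len(novel_correct_sentences)


def precision(generated_sentences, is_correct):
    """Fraction of sentences produced by the learned grammar that are correct."""
    correct = sum(1 for s in generated_sentences if is_correct(s))
    return correct / len(generated_sentences)


# Toy stand-ins: sets of sentences instead of real grammars (an assumption).
target_language = {"a b", "a c", "b c", "c c"}
learned_language = {"a b", "a c", "c a"}

r = recall(sorted(target_language), lambda s: s in learned_language)
p = precision(sorted(learned_language), lambda s: s in target_language)
print(f"recall={r:.2f} precision={p:.2f}")
```

A learner that accepts everything would score perfect recall and poor precision, and one that only reproduces its training sentences would do the reverse, which is why the abstract asks for both to be good.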
A Deterministic Dynamic Associative Memory (DDAM) Model for Concept Space Representation
, 2006
Some Tests of an Unsupervised Model of Language Acquisition
, 2004
We outline an unsupervised language acquisition algorithm and offer some psycholinguistic support for a model based on it. Our approach resembles the Construction Grammar in its general philosophy, and the Tree Adjoining Grammar in its computational characteristics. The model is trained on a corpus of transcribed child-directed speech (CHILDES). The model's ability to process novel inputs makes it capable of taking various standard tests of English that rely on forced-choice judgment and on magnitude estimation of linguistic acceptability. We report encouraging results from several such tests, and discuss the limitations revealed by other tests in our present method of dealing with novel stimuli.

