Results 1 -
5 of
5
Inducing Tree-Substitution Grammars
"... Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context free grammars and first order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represent complex linguistic structures. To limit the model’s complexity we employ a Bayesian non-parametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model’s efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.
Covariance in Unsupervised Learning of Probabilistic Grammars
"... Probabilistic grammars offer great flexibility in modeling discrete sequential data like natural language text. Their symbolic component is amenable to inspection by humans, while their probabilistic component helps resolve ambiguity. They also permit the use of well-understood, generalpurpose learn ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Probabilistic grammars offer great flexibility in modeling discrete sequential data like natural language text. Their symbolic component is amenable to inspection by humans, while their probabilistic component helps resolve ambiguity. They also permit the use of well-understood, generalpurpose learning algorithms. There has been an increased interest in using probabilistic grammars in the Bayesian setting. To date, most of the literature has focused on using a Dirichlet prior. The Dirichlet prior has several limitations, including that it cannot directly model covariance between the probabilistic grammar’s parameters. Yet, various grammar parameters are expected to be correlated because the elements in language they represent share linguistic properties. In this paper, we suggest an alternative to the Dirichlet prior, a family of logistic normal distributions. We derive an inference algorithm for this family of distributions and experiment with the task of dependency grammar induction, demonstrating performance improvements with our priors on a set of six treebanks in different natural languages. Our covariance framework permits soft parameter tying within grammars and across grammars for text in different languages, and we show empirical gains in a novel learning setting using bilingual, non-parallel data.
Modeling Perspective using Adaptor Grammars
"... Strong indications of perspective can often come from collocations of arbitrary length; for example, someone writing get the government out of my X is typically expressing a conservative rather than progressive viewpoint. However, going beyond unigram or bigram features in perspective classification ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Strong indications of perspective can often come from collocations of arbitrary length; for example, someone writing get the government out of my X is typically expressing a conservative rather than progressive viewpoint. However, going beyond unigram or bigram features in perspective classification gives rise to problems of data sparsity. We address this problem using nonparametric Bayesian modeling, specifically adaptor grammars (Johnson et al., 2006). We demonstrate that an adaptive naïve Bayes model captures multiword lexical usages associated with perspective, and establishes a new state-of-the-art for perspective classification results using the Bitter Lemons corpus, a collection of essays about mid-east issues from Israeli and Palestinian points of view. 1
Synergies in learning words and their referents
"... This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves ..."
Abstract
- Add to MetaCart
This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves are novel kinds of Adaptor Grammars that are an extension of an embedding of topic models into PCFGs. These models simultaneously segment phoneme sequences into words and learn the relationship between non-linguistic objects to the words that refer to them. We show (i) that modelling inter-word dependencies not only improves the accuracy of the word segmentation but also of word-object relationships, and (ii) that a model that simultaneously learns word-object relationships and word segmentation segments more accurately than one that just learns word segmentation on its own. We argue that these results support an interactive view of language acquisition that can take advantage of synergies such as these. 1
LIF at TAC Multiling: Towards a Truly Language Independent Summarizer
"... This paper presents the LIF system for the TAC’2011 Multilingual pilot track. We followed a language-independent approach to summarization for this task. In particular, we tried to remove the following dependences to language: sentence segmentation, word segmentation, stop-word lists, and word-level ..."
Abstract
- Add to MetaCart
This paper presents the LIF system for the TAC’2011 Multilingual pilot track. We followed a language-independent approach to summarization for this task. In particular, we tried to remove the following dependences to language: sentence segmentation, word segmentation, stop-word lists, and word-level relevance assessment. We applied these modifications to an MMR-based system and observed little degradation on English data. The submitted system had a bug that impeded all official results, therefore we propose in this paper an updated set of results with relevant analysis. 1

