Statistical Parsing Algorithms for Lexicalized Tree Adjoining Grammars
BibTeX
@MISC{Sarkar_statisticalparsing,
author = {Anoop Sarkar},
title = {Statistical Parsing Algorithms for Lexicalized Tree Adjoining Grammars},
year = {}
}
Abstract
The goal of this dissertation is two-fold: to develop the theory of probabilistic Tree Adjoining Grammars (TAGs) and to present some practical results in the form of efficient parsing and estimation algorithms for probabilistic TAGs. The overall goal of developing the theory of probabilistic TAGs is to provide a simple, mathematically and linguistically well-formed probabilistic framework for statistical parsing. The practical results in parsing and estimation of probabilistic TAGs are developed with a view towards an increasingly unsupervised approach to the training of statistical parsers and language models. In particular, this proposal contains the following results: (1) an algorithm for determining deficiency in a generative model for probabilistic TAGs; (2) a novel chart-based head-corner parsing algorithm for probabilistic TAGs; (3) a probability model for statistical parsing and a co-training method for training this parser which combines labeled and unlabeled data; and (4) an algorithm for computing prefix probabilities, which can be used to predict the word most likely to occur after an initial substring of the input. The proposed work can be summarized in the following points: (1) a separate evaluation of the co-training algorithm on a larger set of labeled and unlabeled data, in addition to the evaluation presented in this proposal; (2) an evaluation of the prefix probability algorithm by comparing it with a trigram language model; and (3) an extension of techniques in learning subcategorization information and verb classes to produce TAG lexicons which can be directly used to improve performance of the co-training algorithm.







