Results 1 - 10 of 55
Principles and implementation of deductive parsing
Journal of Logic Programming, 1995
Cited by 150 (4 self)
We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.
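To make the parsing-as-deduction idea concrete, here is a minimal sketch of an agenda-driven deduction engine specialized to CKY-style items. It is not the system described in the abstract: the function name `cky_deduce`, the item shape `(A, i, j)`, and the dictionary encoding of the grammar are all assumptions made for illustration, and the grammar is assumed to be in Chomsky normal form.

```python
from collections import deque


def cky_deduce(words, lexical, binary, start="S"):
    """Close a set of CKY items (A, i, j) under the inference rules.

    lexical: dict mapping a word to the nonterminals A with a rule A -> word
    binary:  dict mapping (B, C) to the nonterminals A with a rule A -> B C
    (Both encodings are hypothetical, chosen for this sketch.)
    """
    n = len(words)
    chart = set()      # items already proved
    agenda = deque()   # items proved but not yet combined with the chart

    # Axioms: (A, i, i+1) for every lexical rule A -> words[i].
    for i, w in enumerate(words):
        for a in lexical.get(w, ()):
            agenda.append((a, i, i + 1))

    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue
        chart.add(item)
        b, i, j = item
        for c, k, l in list(chart):
            # Inference rule: (B, i, j), (C, j, l) |- (A, i, l) if A -> B C.
            if j == k:
                for a in binary.get((b, c), ()):
                    agenda.append((a, i, l))
            # Same rule with the new item as the right-hand antecedent.
            if l == i:
                for a in binary.get((c, b), ()):
                    agenda.append((a, k, j))

    # Goal item: the start symbol spanning the whole input.
    return (start, 0, n) in chart
```

Swapping in a different item shape and a different set of inference rules, while keeping the agenda loop, is the sense in which one engine can serve several parsing algorithms.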
Parsing Inside-Out
1998
Cited by 65 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given non-terminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
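The generalization to semirings can be illustrated with a small sketch: one CKY loop that yields the inside probability or the Viterbi probability depending only on which semiring is passed in. This is an illustration under stated assumptions, not the framework from the thesis: the names `Semiring`, `INSIDE`, `VITERBI`, and `semiring_cky` are hypothetical, the grammar is assumed to be in Chomsky normal form, and only inside-style values are computed.

```python
from collections import namedtuple

# A semiring is represented as (plus, times, zero); names are hypothetical.
Semiring = namedtuple("Semiring", ["plus", "times", "zero"])

# Sum over derivations gives the inside probability.
INSIDE = Semiring(plus=lambda a, b: a + b, times=lambda a, b: a * b, zero=0.0)
# Max over derivations gives the Viterbi (best-parse) probability.
VITERBI = Semiring(plus=max, times=lambda a, b: a * b, zero=0.0)


def semiring_cky(words, lexical, binary, sr, start="S"):
    """Compute the semiring value of `start` spanning the whole input.

    lexical: dict mapping a word to a list of (A, prob) for rules A -> word
    binary:  dict mapping (A, B, C) to prob for rules A -> B C
    """
    n = len(words)
    chart = {}

    def add(key, value):
        # Accumulate with the semiring's additive operator.
        chart[key] = sr.plus(chart.get(key, sr.zero), value)

    for i, w in enumerate(words):
        for a, p in lexical.get(w, ()):
            add((a, i, i + 1), p)

    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = chart.get((b, i, k), sr.zero)
                    right = chart.get((c, k, j), sr.zero)
                    add((a, i, j), sr.times(p, sr.times(left, right)))

    return chart.get((start, 0, n), sr.zero)
```

For the toy grammar A -> A A (0.4), A -> a (0.6) and the input "a a a", there are two parses, each with probability 0.4^2 * 0.6^3 = 0.03456, so the inside semiring gives 0.06912 and the Viterbi semiring gives 0.03456.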
Recovering Latent Information in Treebanks
In Proceedings of COLING 2002, 2002
Cited by 38 (1 self)
Many recent statistical parsers rely on a preprocessing step which uses hand-written, corpus-specific rules to augment the training data with extra information. For example, head-finding rules are used to augment node labels with lexical heads. In this paper, we provide machinery to reduce the amount of human effort needed to adapt existing models to new corpora: first, we propose a flexible notation for specifying these rules that would allow them to be shared by different models; second, we report on an experiment to see whether we can use Expectation-Maximization to automatically fine-tune a set of hand-written rules to a particular corpus.
Global Thresholding and Multiple-Pass Parsing
1997
Cited by 36 (3 self)
We present a variation on classic beam thresholding techniques that is up to an order of magnitude faster than the traditional method, at the same performance level. We also present a new thresholding technique, global thresholding, which, combined with the new beam thresholding, gives an additional factor of two improvement, and a novel technique, multiple pass parsing, that can be combined with the others to yield yet another 50% improvement. We use a new search algorithm to simultaneously optimize the thresholding parameters of the various algorithms.
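As background, the generic idea of per-cell beam pruning in a chart parser can be sketched as below. This shows only the general technique of discarding low-scoring entries relative to the best entry in a cell; it is not the specific variation, the global thresholding, or the multiple-pass method the abstract introduces. The function name `beam_prune`, the dictionary cell layout, and the default beam width are assumptions for illustration.

```python
def beam_prune(cell, beam=1e-3):
    """Keep entries scoring at least `beam` times the best score in the cell.

    cell: dict mapping a nonterminal label to its score for one chart span
          (a hypothetical layout chosen for this sketch).
    beam: ratio in (0, 1]; smaller values prune less aggressively.
    """
    if not cell:
        return {}
    threshold = beam * max(cell.values())
    return {label: score for label, score in cell.items() if score >= threshold}
```

A parser would typically call such a function once per span, after filling the cell and before using its entries to build larger spans.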
Statistical Parsing With an Automatically-Extracted Tree Adjoining Grammar
2000
Cited by 33 (1 self)
We discuss the advantages of lexicalized tree-adjoining grammar as an alternative to lexicalized PCFG for statistical parsing, describing the induction of a probabilistic LTAG model from the Penn Treebank and evaluating its parsing performance. We find that this induction method is an improvement over the EM-based method of [Hwa, 1998], and that the induced model yields results comparable to lexicalized PCFG.
Statistical Parsing With an Automatically Extracted Tree Adjoining Grammar
2003
Cited by 23 (1 self)
Introduction. Why use tree adjoining grammars (TAG) for statistical parsing? It might be thought that its added formal power makes parameter estimation unnecessarily difficult; or that whatever benefits it provides---the ability to model unbounded cross-serial dependencies, for example---are inconsequential for statistical parsing, which is concerned with the probable rather than the possible. But just as TAG is not by itself a complete linguistic theory, but a formalism for specifying linguistic theories, it should not be viewed as a statistical model but a formalism for specifying statistical models. The advantage that TAG has over CFG is that it assigns richer structural descriptions to sentences; specifically, in addition to parse trees, it assigns derivation trees (defined below) on which features of a parsing model can be defined. In this chapter we explore the use of TAG for statistical parsing. We start by examining PCFG-based parsers which use head-lexicalization to capture
Automatic Extraction of Stochastic Lexicalized Tree Grammars from Treebanks
Proceedings of the 4th Workshop on Tree-Adjoining Grammars and Related Frameworks, 1998
Cited by 22 (4 self)
We present a method for the extraction of stochastic lexicalized tree grammars (SLTG) of different complexities from existing treebanks, which allows us to analyze how a grammar automatically induced from a treebank behaves with respect to its size, its complexity, and its predictive power on unseen data. Processing
Learning Stochastic Lexicalized Tree Grammars from HPSG
Computers and Mathematics with Applications, Pergamon Press, 1999
Cited by 18 (1 self)
We present a method for automatically extracting a Stochastic Lexicalized Tree Grammar (SLTG) from an HPSG source grammar and a given corpus. Processing of a SLTG is performed by a specialized fast parser. The approach has been tested on a large English grammar and has been shown to achieve a speed-up by a factor of better than 10 compared to parsing with a highly tuned HPSG parser. Our approach is simple and transparent, and comes with no magic tuning strategies. The extracted grammars are declaratively represented and have a high degree of practical applicability.
1 Introduction. Head Driven Phrase Structure Grammar (HPSG) has proven to be a quite successful formalism for specifying natural language grammars in a highly modular and compact manner [Pollard and Sag, 1994] supporting the definition of complex linguistic information and interactions between information
(Footnote 1: Neumann was supported by a research grant from the German Federal Ministry of Education, Science, Research and...)
The Computational Complexity of the Correct-Prefix Property for TAGs
Computational Linguistics, 1999
An Empirical Evaluation of Probabilistic Lexicalized Tree Insertion Grammars
1998
Cited by 14 (2 self)
We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Grammars (PLTIG), a lexicalized counterpart to Probabilistic Context-Free Grammars (PCFG), to problems in stochastic natural language processing. Comparing the performance of PLTIGs with non-hierarchical N-gram models and PCFGs, we show that PLTIG combines the best aspects of both, with language modeling capability comparable to N-grams, and improved parsing performance over its nonlexicalized counterpart. Furthermore, training of PLTIGs displays faster convergence than PCFGs.

