Discriminative Reranking for Natural Language Parsing
, 2005
Cited by 220 (8 self)
This article considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account. We introduce a new method for the reranking task, based on the boosting approach to ranking problems described in Freund et al. (1998). We apply the boosting method to parsing the Wall Street Journal treebank. The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model. The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model’s score of 88.2%. The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach. We argue that the method is an appealing alternative, in terms of both simplicity and efficiency, to work on feature selection methods within log-linear (maximum-entropy) models. Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.
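The reranking setup described in this abstract can be illustrated with a minimal sketch. It assumes each candidate parse is given as a base-model log-likelihood plus a sparse dictionary of feature values, and that a weight vector has already been learned by some means (the boosting procedure itself is not shown). The names `rerank_score`, `rerank`, `base_weight`, and `relative_error_reduction` are hypothetical and are not taken from the article. The last helper reproduces the reported arithmetic: the error falls from 100 - 88.2 = 11.8 to 100 - 89.75 = 10.25, and 1.55 / 11.8 is roughly 13%.

```python
from typing import Dict, List, Tuple

# A candidate parse: (log-likelihood under the base parser, sparse feature values).
# This representation is an assumption made for illustration.
Candidate = Tuple[float, Dict[str, float]]


def rerank_score(log_likelihood: float,
                 features: Dict[str, float],
                 weights: Dict[str, float],
                 base_weight: float = 1.0) -> float:
    """Combine the base model's log-likelihood with weighted extra features."""
    extra = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return base_weight * log_likelihood + extra


def rerank(candidates: List[Candidate],
           weights: Dict[str, float],
           base_weight: float = 1.0) -> int:
    """Return the index of the highest-scoring candidate parse."""
    return max(
        range(len(candidates)),
        key=lambda i: rerank_score(candidates[i][0], candidates[i][1],
                                   weights, base_weight),
    )


def relative_error_reduction(baseline_f: float, new_f: float) -> float:
    """Relative decrease in F-measure error, with F-measures given in percent."""
    baseline_error = 100.0 - baseline_f
    new_error = 100.0 - new_f
    return (baseline_error - new_error) / baseline_error
```

With all weights at zero, `rerank` simply returns the base parser's top candidate, which is why the log-likelihood is kept as one term of the score.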
Principles and implementation of deductive parsing
- JOURNAL OF LOGIC PROGRAMMING
, 1995
Cited by 150 (4 self)
We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.
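The idea of parsing as deduction can be sketched with an agenda and a chart. The example below is only an illustration under strong assumptions: it hardcodes a single CKY-style inference rule for a grammar in Chomsky normal form, whereas the abstract describes a general engine that interprets arbitrary deduction systems. The function name `cky_deduction` and the dictionary-based grammar encoding are hypothetical.

```python
from collections import deque
from typing import Dict, List, Sequence, Set, Tuple

Item = Tuple[str, int, int]  # (A, i, j): nonterminal A derives words[i:j]


def cky_deduction(words: Sequence[str],
                  lexical: Dict[str, List[str]],
                  binary: Dict[Tuple[str, str], List[str]],
                  goal: str = "S") -> bool:
    """Agenda-driven deduction with one CKY-style inference rule.

    Axioms:    (A, i, i+1)           for every lexical rule A -> words[i]
    Inference: (B, i, k), (C, k, j)  yield (A, i, j) for every rule A -> B C
    Goal:      (goal, 0, len(words))
    """
    agenda = deque()
    chart: Set[Item] = set()

    # Seed the agenda with the axioms.
    for i, word in enumerate(words):
        for category in lexical.get(word, ()):
            agenda.append((category, i, i + 1))

    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue  # already proved
        chart.add(item)
        a, i, j = item
        # Combine the new item with every adjacent item proved so far.
        for b, k, l in list(chart):
            if k == j:  # new item on the left, chart item on the right
                for parent in binary.get((a, b), ()):
                    agenda.append((parent, i, l))
            if l == i:  # chart item on the left, new item on the right
                for parent in binary.get((b, a), ()):
                    agenda.append((parent, k, j))

    return (goal, 0, len(words)) in chart
```

Swapping in a different set of axioms and inference rules while keeping the agenda loop unchanged is the sense in which one engine could serve several parsing algorithms.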
Parsing Strategies with 'Lexicalized' Grammars: Application to Tree Adjoining Grammars
- IN PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS (COLING'88)
, 1988
Cited by 111 (15 self)
In this paper we present a general parsing strategy that arose from the development of an Earley-type parsing algorithm for TAGs (Schabes and Joshi 1988) and from recent linguistic work in TAGs (Abeille 1988). In our approach
Supertagging: An Approach to Almost Parsing
- Computational Linguistics
, 1999
Cited by 109 (17 self)
In this paper, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (Supertags) that impose complex constraints in a local context. The supertags are designed such that only those elements on which the lexical item imposes constraints appear within a given supertag. Further, each lexical item is associated with as many supertags as the number of different syntactic contexts in which the lexical item can appear. This makes the number of different descriptions for each lexical item much larger than when the descriptions are less complex, thus increasing the local ambiguity for a parser. But this local ambiguity can be resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework. The supertags in LTAG combine both phrase structure information and dependency information in a single representation. Supertag disambiguation results in a representation that is effectively a parse (an almost parse), and the parser need 'only' combine the individual supertags. This method of parsing can also be used to parse sentence fragments such as in spoken utterances where the disambiguated supertag sequence may not combine into a single structure.
1 Introduction
In this paper, we present a robust parsing approach called supertagging that integrates the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. The idea underlying the approach is that the ...
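Resolving local supertag ambiguity from co-occurrence statistics can be illustrated with a small Viterbi search. This is a sketch under stated assumptions, not the paper's model: it uses a bigram model over supertags, the log-probability tables are supplied by the caller, unseen events get a fixed penalty, and the input is a non-empty word list. The function name `disambiguate`, the `<s>` start symbol, and the supertag labels used with it are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def disambiguate(words: Sequence[str],
                 candidates: Dict[str, List[str]],
                 bigram_logp: Dict[Tuple[str, str], float],
                 emit_logp: Dict[Tuple[str, str], float],
                 unseen: float = -20.0) -> List[str]:
    """Pick one supertag per word by Viterbi search over a bigram model.

    candidates[w]      lists the supertags that word w may take.
    bigram_logp[(p,t)] is the log-probability of supertag t following p.
    emit_logp[(w,t)]   is the log-probability of word w given supertag t.
    Missing table entries are scored with the fixed penalty `unseen`.
    """
    # best[t][tag] = (score of the best path ending in tag, previous tag)
    best: List[Dict[str, Tuple[float, Optional[str]]]] = [{}]
    for tag in candidates[words[0]]:
        score = (bigram_logp.get(("<s>", tag), unseen)
                 + emit_logp.get((words[0], tag), unseen))
        best[0][tag] = (score, None)

    for t in range(1, len(words)):
        column: Dict[str, Tuple[float, Optional[str]]] = {}
        for tag in candidates[words[t]]:
            # Best predecessor for this tag.
            prev, score = max(
                ((p, s + bigram_logp.get((p, tag), unseen))
                 for p, (s, _) in best[t - 1].items()),
                key=lambda pair: pair[1],
            )
            column[tag] = (score + emit_logp.get((words[t], tag), unseen), prev)
        best.append(column)

    # Follow back-pointers from the best final supertag.
    tag = max(best[-1], key=lambda k: best[-1][k][0])
    path = [tag]
    for t in range(len(words) - 1, 0, -1):
        tag = best[t][tag][1]
        path.append(tag)
    return path[::-1]
```

The output is one supertag per word; combining those supertags into a full structure is the separate step the abstract attributes to the parser.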
Revision-Based Generation of Natural Language Summaries Providing Historical Background -- Corpus-Based Analysis, Design, Implementation and Evaluation
, 1994
Cited by 100 (6 self)
Automatically summarizing vast amounts of on-line quantitative data with a short natural language paragraph has a wide range of real-world applications. However, this specific task raises a number of difficult issues that are quite distinct from the generic task of language generation: conciseness, complex sentences, floating concepts, historical background, paraphrasing power and implicit content. In this thesis, I address these specific issues by proposing a new generation model in which a first pass builds a draft containing only the essential new facts to report and a second pass incrementally revises this draft to opportunistically add as many background facts as can fit within the space limit. This model requires a new type of linguistic knowledge: revision operations, which specify the various ways a draft can...
D-Tree Grammars
Cited by 96 (16 self)
DTG are designed to share some of the advantages of TAG while overcoming some of its limitations. DTG involve two composition operations called subsertion and sister-adjunction. The most distinctive feature of DTG is that, unlike TAG, there is complete uniformity in the way that the two DTG operations relate lexical items: subsertion always corresponds to complementation and sister-adjunction to modification. Furthermore, DTG, unlike TAG, can provide a uniform analysis for wh-movement in English and Kashmiri, despite the fact that the wh-element in Kashmiri appears in sentence-second position, and not sentence-initial position as in English.
Sentence Planning as Description Using Tree Adjoining Grammar
- IN PROCEEDINGS OF ACL
, 1997
Cited by 86 (16 self)
We present an algorithm for simultaneously constructing both the syntax and semantics of a sentence using a Lexicalized Tree Adjoining Grammar (LTAG). This approach captures naturally and elegantly the interaction between pragmatic and syntactic constraints on descriptions in a sentence, and the inferential interactions between multiple descriptions in a sentence. At the same

