The Rhetorical Parsing, Summarization, and Generation of Natural Language Texts
, 1997
Cited by 98 (9 self)
This thesis is an inquiry into the nature of the high-level, rhetorical structure of unrestricted natural language texts, computational means to enable its derivation, and two applications (in automatic summarization and natural language generation) that follow from the ability to build such structures automatically. The thesis proposes a first-order formalization of the high-level, rhetorical structure of text. The formalization assumes that text can be sequenced into elementary units; that discourse relations hold between textual units of various sizes; that some textual units are more important to the writer's purpose than others; and that trees are a good approximation of the abstract structure of text. The formalization also introduces a linguistically motivated compositionality criterion, which is shown to hold for the text structures that are valid. The thesis proposes, analyzes theoretically, and compares empirically four algorithms for determining the valid text structures of ...
Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques
- IN PROCEEDINGS OF THE 40TH MEETING OF THE ACL
, 2002
Cited by 95 (8 self)
We present a stochastic parsing system consisting of a Lexical-Functional Grammar (LFG), a constraint-based parser and a stochastic disambiguation model. We report on the results of applying this system to parsing the UPenn Wall Street Journal (WSJ) treebank. The model combines full and partial parsing techniques to reach full grammar coverage on unseen data. The treebank annotations are used to provide partially labeled data for discriminative statistical estimation using exponential models. Disambiguation performance is evaluated by measuring matches of predicate-argument relations on two distinct test sets. On a gold standard of manually annotated f-structures for a subset of the WSJ treebank, this evaluation reaches 79% F-score. An evaluation on a gold standard of dependency relations for Brown corpus data achieves 76% F-score.
The Parallel Grammar Project
, 2002
Cited by 77 (24 self)
We report on the Parallel Grammar (ParGram) project which uses the XLE parser and grammar development platform for six languages: English, French, German, Japanese, Norwegian, and Urdu.
The PARC 700 Dependency Bank
- In Proceedings of the 4th International Workshop on Linguistically Interpreted Corpora (LINC-03)
, 2003
Cited by 54 (6 self)
In this paper we discuss the construction, features, and current uses of the PARC 700 DEPBANK. The PARC 700 DEPBANK is a dependency bank containing predicate-argument relations and a wide variety of other grammatical features.
Speed and accuracy in shallow and deep stochastic parsing
- IN PROCEEDINGS OF HLT-NAACL’04
, 2004
Cited by 39 (3 self)
This paper reports some initial experiments that compare the accuracy and performance of two stochastic parsing systems. The currently popular Collins parser is a shallow parser whose output contains more detailed semantically-relevant information than other such parsers. The XLE parser is a deep-parsing system that couples a Lexical Functional Grammar to a loglinear disambiguation component and provides the much richer representations of LFG theory. We measured the accuracy of both systems against a gold standard of the PARC 700 dependency bank, and also measured their processing times. We found the deep-parsing system to be significantly more accurate than the Collins parser with only a slight reduction in parsing speed.
Relating complexity to practical performance in parsing with wide-coverage unification grammars
, 1994
Cited by 30 (6 self)
The paper demonstrates that exponential complexities with respect to grammar size and input length have little impact on the performance of three unification-based parsing algorithms, using a wide-coverage grammar. The results imply that the study and optimisation of unification-based parsing must rely on empirical data until complexity theory can more accurately predict the practical behaviour of such parsers.
Comparison of Evaluation Metrics for a Broad Coverage Parser
- LREC Workshop: Beyond PARSEVAL Towards Improved Evaluation Measures for Parsing Systems
, 2002
Cited by 29 (4 self)
This paper reports on the use of two distinct evaluation metrics for assessing a stochastic parsing model consisting of a broad-coverage Lexical-Functional Grammar (LFG), an efficient constraint-based parser and a stochastic disambiguation model. The first evaluation metric measures matches of predicate-argument relations in LFG f-structures (henceforth the LFG annotation scheme) to a gold standard of manually annotated f-structures for a subset of the UPenn Wall Street Journal treebank. The other metric maps predicate-argument relations in LFG f-structures to dependency relations (henceforth DR annotations) as proposed by Carroll et al. (1999). For evaluation, these relations are matched against Carroll et al.’s gold standard, which was manually annotated on a subset of the Brown corpus. The parser plus stochastic disambiguator gives an F-measure of 79% (LFG) or 73% (DR) on the WSJ test set. This shows that the two evaluation schemes are similar in spirit, although accuracy is impaired systematically by mapping one annotation scheme to the other. A systematic loss of accuracy is incurred also by corpus variation: training the stochastic disambiguation model on WSJ data and testing on Carroll et al.’s Brown corpus data yields an F-score of 74% (DR) for dependency-relation match. A variant of this measure comparable to the measure reported by Carroll et al. yields an F-measure of 76%. We examine divergences between annotation schemes aiming at a future improvement of methods for assessing parser quality.
Deep Dependencies from Context-Free Statistical Parsers: Correcting the Surface Dependency Approximation
- In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
, 2004
Cited by 23 (3 self)
We present a linguistically-motivated algorithm for reconstructing nonlocal dependency in broad-coverage context-free parse trees derived from treebanks. We use an algorithm based on loglinear classifiers to augment and reshape context-free trees so as to reintroduce underlying nonlocal dependencies lost in the context-free approximation. We find that our algorithm compares favorably with prior work on English using an existing evaluation metric, and also introduce and argue for a new dependency-based evaluation metric. By this new evaluation metric our algorithm achieves 60% error reduction on gold-standard input trees and 5% error reduction on state-of-the-art machine-parsed input trees, when compared with the best previous work. We also present the first results on nonlocal dependency reconstruction for a language other than English, comparing performance on English and German. Our new evaluation metric quantitatively corroborates the intuition that in a language with freer word order, the surface dependencies in context-free parse trees are a poorer approximation to underlying dependency structure.
Extensions to Constraint Dependency Parsing for Spoken Language Processing
- COMPUTER SPEECH AND LANGUAGE
, 1995
Cited by 21 (10 self)
A text-based and spoken language processing framework based on the Constraint Dependency Grammar (CDG) developed by Maruyama [24, 25] is discussed. The scope of CDG is expanded to allow for the analysis of sentences containing lexically ambiguous words, to allow feature analysis in constraints, and to efficiently process multiple sentence candidates that are likely to arise in spoken language processing. The benefits of the CDG parsing approach are summarized. Additionally, the development of CDG grammars using our grammar tools and parser is discussed.
From Parallel Grammar Development towards Machine Translation - A Project Overview -
- In Proceedings of MT Summit VII
, 1999
Cited by 19 (3 self)
We give an overview of an MT research project jointly undertaken by Xerox PARC and XRCE Grenoble. The project builds on insights and resources in large-scale development of parallel LFG grammars. The research approach towards translation focuses on innovative computational technologies which lead to a flexible translation architecture. Efficient processing of "packed" ambiguities not only enables ambiguity-preserving transfer; it is at the heart of a flexible architectural design, open for various extensions which take the right decisions at the right time.
1 Introduction
Most of the existing high-performance MT systems are based on linguistic technology of the 60s, whereas research in NLP has established "higher-level" syntactic formalisms which allow for specification and processing of declarative, reversible grammars that assign rich structures to natural language sentences. Syntactic theories like Lexical-Functional Grammar (LFG), Head-Driven Phrase Structure Grammar ...

