Building Deep Dependency Structures with a Wide-Coverage CCG Parser
- In Proceedings of the 40th Meeting of the ACL, 2002
Cited by 18 (6 self)
This paper describes a wide-coverage statistical parser that uses Combinatory Categorial Grammar (CCG) to derive dependency structures. The parser differs from most existing wide-coverage treebank parsers in capturing the long-range dependencies inherent in constructions such as coordination, extraction, raising and control, as well as the standard local predicate-argument dependencies. A set of dependency structures used for training and testing the parser is obtained from a treebank of CCG normal-form derivations, which have been derived (semi-)automatically from the Penn Treebank. The parser correctly recovers over 80% of labelled dependencies, and around 90% of unlabelled dependencies.
An approach to Robust Partial Parsing and Evaluation Metrics
- In Proceedings of the Eighth European Summer School in Logic, Language and Information, 1996
Cited by 18 (1 self)
In this paper, we present a new technique called Lightweight Dependency Analysis which, in conjunction with Supertag disambiguation, provides a method for Robust Partial Parsing, called Almost Parsing. An overview is given of the XTAG system in which this technique is being developed. In addition, we propose alternate metrics for evaluation of partial parsers that can also serve to evaluate full parsers.
Lexicalization in crosslinguistic probabilistic parsing: the case of French
- In Proceedings of ACL, 2005
Cited by 17 (0 self)
This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins’ Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The bigram model achieves the best performance: 81% constituency F-score and 84% dependency accuracy. All lexicalized models outperform the unlexicalized baseline, consistent with probabilistic parsing results for English, but contrary to results for German, where lexicalization has only a limited effect on parsing performance.
A broad-coverage parser for German based on defeasible constraints
- In KONVENS 2004, Beiträge zur 7. Konferenz zur Verarbeitung natürlicher Sprache, 2004
Cited by 15 (3 self)
We present a parser for German that achieves a competitive accuracy on unrestricted input while maintaining a coverage of 100%. By writing well-formedness rules as declarative, defeasible constraints that integrate different sources of linguistic knowledge, very high robustness is achieved against all sorts of language error.
Parser Evaluation: Using a Grammatical Relation Annotation Scheme
- 2003
Cited by 13 (0 self)
We describe a recently developed corpus annotation scheme for evaluating parsers that avoids some of the shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.
Strictly lexical dependency parsing
- In Proc. IWPT, 2005
Cited by 12 (4 self)
We present a strictly lexical parsing model where all the parameters are based on the words. This model does not rely on part-of-speech tags or grammatical categories. It maximizes the conditional probability of the parse tree given the sentence. This is in contrast with most previous models that compute the joint probability of the parse tree and the sentence. Although the maximization of joint and conditional probabilities are theoretically equivalent, the conditional model allows us to use distributional word similarity to generalize the observed frequency counts in the training corpus. Our experiments with the Chinese Treebank show that the accuracy of the conditional model is 13.6% higher than the joint model and that the strictly lexicalized conditional model outperforms the corresponding unlexicalized model based on part-of-speech tags.
Customizable Modular Lexicalized Parsing
- In Proc. of the 6th International Workshop on Parsing Technology, IWPT2000, 2000
Cited by 11 (9 self)
Different NLP applications have different efficiency constraints (i.e. quality of the results and throughput) that reflect on each core linguistic component. Syntactic processors are basic modules in some NLP application. A customization that permits the performance control of these components enables their reuse in different application scenarios. Throughput has been commonly improved using partial syntactic processors. On the other hand, specialized lexicons are generally employed to improve the quality of the syntactic material produced by specific parsing (sub)process (e.g. verb argument detection or PP-attachment disambiguation). Building upon the idea of grammar stratification, in this paper a method to push modularity and lexical sensitivity, in parsing, in view of customizable syntactic analysers is presented. A framework for modular parser design is proposed and its main properties are discussed.
Dependency parsing with an extended finite-state approach
- Computational Linguistics, 2003
Cited by 10 (2 self)
This article presents a dependency parsing scheme using an extended finite-state approach. The parser augments input representation with “channels” so that links representing syntactic dependency relations among words can be accommodated and iterates on the input a number of times to arrive at a fixed point. Intermediate configurations violating various constraints of projective dependency representations such as no crossing links and no independent items except sentential head are filtered via finite-state filters. We have applied the parser to dependency parsing of Turkish.
Automatic Transformation of Phrase Treebanks to Dependency Trees
- In Proceedings LREC-04, 2004
Cited by 10 (3 self)
Word-to-word dependency structures are useful for consistent representation and comparable evaluation of parsing results. However, most large-scale treebanks contain various variants of phrase structure trees, since automatic parsers usually produce constituent structures. We present a freely available extensible tool for converting phrase structure to dependencies automatically, and discuss its application to the NEGRA treebank of German.
Modifying Existing Annotated Corpora for General Comparative Evaluation of Parsing
- In Proceedings of the LRE Workshop on Evaluation of Parsing Systems, 1998
Cited by 8 (2 self)
We argue that the current dominant paradigm in parser evaluation work, which combines use of the Penn Treebank reference corpus and of the Parseval scoring metrics, is not well-suited to the task of general comparative evaluation of diverse parsing systems. In (Gaizauskas et al., 1998), we propose an alternative approach which has two key components. Firstly, we propose parsed corpora for testing that are much flatter than those currently used, whose "gold standard" parses encode only those grammatical constituents upon which there is broad agreement across a range of grammatical theories. Secondly, we propose modified evaluation metrics that require parser outputs to be "faithful to", rather than mimic, the broadly agreed structure encoded in the flatter gold standard analyses. This paper addresses a crucial issue for the (Gaizauskas et al., 1998) approach, namely, the creation of the evaluation resources that the approach requires, i.e. annotated corpora recording the flatter parse a...

