Results 1 - 4 of 4
Automatic extraction of subcategorization from corpora
- In Proceedings of the 5th ACL Conference on Applied Natural Language Processing, 1997
Cited by 176 (7 self)
We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.
Developing and evaluating a probabilistic LR parser of part-of-speech and punctuation labels
- In Proceedings of the 4th ACL/SIGPARSE International Workshop on Parsing Technologies, 1995
Cited by 52 (9 self)
We describe an approach to robust domain-independent syntactic parsing of unrestricted naturally-occurring (English) input. The technique involves parsing sequences of part-of-speech and punctuation labels using a unification-based grammar coupled with a probabilistic LR parser. We describe the coverage of several corpora using this grammar and report the results of a parsing experiment using probabilities derived from bracketed training data. We report the first substantial experiments to assess the contribution of punctuation to deriving an accurate syntactic analysis, by parsing identical texts both with and without naturally-occurring punctuation marks.
Automatic Extraction of Subcategorization Frames from Corpora -- Improving Filtering with Diathesis Alternations
- In Proceedings of the ESSLLI 98, 1998
Cited by 4 (1 self)
Attempts to extract subcategorization information from textual corpora by shallow parsing followed by statistical filtering of alternatives proposed for specific predicates have met with some success (Briscoe & Carroll, 1997) but are not yet accurate enough. Examination of the errors suggests that the filtering of spurious hypotheses is the source of most errors in the system. This paper builds on the framework described in Briscoe & Carroll (1997) and proposes a knowledge-based approach to improving the filtering phase of the system.
Acquiring Subcategorisation from Textual Corpora
- 1997
Cited by 3 (1 self)
Manual development of large subcategorised lexicons has proved impossible because predicates change behaviour between sublanguages, domains and across time. Yet current parsers depend crucially on such information, and probabilistic parsers would greatly benefit from accurate information concerning the relative likelihood of different subcategorisation frames of a given predicate. This project suggests that automatic construction of a subcategorisation dictionary from textual corpora is a more promising method. The work undertaken builds upon a recent system which extracts subcategorisation information from textual corpora by shallow parsing followed by statistical filtering of alternatives proposed for specific predicates. This system has met with some success but is not accurate enough (Briscoe & Carroll 1997). Examination of the errors suggests that the filtering of spurious hypotheses is the source of most errors in the system. The goal of this project is to construct ...

