The Proposition Bank: An Annotated Corpus of Semantic Roles
- Computational Linguistics, 2005
- Cited by 256 (8 self)
The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated. We discuss the criteria used to define the sets of semantic roles used in the annotation process and to analyze the frequency of syntactic/semantic alternations in the corpus. We describe an automatic system for semantic role tagging trained on the corpus and discuss the effect on its performance of various types of information, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty "trace" categories of the treebank.
From treebank to propbank
- In Language Resources and Evaluation, 2002
- Cited by 164 (8 self)
This paper describes our approach to the development of a Proposition Bank, which involves the addition of semantic information to the Penn English Treebank. Our primary goal is the labeling of syntactic nodes with specific argument labels that preserve the similarity of roles such as the window in "John broke the window" and "the window broke". After motivating the need for explicit predicate argument structure labels, we briefly discuss the theoretical considerations of predicate argument structure and the need to maintain consistency across syntactic alternations. The issues of consistency of argument structure across both polysemous and synonymous verbs are also discussed and we present our actual guidelines for these types of phenomena, along with numerous examples of tagged sentences and verb frames. Metaframes are introduced as a technique for handling similar frames among near-synonymous verbs. We conclude with a summary of the current status of the annotation process.
Class-Based Construction of a Verb Lexicon
- 2000
- Cited by 115 (8 self)
We present an approach to building a verb lexicon compatible with WordNet but with explicitly stated syntactic and semantic information, using Levin verb classes to systematically construct lexical entries. By using verb classes we capture generalizations about verb behavior and reduce the effort needed to construct the lexicon. The syntactic frames for the verb classes are represented by a Lexicalized Tree Adjoining Grammar augmented with semantic predicates, which allows a compositional interpretation.
Introduction: Despite many different approaches to lexicon development (Pustejovsky 1991), (Copestake & Sanfilippo 1993), (Lowe, Baker, & Fillmore 1997), (Dorr 1997), the field of Natural Language Processing (NLP) has yet to develop a clear consensus on guidelines for computational verb lexicons, which has severely limited their utility in NLP applications. Many approaches make no attempt to associate the semantics of a verb with its possible syntactic frames. Others list too...
Automatic Verb Classification Based on Statistical Distributions of Argument Structure
- Computational Linguistics, 2001
- Cited by 79 (15 self)
In this paper, we focus on argument structure--the thematic roles assigned by a verb to its arguments--as the way in which the relational semantics of the verb is represented at the syntactic level.
Subcategorization Acquisition
- 2002
- Cited by 64 (13 self)
Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likelihood of different subcategorisation frames (scfs) of a given predicate. Acquisition of subcategorization lexicons from textual corpora has recently become increasingly popular. Although this work has met with some success, resulting lexicons indicate a need for greater accuracy. One significant source of error lies in the statistical filtering used for hypothesis selection, i.e. for removing noise from automatically acquired scfs. This thesis builds on earlier work in verbal subcategorization acquisition, taking as a starting point the problem with statistical filtering. Our investigation shows that statistical filters tend to work poorly because not only is the underlying distribution Zipfian, but there is also very little correlation between conditional distribution of
Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations
- 1999
- Cited by 44 (2 self)
This paper examines the extent to which verb diathesis alternations are empirically attested in corpus data. We automatically acquire alternating verbs from large balanced corpora by using partial-parsing methods and taxonomic information, and discuss how corpus data can be used to quantify linguistic generalizations. We estimate the productivity of an alternation and the typicality of its members using type and token frequencies.
Using Subcategorization to Resolve Verb Class Ambiguity
- Joint SIGDAT Conference on Empirical Methods in NLP and Very Large Corpora, 1999
- Cited by 35 (4 self)
Levin's (1993) taxonomy of verbs and their classes is a widely used resource for lexical semantics. In her framework, some verbs, such as give, exhibit no class ambiguity. But other verbs, such as write, can inhabit more than one class. In some of these ambiguous cases the appropriate class for a particular token of a verb is immediately obvious from inspection of the surrounding context. In others it is not, and an application which wants to recover this information will be forced to rely on some more or less elaborate process of inference. We present a simple statistical model of verb class ambiguity and show how it can be used to carry out such inference.
Verb Class Disambiguation Using Informative Priors
- Computational Linguistics, 2004
- Cited by 29 (4 self)
Levin’s (1993) study of verb classes is a widely used resource for lexical semantics. In her framework, some verbs, such as give, exhibit no class ambiguity. But other verbs, such as write, have several alternative classes. We extend Levin’s inventory to a simple statistical model of verb class ambiguity. Using this model we are able to generate preferences for ambiguous verbs without the use of a disambiguated corpus. We additionally show that these preferences are useful as priors for a verb sense disambiguator.
Clustering Polysemic Subcategorization Frame Distributions Semantically
- In Proc. of the 41st Annual Meeting of the Association for Computational Linguistics, 2003
- Cited by 23 (7 self)
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data.
Automatic Verb Classification Using Distributions of Grammatical Features
- 1999
- Cited by 23 (4 self)
We apply machine learning techniques to classify automatically a set of verbs into lexical semantic classes, based on distributional approximations of diatheses, extracted from a very large annotated corpus. Distributions of four grammatical features are sufficient to reduce error rate by 50% over chance. We conclude that corpus data is a usable repository of verb class information, and that corpus-driven extraction of grammatical features is a promising methodology for automatic lexical acquisition.

