Annotating Discourse Connectives and Their Arguments
In Proceedings of the HLT/NAACL Workshop on Frontiers in Corpus Annotation, 2004
Cited by 35 (9 self)
This paper describes a new, large-scale discourse-level annotation project, the Penn Discourse TreeBank (PDTB). We present an approach to annotating a level of discourse structure that is based on identifying discourse connectives and their arguments. The PDTB is being built directly on top of the Penn TreeBank and PropBank, thus supporting the extraction of useful syntactic and semantic features and providing a richer substrate for the development and evaluation of practical algorithms.
The Rhetorical Parsing of Unrestricted Texts: A Surface-based Approach
Computational Linguistics, 2000
Cited by 30 (0 self)
This paper explores the extent to which well-formed rhetorical structures can be automatically derived by means of surface-form-based algorithms. These algorithms identify discourse usages of cue phrases and break sentences into clauses, hypothesize rhetorical relations that hold among textual units, and produce valid rhetorical structure trees for unrestricted natural language texts. The algorithms are empirically grounded in a corpus analysis of cue phrases and rely on a first-order formalization of rhetorical structure trees.
D-LTAG System: Discourse Parsing with a Lexicalized Tree Adjoining Grammar
Journal of Logic, Language and Information, 2002
Cited by 28 (9 self)
We present an implementation of a discourse parsing system for a lexicalized Tree-Adjoining Grammar for discourse, specifying the integration of sentence and discourse level processing. Our system is based on the assumption that the compositional aspects of semantics at the discourse-level parallel those at the sentence-level. This coupling is achieved by factoring away inferential semantics and anaphoric features of discourse connectives. Computationally, this parallelism is achieved because both the sentence and discourse grammar are LTAG-based and the same parser works at both levels. The approach to an LTAG for discourse has been developed by Webber et al. in some recent papers ([33], [35], among others). Our system takes a discourse as input, parses the sentences individually, extracts the basic discourse constituent units from the sentence derivations, and reparses the discourse with reference to the discourse grammar while using the same parser used at the sentence-level.
Representing discourse coherence: A corpus-based study
Computational Linguistics, 2005
Cited by 13 (0 self)
This article aims to present a set of discourse structure relations that are easy to code and to develop criteria for an appropriate data structure for representing these relations. Discourse structure here refers to informational relations that hold between sentences in a discourse. The set of discourse relations introduced here is based on Hobbs (1985). We present a method for annotating discourse coherence structures that we used to manually annotate a database of 135 texts from the Wall Street Journal and the AP Newswire. All texts were independently annotated by two annotators. Kappa values of greater than 0.8 indicated good interannotator agreement. We furthermore present evidence that trees are not a descriptively adequate data structure for representing discourse structure: In coherence structures of naturally occurring texts, we found many different kinds of crossed dependencies, as well as many nodes with multiple parents. The claims are supported by statistical results from our hand-annotated database of 135 texts.
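As an illustration of the agreement figure mentioned in this abstract, here is a minimal sketch of a kappa computation for two annotators. The abstract does not say which kappa statistic was used; the sketch assumes Cohen's kappa, and the function name and relation labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    Assumption: the statistic is Cohen's kappa; the abstract above
    only says "Kappa values".
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical relation labels, for illustration only.
annotator_1 = ["cause", "cause", "elaboration", "elaboration"]
annotator_2 = ["cause", "cause", "elaboration", "cause"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

With these made-up labels the annotators agree on three of four items and chance agreement is 0.5, so kappa comes out at 0.5, well below the 0.8 threshold reported in the abstract.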
Annotation and Data Mining of the Penn Discourse TreeBank
In ACL Workshop on Discourse Annotation, 2004
Cited by 12 (7 self)
The Penn Discourse TreeBank (PDTB) is a new resource built on top of the Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of standoff annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), which adds value for both linguistic discovery and discourse modeling. Here we describe the PDTB and some experiments in linguistic discovery based on the PDTB alone, as well as on the linked PTB and PDTB corpora.
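The idea of standoff annotation in this abstract can be sketched as follows: the raw text is left untouched and each annotation layer stores only character offsets into it. This is an illustrative sketch, not the PDTB file format; the class names, field names, and example sentence are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, end) into the raw text."""
    start: int
    end: int


@dataclass(frozen=True)
class ConnectiveAnnotation:
    """Hypothetical record: a connective and the spans of its two arguments."""
    connective: Span
    arg1: Span
    arg2: Span


def span_of(raw, substring):
    """Locate the first occurrence of substring and return it as a Span."""
    start = raw.index(substring)
    return Span(start, start + len(substring))


def text_of(raw, span):
    """Recover the annotated text from the untouched raw string."""
    return raw[span.start:span.end]


# Made-up sentence; the raw text itself is never modified.
raw = "The market fell because investors panicked."
annotation = ConnectiveAnnotation(
    connective=span_of(raw, "because"),
    arg1=span_of(raw, "The market fell"),
    arg2=span_of(raw, "investors panicked"),
)
```

Because every layer points into the same raw text by offset, a syntactic layer and a discourse layer can be stored separately and still be aligned by comparing spans.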
Discourse Semantics of S-Modifying Adverbials
2003
Cited by 10 (2 self)
I wish to thank Bonnie Webber. Without her patience and her seemingly endless depths of insight, I might never have completed this thesis. I am enormously grateful for her guidance. I also owe many thanks to Ellen Prince. She is an intellectual leader at Penn who has helped many, including me, find a way through the jungle of discourse analysis. I am indebted to every professor who has taught me. Special thanks to Robin Clark for being a member of my dissertation committee. I am very lucky to have worked with Aravind Joshi. He is a continual source of knowledge in the DLTAG meetings. The field of computational linguistics has already benefited from his sentence-level work; I fully expect he and Bonnie will produce similarly useful results with DLTAG. Also in DLTAG, Eleni Miltsakaki and Rashmi Prasad, and later Cassandre Creswell and Jason Teeple all provided stimulation and solace. Their great company and great effort on DLTAG projects taught me to appreciate how much can be done when minds work together. I look forward to the chance to work with them in the future. I am also thankful to Martha Palmer, Paul Kingsbury, and Scott Cotton for allowing me to work with them on the PropBank project and supplement both my income and my work in discourse. On a personal note, the Forbes, Finley, and Riley families deserve thanks for giving me love and diversion and balance and talking me through my education. Most of all, thanks to Enrico Riley, for being everything to me.
Locating Topics in Text Processing
In Proceedings of CLIN 99, 1999
Cited by 9 (2 self)
In this paper we are concerned with the location of topics in text processing and the determination of the update unit in looking up topic continuations and topic shifts. Using key elements of the Centering Model of local discourse coherence and empirical evidence from Modern Greek and Japanese, we argue that the appropriate update unit for topic tracking is the sentence in its traditional sense and not the finite clause, thus providing an account for the status of the subordinate clause in the calculation of topic transitions. We bring forth an argument from English, Modern Greek (MG), and Japanese for keeping topic and information structure distinct. We briefly discuss the significance of the current work to automated essay scoring and coreference-based summarization systems.
1 Introduction
This paper is concerned with the issue of identifying the location of topics in text processing. Adopting the framework of the Centering Model, we discuss the importance of defining the appropriat...
The Discourse Anaphoric Properties of Connectives
Cited by 9 (9 self)
Discourse connectives can be analyzed as encoding predicate-argument relations whose arguments derive from the interpretation of discourse units. These arguments can be anaphoric or structural. Although structural arguments can be encoded in a parse tree, anaphoric arguments must be resolved by other means. A study of nine connectives, annotating the location, size, and syntactic type of their arguments, shows connective-specific patterns for each of these features. A preliminary study of inter-annotator consistency shows that it too varies by connective. Results of the corpus study will be used in the development of resolution algorithms for anaphoric connectives.

