On the Development of the RST Spanish Treebank
In this article we present the RST Spanish Treebank, the first corpus annotated with rhetorical relations for this language. We describe the characteristics of the corpus, the annotation criteria, the annotation procedure, the inter-annotator agreement, and other related aspects. Moreover, we show the interface that we have developed to carry out searches over the annotated texts of the corpus.
Multi-Layer Discourse Annotation of a Dutch Text Corpus
We have compiled a corpus of 80 Dutch texts from expository and persuasive genres, which we annotated for rhetorical and genre-specific discourse structure and for lexical cohesion, with the goal of creating a gold standard for further research. The annotations are based on a segmentation of the text into elementary discourse units that takes into account cues from syntax and punctuation. During the labor-intensive discourse-structure annotation (RST analysis), we took great care to thoroughly reconcile the initial analyses. That process and the availability of two independent initial analyses for each text allow us to analyze our disagreements and to assess the confusability of RST relations, and thereby to improve the annotation guidelines and gather evidence for the classification of these relations into larger groups. We are using this resource for corpus-based studies of discourse relations, discourse markers, cohesion, and genre differences, e.g., the question of how discourse structure and lexical cohesion interact for different genres in the overall organization of texts. We are also exploring automatic text segmentation and semi-automatic discourse annotation.
The contribution of nonveridical rhetorical relations to evaluation
Language Sciences (available at SciVerse ScienceDirect), journal homepage: www.elsevier.com/locate/langsci
Learning the Structure of Task-Oriented Conversations from the Corpus of In-Domain Dialogs
, 2008
Some Reflections on the Task of Content Determination in the Context of Multi-Document Summarization of Evolving Events
, 710
Despite its importance, the task of summarizing evolving events has received little attention from researchers in the field of Multi-document Summarization. In a previous paper [5] we presented a methodology for the automatic summarization of documents, emitted by multiple sources, which describe the evolution of an event. At the heart of this methodology lies the identification of similarities and differences between the various documents along two axes: the synchronic and the diachronic. This is achieved by introducing the notion of Synchronic and Diachronic Relations. Those relations connect the messages that are found in the documents, thus resulting in a graph which we call a grid. Although the creation of the grid completes the Document Planning phase of a typical NLG architecture, the number of messages contained in a grid can be very large, thus exceeding the required compression rate. In this paper we provide some initial thoughts on a probabilistic model which can be applied at the Content Determination stage, and which tries to alleviate this problem.
Rhetorical relations in multimodal documents
We present a corpus-based study of coherence in multimodal documents. We concern ourselves with the types of relationships between graphs and tables and the text of the document in which they appear. In order to understand and categorize the types of relations across modalities, we are making use of Rhetorical Structure Theory (Mann & Thompson, 1988), and propose that RST can adequately describe these types of relations. We analyzed a corpus comprising three different genres, and consisting of about 1,500 pages of material and almost 600 figures, tables and graphs. We show that figures stand in both presentational and subject matter relations to the text, and that the relationship between figures and text is one of a small set out of the larger possible rhetorical relations. We also discuss several issues that arise in the treatment of multimodal material, such as the potential for multiple connections between figure and text.
Discourse markers and coherence relations: Comparison across markers, languages and modalities
Maite Taboada and María de los Ángeles Gómez-González
We examine how one particular coherence relation, Concession, is marked across languages and modalities, through an extensive analysis of the Concession relation, examining the types of discourse markers used to signal it. The analysis is contrastive from three different angles: markers, languages and modalities. We compare different markers within the same language (but, although, however, etc.), and two languages (English and Spanish). We aim to provide a contrastive methodology that can be applied to any language, given that it has as a starting point the abstract notion of coherence relations, which we believe are similar across languages. Finally, we compare two modalities: spoken and written language. In the analysis, we find that the contexts in which concessive relations are used are similar across languages, but that there are clear differences in the two modalities or genres. In the spoken genre, the most common function of concession is to correct misunderstandings and contrast situations. In the written genre, on the other hand, concession is most often used to qualify opinions.

