Results 1 - 9 of 9
Summarizing Scientific Articles - Experiments with Relevance and Rhetorical Status
- Computational Linguistics, 2002
Cited by 103 (2 self)
In this paper we argue that scientific articles require a different summarization strategy than, for instance, news articles. We propose a strategy which concentrates on the rhetorical status of statements in the article: Material for summaries is selected in such a way that summaries can highlight the new contribution of the source paper and situate it with respect to earlier work. We provide a gold standard for summaries of this kind consisting of a substantial corpus of conference articles in computational linguistics with human judgements of rhetorical status and relevance. We present several experiments measuring our judges' agreement on these annotations. We also present an algorithm which, on the basis of the annotated training material, selects content and classifies it into a fixed set of seven rhetorical categories. The output of this extraction and classification system can be viewed as a single-document summary in its own right; alternatively, it can be used to generate task-oriented and user-tailored summaries designed to give users an overview of a scientific field.
LT TTT - A Flexible Tokenisation Tool
- In Proceedings of Second International Conference on Language Resources and Evaluation, 2000
Cited by 35 (4 self)
We describe LT TTT, a recently developed software system which provides tools to perform text tokenisation and mark-up. The system includes ready-made components to segment text into paragraphs, sentences, words and other kinds of token but, crucially, it also allows users to tailor rule-sets to produce mark-up appropriate for particular applications. We present three case studies of our use of LT TTT: named-entity recognition (MUC-7), citation recognition and mark-up and the preparation of a corpus in the medical domain. We conclude with a discussion of the use of browsers to visualise marked-up text.
1. Introduction
The LTG's Text Tokenisation Toolkit (LT TTT, Grover et al., 1999) was developed within an XML processing paradigm whereby tools are combined together in a pipeline allowing each to add, modify or remove some piece of mark-up. The tools are compatible with the LT XML toolset (Thompson et al., 1997) and use the LT XML API to manipulate attribute values and character data ...
Summarising Scientific Articles - Experiments with Relevance and Rhetorical Status
- Computational Linguistics
Cited by 17 (3 self)
Machine (COLING94), S.Tojo 28 9411023 Abstract Generation Based on Rhetorical Structure Extraction (COLING94), K.Ono et al.
What's yours and what's mine: Determining Intellectual Attribution in Scientific Text
Cited by 9 (0 self)
We believe that identifying the structure of scientific argumentation in articles can help in tasks such as automatic summarization or the automated construction of citation indexes. One particularly important aspect of this structure is the question of who a given scientific statement is attributed to: other researchers, the field in general, or the authors themselves.
Task-Based Evaluation of Summary Quality: Describing Relationships between Scientific Papers
- In Workshop Automatic Summarization, NAACL, 2001
Cited by 7 (1 self)
We present a novel method for task-based evaluation of summaries of scientific articles. The task we propose is a question-answering task, where the questions are about the relatedness of the current paper to prior research. This evaluation method is time-efficient with respect to material preparation and data collection, so that it is possible to test against many different baselines, something that is not usually feasible in evaluations by relevance decision. We use this methodology to evaluate the quality of summaries our system produces. These summaries are designed to describe the contribution of a scientific article in relation to other work. The results show that this type of summary is indeed more useful than the baselines (random sentences, keyword lists and generic author-written summaries), and nearly as useful as the full texts.
Learning Syntactic Structures with XML
- 2000
Cited by 6 (0 self)
In this paper we want to show that XML format not only offers a good framework to annotate texts, but also provides a good formalism and tools in order to learn (syntactic) structures
ALLiS: a Symbolic Learning System for Natural Language Learning
- 2000
Cited by 4 (0 self)
this article the adequacy between theory refinement and Natural Language Learning. For a more detailed presentation of TR, we refer the reader to (Abecker and Schmid, 1996), (Brunk, 1996). (Mooney, 1993) defines it as: Theory refinement systems developed in Machine Learning automatically modify a Knowledge Base to render it consistent with a set of classified training examples
Personalizing Retrieval of Journal Articles for Patient Care
- 2001
this paper and other work in the context of PERSIVAL, we collected a corpus of 29,784 medical articles in full text, either from the web with an automated crawler or via a licensing agreement with Ovid Technologies. The articles appeared in HTML format; we transformed them into XML using a pipeline we developed on the basis of publicly available XML tools. The corpus contains articles from 20 journals in cardiology from 1993 to 2000, comprising roughly 85 million word tokens (cf. Figure 2)
Personalized Search of the Medical Literature:
- 2003
We describe a system for personalizing a set of medical journal articles (possibly created as the output of a search engine) by selecting those documents that specifically match a patient under care. Key element in our approach is the use of targeted parts of the electronic patient record to serve as a readily available user model for the personalization task. We discuss several enhancements to a TF*IDF based approach for measuring the similarity between articles and the patient record. We also present the results of an experiment involving almost 3,000 relevance judgments by medical doctors. Our evaluation establishes that the automated system surpasses in performance alternative methods for personalizing the set of articles, including keyword-based queries manually constructed by medical experts for this purpose.

