Using Lexical Chains for Text Summarization
, 1997
Cited by 276 (7 self)
We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, but instead relying on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several robust knowledge sources: the WordNet thesaurus, a part-of-speech tagger and shallow parser for the identification of nominal groups, and a segmentation algorithm derived from (Hearst, 1994). Summarization proceeds in three steps: the original text is first segmented, lexical chains are constructed, strong chains are identified and significant sentences are extracted from the text. We present in this paper empirical results on the identification of strong chains and of significant sentences.
Summarizing Text Documents: Sentence Selection and Evaluation Metrics
- In Research and Development in Information Retrieval
, 1999
Cited by 156 (5 self)
Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for inclusion in a summary. This paper presents our analysis of news-article summaries generated by sentence selection. Sentences are ranked for potential inclusion in the summary using a weighted combination of statistical and linguistic features. The statistical features were adapted from standard IR methods. The potential linguistic ones were derived from an analysis of news-wire summaries. To evaluate these features, we use a normalized version of precision-recall curves, with a baseline of random sentence selection, and we analyze the properties of such a baseline. We illustrate our discussions with empirical results showing the importance of corpus-dependent baseline summarization standards, compression ratios and carefully crafted long queries.
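The ranking idea in this abstract (a weighted combination of per-sentence features) can be illustrated with a minimal sketch. This is not the paper's feature set or weighting: the function name rank_sentences, the two features (average word frequency as a stand-in statistical feature, sentence position as a stand-in linguistic cue) and the default weights are all assumptions made for illustration.

```python
import re
from collections import Counter


def rank_sentences(sentences, weights=(0.7, 0.3)):
    """Return sentence indices ordered from highest to lowest score.

    Both features and the weights are hypothetical placeholders,
    not the features used in the paper.
    """
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    # Word counts over the whole document.
    tf = Counter(tok for toks in tokenized for tok in toks)
    n = len(sentences)
    scores = []
    for i, toks in enumerate(tokenized):
        # Stand-in statistical feature: mean document-level count of the words.
        stat = sum(tf[t] for t in toks) / len(toks) if toks else 0.0
        # Stand-in positional cue: earlier sentences receive a higher value.
        pos = 1.0 - i / n
        scores.append(weights[0] * stat + weights[1] * pos)
    # Python's sort is stable, so ties keep document order.
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    doc = ["cat cat cat", "dog", "cat dog"]
    print(rank_sentences(doc))
```

A summary of k sentences would then take the first k indices and restore document order before output. Real systems would replace the two placeholder features with tuned ones and fit the weights on annotated data.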
Information Fusion in the Context of Multi-Document Summarization
- IN PROCEEDINGS OF THE 37TH ANNUAL MEETING OF THE ACL
, 1999
Cited by 107 (16 self)
We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its usage of language generation to reformulate the wording of the summary.
Summarization Evaluation Methods: Experiments and Analysis
- IN AAAI SYMPOSIUM ON INTELLIGENT SUMMARIZATION
, 1998
Cited by 77 (8 self)
Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary and evaluation of how well summaries help a person perform in a task such as information retrieval. We carried out two large experiments to study the two evaluation methods. Our results show that different parameters of an experiment can dramatically affect how well a system scores. For example, summary length was found to affect both types of evaluations. For the "ideal" summary based evaluation, accuracy decreases as summary length increases, while for task based evaluations summary length and accuracy on an information retrieval task appear to correlate randomly. In this paper, we show how this parameter and others can affect evaluation results and describe how parameters can be controlled to produce a sound evaluation. Motivation The evaluation of an NLP system is a key part of any research or development effort and yet it is probably also the most controve...
SIMFINDER: A Flexible Clustering Tool for Summarization
- IN PROCEEDINGS OF THE NAACL WORKSHOP ON AUTOMATIC SUMMARIZATION
, 2001
Cited by 62 (11 self)
We present a statistical similarity measuring and clustering tool, SIMFINDER, that organizes small pieces of text from one or multiple documents into tight clusters. By placing highly related text units in the same cluster, SIMFINDER enables a subsequent content selection/generation component to reduce each cluster to a single sentence, either by extraction or by reformulation. We report on improvements in the similarity and clustering components of SIMFINDER, including a quantitative evaluation, and establish the generality of the approach by interfacing SIMFINDER to two very different summarization systems.
Towards Multidocument Summarization by Reformulation: Progress and Prospects
- IN PROCEEDINGS OF AAAI-99
, 1999
Cited by 57 (14 self)
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We are developing a multidocument summarization system to automatically generate a concise summary by identifying and synthesizing similarities across a set of related documents. Our approach is unique in its integration of machine learning and statistical techniques to identify similar paragraphs, intersection of similar phrases within paragraphs, and language generation to reformulate the wording of the summary. Our evaluation of system components shows that learning over multiple extracted linguistic features is more effective than information retrieval approaches at identifying similar text units for summarization and that it is possible to generate a fluent summary that conveys similarities among documents even when full semantic interpretations of the input text are not available.
Multi-Document Summarization By Sentence Extraction
- In Proceedings of the ANLP/NAACL Workshop on Automatic Summarization
, 2000
Cited by 54 (0 self)
This paper discusses a text extraction approach to multidocument summarization that builds on single-document summarization methods by using additional, available information about the document set as a whole and the relationships between the documents. Multi-document summarization differs from single in that the issues of compression, speed, redundancy and passage selection are critical in the formation of useful summaries.
An Annotation Scheme for Discourse-Level Argumentation in Research Articles
- In Proceedings of the 8th Meeting of the European Chapter of the Association for Computational Linguistics (EACL-99)
, 1999
Cited by 44 (9 self)
In order to build robust automatic abstracting systems, there is a need for better training resources than are currently available. In this paper, we introduce an annotation scheme for scientific articles which can be used to build such a resource in a consistent way. The seven categories of the scheme are based on rhetorical moves of argumentation.
A Common Theory of Information Fusion from Multiple Text Sources Step One: Cross-Document Structure
, 2000
Cited by 41 (11 self)
We introduce CST (cross-document structure theory), a paradigm for multi-document analysis. CST takes into account the rhetorical structure of clusters of related textual documents. We present a taxonomy of cross-document relationships. We argue that CST can be the basis for multi-document summarization guided by user preferences for summary length, information provenance, cross-source agreement, and chronological ordering of facts.

