Query Independent Sentence Scoring approach to DUC 2006
- In Document Understanding Conference, 2006
Multi-document Summarization Using Support Vector Regression
Cited by 7 (0 self)
Most multi-document summarization systems follow the extractive framework based on various features. As increasingly sophisticated features are designed, combining them sensibly becomes a challenge. Usually the features are combined by a linear function whose weights are tuned manually. In this task, a Support Vector Regression (SVR) model is used to combine the features and score the sentences automatically. Two important problems are inevitably involved. The first is how to acquire the training data; several automatic generation methods are introduced, based on the standard reference summaries written by humans. The second is feature selection, where various features are picked out and combined into different feature sets to be tested. With the aid of the DUC 2005 and 2006 data sets, comprehensive experiments are conducted over various SVR kernels and feature sets. The trained SVR model is then used in the main task of DUC 2007 to produce the extractive summaries.
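One way to picture the automatic generation of training data from reference summaries is the sketch below. It is only an illustration under stated assumptions: the unigram-overlap labeling rule and the names `tokenize`, `overlap_score`, `build_training_set`, and `feature_fn` are hypothetical and are not taken from the paper, which does not specify its generation methods in this abstract.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def overlap_score(sentence, reference_summaries):
    """Hypothetical target value: the fraction of the sentence's tokens that
    also occur in a reference summary, averaged over all references."""
    sent_counts = Counter(tokenize(sentence))
    total = sum(sent_counts.values())
    if total == 0 or not reference_summaries:
        return 0.0
    scores = []
    for ref in reference_summaries:
        ref_counts = Counter(tokenize(ref))
        # Clip each token's count by its count in the reference.
        matched = sum(min(c, ref_counts[t]) for t, c in sent_counts.items())
        scores.append(matched / total)
    return sum(scores) / len(scores)


def build_training_set(sentences, reference_summaries, feature_fn):
    """Pair each sentence's feature vector with its overlap-based target.
    feature_fn is an assumed callable mapping a sentence to a list of floats."""
    return [(feature_fn(s), overlap_score(s, reference_summaries))
            for s in sentences]
```

The resulting (features, target) pairs could then be passed to any SVR implementation; the regressor itself is deliberately left out of the sketch.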
Multi-topic based query-oriented summarization
- SIAM International Conference on Data Mining, 2009
Cited by 4 (0 self)
Query-oriented summarization aims at extracting an informative summary from a document collection for a given query. It helps users grasp the main information related to a query. Existing work falls mainly into two categories: supervised methods and unsupervised methods. The former require training examples, which limits them to predefined domains. The latter usually use clustering algorithms to find 'centered' sentences as the summary. However, they do not consider the query information, so the summary is general to the document collection itself. Moreover, most existing work assumes that the documents related to the query discuss only one topic. Unfortunately, statistics show that a large portion of summarization tasks involve multiple topics. In this paper, we try to overcome the limitations of the existing methods and study a new setup of the problem: multi-topic based query-oriented summarization. We propose a probabilistic approach to solve this problem. More specifically, we propose two strategies to incorporate the query information into a probabilistic model. Experimental results on two different genres of data show that our proposed approach can effectively extract a multi-topic summary from a document collection and that its summarization performance is better than that of the baseline methods. The approach is quite general and can be applied to many other mining tasks, for example product opinion analysis and question answering.
Summarizing Relevant Information for Question-Answering Using Hybrid Relevance Analysis and Surface Feature Salience
Much research on question answering aims to answer factoid, definitional and biographical questions. In most cases, the answers are given as a name, date, quantity, and so on. In this paper, we try to merge techniques from multi-document summarization and question answering to generate a brief, well-organized, fluent summary that provides more relevant information for answering complicated real-world questions. The problem is addressed as a query-biased sentence retrieval task. We propose a hybrid relevance analysis to evaluate the relevance of a sentence to the query. The summary is created by including the sentences with the highest significance, measured in terms of sentence relevance and surface feature salience. In addition, a modified Maximal Marginal Relevance is proposed to reduce redundancy. The proposed approach was evaluated on the DUC 2005 corpus and achieved competitive results. Keywords: Query-focused summarization; Hybrid relevance analysis; Sentence feature salience; Latent semantic analysis; Modified Maximal Marginal Relevance
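For readers unfamiliar with Maximal Marginal Relevance, the sketch below shows the standard greedy form of the criterion, not the modified variant this paper proposes (the abstract does not describe the modification). The bag-of-words cosine similarity, the trade-off parameter `lam`, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mmr_select(sentences, query, k, lam=0.7):
    """Greedy standard MMR: pick up to k sentences, trading off relevance to
    the query against similarity to sentences that were already selected."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    relevance = [cosine(v, q) for v in vectors]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any selected sentence.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam` close to 1 the selection behaves like plain relevance ranking; lower values penalize near-duplicate sentences more strongly.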
Language Model Passage Retrieval for Question-Oriented Multi Document Summarization
Question-oriented text summarization aims to produce a short, informative description in response to the given queries. This is somewhat similar to the goal of question answering, which retrieves exact answers from large raw text collections. In this paper, we present a resource-free and training-data-free summarization model for the DUC multi-document summarization task. As in last year's system, our method simplifies the two-pass retrieval into a passage retrieval task. First, a top-down clustering algorithm is used to merge similar passages into a set of groups. Then the passage retriever extracts the groups relevant to the given query. Finally, a maximizing scorer is used to re-form the sentences into the final summary. This is our second time participating in DUC. Although the results of our system are not comparable with those of most top-performing methods, its lightweight, rule-free techniques encourage us to improve it further by integrating rich sources.
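The abstract does not give the retrieval formula, so the following is only a generic sketch of language-model passage retrieval: passages are ranked by unigram query likelihood with Jelinek-Mercer smoothing. The smoothing choice, the weight `lam`, and the function names are assumptions and should not be read as the authors' method.

```python
import math
from collections import Counter


def query_log_likelihood(query, passage, collection_counts, collection_len, lam=0.5):
    """log P(query | passage) under a unigram model, interpolating the
    passage model with the collection model (Jelinek-Mercer smoothing)."""
    passage_counts = Counter(passage.lower().split())
    passage_len = sum(passage_counts.values())
    score = 0.0
    for term in query.lower().split():
        p_passage = passage_counts[term] / passage_len if passage_len else 0.0
        p_collection = collection_counts[term] / collection_len if collection_len else 0.0
        p = lam * p_passage + (1 - lam) * p_collection
        if p == 0.0:
            # Term absent from the whole collection: it cannot separate
            # passages, so it is skipped rather than scored as log(0).
            continue
        score += math.log(p)
    return score


def rank_passages(query, passages, lam=0.5):
    """Return the passages sorted from most to least likely to generate the query."""
    collection_counts = Counter(t for p in passages for t in p.lower().split())
    collection_len = sum(collection_counts.values())
    scored = [
        (query_log_likelihood(query, p, collection_counts, collection_len, lam), p)
        for p in passages
    ]
    return [p for _, p in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

In a pipeline like the one described, the same scoring could be applied to clustered groups of passages instead of single passages.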
Query Focus Guided Sentence Selection Strategy for DUC 2006
This paper presents our new query-based multi-document summarization system for DUC 2006. It is an extended version of a previously developed generic multi-document summarization system (namely PoluS 1.0), which incorporates latent semantic analysis (LSA) technology. To make the generated summaries satisfy the user's information need as fully as possible, we propose a query focus guided sentence selection strategy. The evaluation results show that our system ranks in the middle among the 34 submitted systems. Although there is still room to improve the current version of PoluS, it provides a good framework for our future research on multi-document summarization.

