Method Combination For Document Filtering (1996)
Citations: 45 (1 self)
BibTeX
@MISC{Hull96methodcombination,
author = {David A. Hull and Jan O. Pedersen and Hinrich Schütze},
title = {Method Combination For Document Filtering},
year = {1996}
}
Abstract
There is strong empirical and theoretic evidence that combination of retrieval methods can improve performance. In this paper, we systematically compare combination strategies in the context of document filtering, using queries from the Tipster reference corpus. We find that simple averaging strategies do indeed improve performance, but that direct averaging of probability estimates is not the correct approach. Instead, the probability estimates must be renormalized using logistic regression on the known relevance judgements. We examine more complex combination strategies but find them less successful due to the high correlations among our filtering methods which are optimized over the same training data and employ similar document representations.

1 Introduction

A text filtering system monitors an incoming document stream and selects documents identified as relevant to one or more of its query profiles. If profile interactions are ignored, this reduces to a number of independent bina...
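The renormalization idea in the abstract (recalibrate each method's probability estimates with logistic regression on known relevance judgements, then average) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional model fitted on the log-odds of the raw estimate, the plain gradient-descent fit, the function names, and the made-up toy scores and labels.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(p, eps=1e-6):
    # Log-odds of a probability, clipped away from 0 and 1.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def fit_calibrator(raw_probs, relevant, lr=0.1, steps=2000):
    """Fit P(relevant) = sigmoid(a + b * logit(raw)) by gradient descent
    on the mean log loss. Hypothetical helper, not the paper's procedure."""
    xs = [logit(p) for p in raw_probs]
    a, b = 0.0, 1.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, relevant):
            err = sigmoid(a + b * x) - y
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(raw, params):
    # Map one raw estimate to a renormalized probability.
    a, b = params
    return sigmoid(a + b * logit(raw))

def combine(raw_estimates, calibrators):
    # Average the renormalized probabilities, one per filtering method.
    calibrated = [calibrate(p, c) for p, c in zip(raw_estimates, calibrators)]
    return sum(calibrated) / len(calibrated)

# Made-up training data: raw estimates from two hypothetical methods on the
# same six documents, plus relevance judgements (1 = relevant).
method_a = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
method_b = [0.6, 0.55, 0.52, 0.5, 0.45, 0.4]
labels = [1, 1, 0, 1, 0, 0]

cal_a = fit_calibrator(method_a, labels)
cal_b = fit_calibrator(method_b, labels)

# Combined score for a new document that the two methods scored 0.85 and 0.58.
score_high = combine([0.85, 0.58], [cal_a, cal_b])
score_low = combine([0.25, 0.42], [cal_a, cal_b])
```

In this toy data the two methods report estimates on very different scales, which is the situation where averaging raw values would let one method dominate. Fitting each calibrator against the same relevance judgements puts both on a comparable probability scale before the average is taken.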







