Optimizing Ranking Functions: A Connectionist Approach to Adaptive Information Retrieval (1994)

by B T Bartell

Results 1 - 10 of 17

Automatic Combination of Multiple Ranked Retrieval Systems

by Brian Bartell, Garrison Cottrell, Richard Belew, 1994
Cited by 130 (5 self)
Retrieval performance can often be improved significantly by using a number of different retrieval algorithms and combining the results, in contrast to using just a single retrieval algorithm. This is because different retrieval algorithms, or retrieval experts, often emphasize different document and query features when determining relevance and therefore retrieve different sets of documents. However, it is unclear how the different experts are to be combined, in general, to yield a superior overall estimate. We propose a method by which the relevance estimates made by different experts can be automatically combined to result in superior retrieval performance. We apply the method to two expert combination tasks. The applications demonstrate that the method can identify high performance combinations of experts and also is a novel means for determining the combined effectiveness of experts.
1 Introduction. In text retrieval, two heads are definitely better than one. Retrieval performanc...
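
As a rough illustration of combining relevance estimates from several experts, the sketch below ranks documents by a weighted sum of per-expert scores. The linear form, the function name `combine_scores`, the weights, and the toy scores are all assumptions made for illustration; the abstract does not state the combination model or how its parameters are fit.

```python
def combine_scores(expert_scores, weights):
    """Rank documents by a weighted sum of per-expert relevance scores.

    expert_scores: list of dicts mapping doc id -> score, one dict per expert.
    weights: one float per expert (hypothetical values, not fit to any data).
    A document missing from an expert's list contributes nothing for that expert.
    """
    combined = {}
    for scores, w in zip(expert_scores, weights):
        for doc, s in scores.items():
            combined[doc] = combined.get(doc, 0.0) + w * s
    # Descending combined score; ties broken by doc id for determinism.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Toy data: two experts that retrieve partly different documents.
expert_a = {"d1": 0.9, "d2": 0.2, "d3": 0.4}
expert_b = {"d2": 0.8, "d3": 0.3, "d4": 0.7}
ranking = combine_scores([expert_a, expert_b], [0.6, 0.4])
print(ranking)
```

In a trained system the weights would be chosen from relevance judgments rather than set by hand as they are here.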

Predicting the Performance of Linearly Combined IR Systems

by Christopher Vogt, Garrison W. Cottrell - Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Cited by 46 (4 self)
We introduce a new technique for analyzing combination models. The technique allows us to make qualitative conclusions about which IR systems should be combined. We achieve this by using a linear regression to accurately (r² = 0.98) predict the performance of the combined system based on quantitative measurements of individual component systems taken from TREC5. When applied to a linear model (weighted sum of relevance scores), the technique supports several previously suggested hypotheses: one should maximize both the individual systems' performances and the overlap of relevant documents between systems, while minimizing the overlap of nonrelevant documents. It also suggests new conclusions: both systems should distribute scores similarly, but not rank relevant documents similarly. It furthermore suggests that the linear model is only able to exploit a fraction of the benefit possible from combination. The technique is general in nature and capable of pointing out the strengths and...

Relevance score normalization for metasearch

by Mark Montague - 10th Conf. on Information and Knowledge Management (CIKM 2001), Atlanta, GA, 2001
Cited by 30 (1 self)
Given the ranked lists of documents returned by multiple search engines in response to a given query, the problem of metasearch is to combine these lists in a way which optimizes the performance of the combination. This problem can be naturally decomposed into three subproblems: (1) normalizing the relevance scores given by the input systems, (2) estimating relevance scores for unretrieved documents, and (3) combining the newly-acquired scores for each document into one, improved score. Research on the problem of metasearch has historically concentrated on algorithms for combining (normalized) scores. In this paper, we show that the techniques used for normalizing relevance scores and estimating the relevance scores of unretrieved documents can have a significant effect on the overall performance of metasearch. We propose two new normalization/estimation techniques and demonstrate empirically that the performance of well known metasearch algorithms can be significantly improved through their use.
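
The three subproblems named in this abstract can be sketched with common baseline choices: min-max normalization, a fixed score for unretrieved documents, and summation of normalized scores. These baselines are stand-ins and are not the two new techniques the paper proposes, which the abstract does not describe; the names `min_max_normalize` and `comb_sum` and the toy scores are hypothetical.

```python
def min_max_normalize(scores):
    """(1) Shift and scale one system's scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no ordering information, so map everything to 0.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(systems, unretrieved_score=0.0):
    """(2) Assign a fixed score to unretrieved documents and
    (3) sum the normalized scores per document across systems."""
    normalized = [min_max_normalize(s) for s in systems]
    all_docs = set().union(*normalized)
    fused = {
        doc: sum(n.get(doc, unretrieved_score) for n in normalized)
        for doc in all_docs
    }
    # Descending fused score; ties broken by doc id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


system_a = {"d1": 12.0, "d2": 7.0, "d3": 2.0}  # raw scores on one scale
system_b = {"d2": 0.9, "d4": 0.5, "d5": 0.1}   # raw scores on another scale
fused_ranking = comb_sum([system_a, system_b])
print(fused_ranking)
```

Without the normalization step, `system_a` would dominate the sum simply because its raw scores are larger, which is one way to see why the choice of normalization can affect the fused ranking.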

FRank: A Ranking Method with Fidelity Loss

by Ming-feng Tsai, Tie-yan Liu, Tao Qin, Hsin-hsi Chen, Wei-ying Ma, 2007
Cited by 26 (10 self)
The ranking problem is becoming important in many fields, especially in information retrieval (IR). Many machine learning techniques have been proposed for the ranking problem, such as RankSVM, RankBoost, and RankNet. Among them, RankNet, which is based on a probabilistic ranking framework, is leading to promising results and has been applied to a commercial Web search engine. In this paper we conduct further study on the probabilistic ranking framework and provide a novel loss function, named fidelity loss, for measuring the loss of ranking. The fidelity loss not only inherits effective properties of the probabilistic ranking framework in RankNet, but also possesses new properties that are helpful for ranking: it can reach zero for each document pair, and it has a finite upper bound, which is necessary for conducting query-level normalization. We also propose an algorithm named FRank, based on a generalized additive model, for minimizing the fidelity loss and learning an effective ranking function. We evaluated the proposed algorithm on two datasets: a TREC dataset and a real Web search dataset. The experimental results show that the proposed FRank algorithm outperforms other learning-based ranking methods on both the conventional IR problem and Web search.
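
The two properties mentioned in this abstract (a loss of zero is attainable for a pair, and the loss is bounded above) can be checked on a small sketch. The formula below is the commonly cited form of the fidelity loss, 1 - (sqrt(P* P) + sqrt((1 - P*)(1 - P))), paired with a RankNet-style logistic pair probability; the abstract itself does not give the formula, so treat the exact form and the function names as assumptions.

```python
import math


def pair_probability(score_i, score_j):
    """Logistic probability that document i should rank above document j.

    Not hardened against overflow for very large score gaps.
    """
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def fidelity_loss(target_p, model_p):
    """Assumed fidelity loss between target and modeled pair probabilities."""
    return 1.0 - (
        math.sqrt(target_p * model_p)
        + math.sqrt((1.0 - target_p) * (1.0 - model_p))
    )


# A pair the model orders correctly with a margin of 2.0 (toy scores).
p = pair_probability(2.0, 0.0)
loss_good = fidelity_loss(1.0, p)

# The two extremes: perfect agreement and complete disagreement.
loss_zero = fidelity_loss(0.5, 0.5)
loss_max = fidelity_loss(1.0, 0.0)
print(loss_good, loss_zero, loss_max)
```

Because each pair contributes at most 1 under this form, the total loss of a query can be divided by its number of pairs, which is the kind of query-level normalization the abstract alludes to.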

Using Relevance to Train a Linear Mixture of Experts

by Christopher C. Vogt, Garrison W. Cottrell, Richard K. Belew, Brian T. Bartell - In [Harman, 1997
Cited by 18 (4 self)
A linear mixture of experts is used to combine three standard IR systems. The parameters for the mixture are determined automatically through training on document relevance assessments via optimization of a rank-order statistic which is empirically correlated with average precision. The mixture improves performance in some cases and degrades it in others, with the degradations possibly due to training techniques, model strength, and poor performance of the individual experts.
1 Introduction. The mixture of experts approach is one which is gaining in popularity in many areas of computer science and artificial intelligence (e.g., [Jordan and Jacobs, 1994]) and one which is especially applicable to information retrieval, since in practice the sets of relevant documents returned by different IR algorithms (or experts) often have little overlap. In fact, the pooling method used by past TRECs to determine which documents are relevant can be viewed as a sort of mixture model on the grandest ...

Optimizing Parameters in a Ranked Retrieval System Using Multi-Query Relevance Feedback

by Brian Bartell, Garrison W. Cottrell, Richard K. Belew - In Proceedings of the Symposium on Document Analysis and Information Retrieval, Las Vegas, 1994
Cited by 17 (4 self)
A method is proposed by which parameters in ranked-output text retrieval systems can be automatically optimized to improve retrieval performance. A ranked-output text retrieval system implements a ranking function which orders documents, placing documents estimated to be more relevant to the user's query before less relevant ones. The proposed method is to adjust system parameters to maximize the match between the system's document ordering and the user's desired ordering, given by relevance feedback. The utility of the approach is demonstrated by estimating the similarity measure in a vector space model of information retrieval. The approach automatically finds a similarity measure which performs as well as or better than all "classic" similarity measures studied. It also performs within 1% of an estimated theoretically optimal measure.
1 Introduction. State of the art document retrieval systems have a large number of free parameters, such as the weights on terms in documents, para...
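
The idea of tuning a parameter so that the system's ordering matches the user's desired ordering can be illustrated with a toy example. The sketch below scores the match as the fraction of user-preferred document pairs that the system orders correctly, and picks the best value of a single parameter by grid search. The agreement measure, the one-parameter ranking function, the grid search, and all names and data are assumptions for illustration; the abstract does not specify the paper's criterion or optimizer.

```python
def pairwise_agreement(scores, preferred_pairs):
    """Fraction of (better, worse) pairs that the scores order correctly."""
    correct = sum(
        1 for better, worse in preferred_pairs if scores[better] > scores[worse]
    )
    return correct / len(preferred_pairs)


def score_documents(features, alpha):
    """Hypothetical ranking function: mix two per-document features with alpha."""
    return {
        doc: alpha * f1 + (1.0 - alpha) * f2 for doc, (f1, f2) in features.items()
    }


# Toy features and toy relevance feedback expressed as preference pairs.
features = {"d1": (0.9, 0.1), "d2": (0.2, 0.8), "d3": (0.4, 0.3)}
preferred_pairs = [("d2", "d1"), ("d2", "d3"), ("d3", "d1")]

# Grid search over alpha in {0.0, 0.1, ..., 1.0}; keeps the first best value.
candidates = [i / 10 for i in range(11)]
best_alpha = max(
    candidates,
    key=lambda a: pairwise_agreement(score_documents(features, a), preferred_pairs),
)
best_agreement = pairwise_agreement(
    score_documents(features, best_alpha), preferred_pairs
)
print(best_alpha, best_agreement)
```

A pair count like this is piecewise constant in the parameter, so gradient-based optimizers generally need a smooth surrogate of it; grid search is used here only to keep the sketch short.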

On-Line New Event Detection, Clustering, And Tracking

by Ron Papka, 1999
Cited by 16 (0 self)
In this work, we discuss and evaluate solutions to text classification problems associated with the events that are reported in on-line sources of news. We present solutions to three related classification problems: new event detection, event clustering, and event tracking. The primary focus of this thesis is new event detection, where the goal is to identify news stories that have not previously been reported, in a stream of broadcast news comprising radio, television, and newswire. We present an algorithm for new event detection, and analyze the effects of incorporating domain properties into the classification algorithm. We explore a solution that models the temporal relationship between news stories, and investigate the use of proper noun phrase

Text-Based Information Retrieval Using Exponentiated Gradient Descent

by Ron Papka, James P. Callan, Andrew G. Barto - Advances in Neural Information Processing Systems, 1996
Cited by 15 (5 self)
The following investigates the use of single-neuron learning algorithms to improve the performance of text-retrieval systems that accept natural-language queries. A retrieval process is explained that transforms the natural-language query into the query syntax of a real retrieval system: the initial query is expanded using statistical and learning techniques and is then used for document ranking and binary classification. The results of experiments suggest that Kivinen and Warmuth's Exponentiated Gradient Descent learning algorithm works significantly better than previous approaches.
1 Introduction. The following work explores two learning algorithms -- Least Mean Squared (LMS) [1] and Exponentiated Gradient Descent (EG) [2] -- in the context of text-based Information Retrieval (IR) systems. The experiments presented in [3] use connectionist learning models to improve the retrieval of relevant documents from a large collection of text. Here, we present further analysis of those experim...
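
For readers unfamiliar with the two learning rules named here, the sketch below contrasts one additive LMS step with one multiplicative EG step for a single linear unit under squared error. It follows the textbook forms of the updates (EG with positive weights normalized to sum to 1); how the paper applies them to query term weights is not stated in this excerpt, and the function names, learning rate, and toy data are assumptions.

```python
import math


def predict(weights, x):
    """Output of a single linear unit."""
    return sum(w * xi for w, xi in zip(weights, x))


def lms_update(weights, x, y, lr):
    """Additive (Widrow-Hoff / LMS) step on squared error."""
    err = predict(weights, x) - y
    return [w - 2.0 * lr * err * xi for w, xi in zip(weights, x)]


def eg_update(weights, x, y, lr):
    """Multiplicative (exponentiated gradient) step.

    Each weight is scaled by exp(-2 * lr * err * x_i) and the result is
    renormalized, so weights stay positive and sum to 1.
    """
    err = predict(weights, x) - y
    unnormalized = [w * math.exp(-2.0 * lr * err * xi) for w, xi in zip(weights, x)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy example: four features, two active in this instance.
w0 = [0.25, 0.25, 0.25, 0.25]
x = [1.0, 0.0, 1.0, 0.0]
y = 1.0
w_lms = lms_update(w0, x, y, lr=0.5)
w_eg = eg_update(w0, x, y, lr=0.5)
print(w_lms, w_eg)
```

In this toy step the unit under-predicts, so both rules raise the weights of the active features; EG additionally lowers the inactive ones through renormalization, while LMS leaves them unchanged.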

Adaptive Combination of Evidence for Information Retrieval

by Christopher C. Vogt, 1999
Cited by 10 (1 self)
Table of contents (excerpt):
I Introduction
II Background
  A. Overview
  B. Information Retrieval
    1. The Information Retrieval Problem
    2. Relevance
    3. Evaluation
    4. Relevance Feedback
    5. Approaches to IR: Sources of Evidence
  C. Neural Networks
    1. Optimization Techniques
  D. Machine Learning in IR
  ...

On rank-based effectiveness measures and optimization

by Stephen Robertson, Hugo Zaragoza - Information Retrieval
Cited by 10 (1 self)
Keywords: ranking functions; optimization
Many current retrieval models and scoring functions contain free parameters which need to be set – ideally, optimized. The process of optimization normally involves some training corpus of the usual document-query-relevance judgement type, and some choice of measure that is to be optimized. The paper proposes a way to think about the process of exploring the space of parameter values, and how moving around in this space might be expected to affect different measures. One result, concerning local optima, is demonstrated for a range of rank-based evaluation measures.
1 Parameters and optimization. This paper addresses some basic features of many of the measures of retrieval effectiveness in current or past use. The objective is to understand aspects of their behaviour, with particular reference to the optimization of parameters of a retrieval model or ranking function. This is a theoretical paper.