Results 1 - 10 of 785

Cumulated Gain-based Evaluation of IR Techniques

by Kalervo Järvelin, Jaana Kekäläinen - ACM Transactions on Information Systems, 2002
"... Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, i ..."
Cited by 694 (3 self)
measures are defined and discussed and then their use is demonstrated in a case study using TREC data - sample system run results for 20 queries in TREC-7. As relevance base we used novel graded relevance assessments on a four-point scale. The test results indicate that the proposed measures credit IR

An extensive empirical study of feature selection metrics for text classification

by George Forman, Isabelle Guyon, André Elisseeff - J. of Machine Learning Research, 2003
"... Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison ..."
Cited by 496 (15 self)
of twelve feature selection methods (e.g. Information Gain) evaluated on a benchmark of 229 text classification problem instances that were gathered from Reuters, TREC, OHSUMED, etc. The results are analyzed from multiple goal perspectives—accuracy, F-measure, precision, and recall—since each is appropriate

Query clustering and IR system detection. Experiments on TREC data

by Desire Kompaore, Josiane Mothe, Alain Baccini, Sebastien Dejean
"... This paper investigates two aspects in this experiment. Linguistic techniques are used to categorize queries in a first step. This classification is then used to analyze system performance in a TREC context. More precisely, we cluster TREC topics with 13 linguistic features (Mothe et al., 2005), a ..."

Collection selection and results merging with topically organized U.S. patents and TREC data

by Leah S. Larkey, Margaret E. Connell, Jamie Callan - In CIKM 2000, 2000
"... We investigate three issues in distributed information retrieval, considering both TREC data and U.S. Patents: (1) topical organization of large text collections, (2) collection ranking and selection with topically organized collections (3) results merging, particularly document score normalization, ..."
Cited by 50 (8 self)

A Probabilistic Model of Information Retrieval: Development and Status

by K. Sparck Jones, S. Walker, S.E. Robertson, 1998
"... The paper combines a comprehensive account of the probabilistic model of retrieval with new systematic experiments on TREC Programme material. It presents the model from its foundations through its logical development to cover more aspects of retrieval data and a wider range of system functions. Eac ..."
Cited by 360 (25 self)

Information Retrieval as Statistical Translation

by Adam Berger, John Lafferty
"... We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a statistical model of how a user might distill or "translate" a given document into a query. To assess the rele ..."
Cited by 313 (6 self)
by Ponte and Croft. In a series of experiments on TREC data, a simple translation-based retrieval system performs well in compari...

Predicting Query Performance

by Steve Cronen-Townsend, Yun Zhou, W. Bruce Croft, 2002
"... We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resulting clarity score measures the coherence of the language usage in documents whose models are likely to generate the query. ..."
Cited by 269 (16 self)
information. We develop an algorithm for automatically setting the clarity score threshold between predicted poorly-performing queries and acceptable queries and validate it using TREC data. In particular, we compare the automatic thresholds to optimum thresholds and also check how frequently results as good

A Markov random field model for term dependencies

by Donald Metzler, W. Bruce Croft
"... This paper develops a general, formal framework for modeling term dependencies via Markov random fields. The model allows for arbitrary text features to be incorporated as evidence. In particular, we make use of features based on occurrences of single terms, ordered phrases, and unordered phrases. W ..."
Cited by 289 (55 self)
We explore full independence, sequential dependence, and full dependence variants of the model. A novel approach is developed to train the model that directly maximizes the mean average precision rather than maximizing the likelihood of the training data. Ad hoc retrieval experiments are presented

The TREC-5 Filtering Track

by David D. Lewis - The Fifth Text REtrieval Conference (TREC-5), 1997
"... The TREC-5 filtering track, an evaluation of binary text classification systems, was a repeat of the filtering evaluation run in a trial version for TREC-4, with only the data set and participants changing. Seven sites took part, submitting a total of ten runs. We review the nature of the task, the ..."
Cited by 41 (0 self)

TREC Genomics Track Overview

by William Hersh, Ravi Teja Bhupatiraju, 2003
"... The first year of TREC Genomics Track featured two tasks: ad hoc retrieval and information extraction. Both tasks centered around the Gene Reference into Function (GeneRIF) resource of the National Library of Medicine, which was used as both pseudorelevance judgments for ad hoc document retrieval as ..."
Cited by 43 (1 self)
with the growth of new information needs (e.g., question-answering, cross-lingual), data types (e.g., video) and platforms (e.g., the Web). This paper describes the events leading up to the first year of TREC Genomics Track, the first year’s results, and future directions for subsequent years. Genomics