Results 1-10 of 183
A Probabilistic Model of Information Retrieval: Development and Status
, 1998
Cited by 298 (20 self)
The paper combines a comprehensive account of the probabilistic model of retrieval with new systematic experiments on TREC Programme material. It presents the model from its foundations through its logical development to cover more aspects of retrieval data and a wider range of system functions. Each step in the argument is matched by comparative retrieval tests, to provide a single coherent account of a major line of research. The experiments demonstrate, for a large test collection, that the probabilistic model is effective and robust, and that it responds appropriately, with major improvements in performance, to key features of retrieval situations.
Inference Networks for Document Retrieval
, 1990
Cited by 246 (8 self)
The use of inference networks to support document retrieval is introduced. A network-based retrieval model is described and compared to conventional probabilistic and Boolean models.
A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems
 ACM Transactions on Information Systems
, 1994
Cited by 189 (31 self)
We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression always conform to the underlying probabilistic model. We also show for which expressions extensional semantics yields the same results. Furthermore, we discuss complexity issues and indicate possibilities for optimization. With regard to databases, the approach allows for representing imprecise attribute values, whereas for information retrieval, probabilistic document indexing and probabilistic search term weighting can be modelled. As an important extension, we introduce the concept of vague predicates which yields a probabilistic weight instead of a Boolean value, thus allowing for queries with vague selection conditions. So PRA implements uncertainty and vagueness in combination with the...
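The idea of tuples carrying membership probabilities can be sketched as follows. This is a minimal illustration, not the paper's PRA: it assumes all tuples are independent events and evaluates weights extensionally, which the abstract notes agrees with the intensional semantics only for certain expressions. The function names and the sample relations are hypothetical.

```python
from collections import defaultdict
from math import prod

# A probabilistic relation maps each tuple to the probability
# that the tuple belongs to the relation.

def select(rel, predicate):
    """Keep tuples satisfying a Boolean predicate; weights are unchanged."""
    return {t: p for t, p in rel.items() if predicate(t)}

def project(rel, columns):
    """Project onto columns; merged duplicates are combined as a
    disjunction of independent events: 1 - prod(1 - p_i)."""
    groups = defaultdict(list)
    for t, p in rel.items():
        groups[tuple(t[c] for c in columns)].append(p)
    return {key: 1.0 - prod(1.0 - p for p in ps) for key, ps in groups.items()}

def join(left, right, left_col, right_col):
    """Equi-join; weights multiply (conjunction of independent events)."""
    return {
        lt + rt: lp * rp
        for lt, lp in left.items()
        for rt, rp in right.items()
        if lt[left_col] == rt[right_col]
    }

# Hypothetical probabilistic document index: (document, term) -> weight.
index = {("d1", "ir"): 0.7, ("d1", "db"): 0.8, ("d2", "ir"): 0.5}
# Hypothetical authorship relation: (document, author) -> weight.
authors = {("d1", "smith"): 0.9}

ir_docs = select(index, lambda t: t[1] == "ir")
docs = project(index, [0])          # probability each document is indexed at all
joined = join(index, authors, 0, 0)
```

Here `project` gives d1 a weight of 1 - (0.3 x 0.2) = 0.94, while `join` multiplies the weights of the matched tuples. Both results are only valid under the independence assumption stated above.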
A Decision-Theoretic Approach to Database Selection in Networked IR
 ACM Transactions on Information Systems
, 1996
Cited by 138 (16 self)
In this paper, we address the resource discovery issue, which consists of two subtasks, namely database detection and database selection. Database detection can be performed relatively easily, either by exploiting the name conventions used in the domain name service of the internet (e.g. names of ftp servers should start with 'ftp.', names of Web servers with 'www.') or by establishing central registries (e.g. the directory-of-servers for WAIS systems).
Probabilistic Models in Information Retrieval
 The Computer Journal
, 1992
Cited by 113 (4 self)
In this paper, an introduction to and survey of probabilistic information retrieval (IR) is given. First, the basic concepts of this approach are described: the probability ranking principle shows that optimum retrieval quality can be achieved under certain assumptions; a conceptual model for IR along with the corresponding event space clarifies the interpretation of the probabilistic parameters involved. For the estimation of these parameters, three different learning strategies are distinguished, namely query-related, document-related and description-related learning. As a representative for each of these strategies, a specific model is described. A new approach regards IR as uncertain inference; here, imaging is used as a new technique for estimating the probabilistic parameters, and probabilistic inference networks support more complex forms of inference. Finally, the more general problems of parameter estimation, query expansion and the development of models for advanced document representations are discussed.
A model of information retrieval based on a terminological logic
, 1993
Cited by 94 (19 self)
According to the logical model of Information Retrieval (IR), the task of IR can be described as the extraction, from a given document base, of those documents d that, given a query q, make the formula d → q valid, where d and q are formulae of the chosen logic and “→” denotes the brand of logical implication formalized by the logic in question. In this paper, although essentially subscribing to this view, we propose that the logic to be chosen for this endeavour be a Terminological Logic (TL): accordingly, the IR task becomes that of singling out those documents d such that d ⪯ q, where d and q are terms of the chosen TL and “⪯” denotes subsumption between terms. We call this the terminological model of IR. TLs are particularly suitable for modelling IR; in fact, they can be employed: 1) in representing documents under a variety of aspects (e.g. structural, layout, semantic content); 2) in representing queries; 3) in representing lexical, “thesaural” knowledge. The fact that a single logical language can be used for all these representational endeavours ensures that all these sources of knowledge will participate in the retrieval process in a uniform and principled way. In this paper we introduce Mirtl, a TL for modelling IR according to the above guidelines; its syntax, formal semantics and inferential algorithm are described.
COMBINING APPROACHES TO INFORMATION RETRIEVAL
Cited by 93 (2 self)
The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the “metasearch” engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
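Combining the outputs of several retrieval runs can be sketched as below. The abstract does not say which combination rule is meant; this sketch assumes one common choice, summing min-max normalized scores (often called CombSUM). The function names and the sample runs are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1] so runs become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no ranking information in this run.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def comb_sum(runs):
    """Sum normalized scores across runs; a document missing from a run
    contributes 0 for that run. Returns (ranking, combined scores)."""
    combined = {}
    for run in runs:
        for doc, s in min_max_normalize(run).items():
            combined[doc] = combined.get(doc, 0.0) + s
    # Sort by descending combined score, breaking ties by document id.
    ranking = sorted(combined, key=lambda d: (-combined[d], d))
    return ranking, combined

# Hypothetical outputs of two retrieval systems with different score scales.
run_a = {"d1": 12.0, "d2": 7.0, "d3": 2.0}
run_b = {"d2": 0.9, "d3": 0.8, "d4": 0.1}

ranking, combined = comb_sum([run_a, run_b])
```

In this example d2 ranks first because both runs score it highly, even though neither run's top document is supported by the other.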
A Formal Study of Information Retrieval Heuristics
 SIGIR '04
, 2004
Cited by 76 (14 self)
Empirical studies of information retrieval methods show that good retrieval performance is closely related to the use of various retrieval heuristics, such as TF-IDF weighting. One basic research question is thus what exactly are these "necessary" heuristics that seem to cause good retrieval performance. In this paper, we present a formal study of retrieval heuristics. We formally define a set of basic desirable constraints that any reasonable retrieval function should satisfy, and check these constraints on a variety of representative retrieval functions. We find that none of these retrieval functions satisfies all the constraints unconditionally. Empirical results show that when a constraint is not satisfied, it often indicates non-optimality of the method, and when a constraint is satisfied only for a certain range of parameter values, its performance tends to be poor when the parameter is out of the range. In general, we find that the empirical performance of a retrieval formula is tightly related to how well it satisfies these constraints. Thus the proposed constraints provide a good explanation of many empirical observations and make it possible to evaluate any existing or new retrieval formula analytically.
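What checking a constraint on a retrieval function might look like can be sketched as follows. The abstract does not list the constraints, so this uses one illustrative constraint as an assumption: all else being equal, a higher term frequency should give a higher score. The scoring formula is a generic log-scaled TF-IDF variant chosen for illustration, and the check is a numerical spot-check over a finite range, not the analytical check the abstract describes.

```python
import math

def tf_idf_score(tf, df, n_docs):
    """Illustrative single-term score: log-scaled TF times a smoothed IDF."""
    if tf == 0:
        return 0.0
    return (1.0 + math.log(tf)) * math.log((n_docs + 1) / df)

def satisfies_tf_monotonicity(score, max_tf=50, df=10, n_docs=1000):
    """Spot-check: the score strictly increases as tf goes 0, 1, ..., max_tf
    while document frequency and collection size stay fixed."""
    return all(
        score(tf + 1, df, n_docs) > score(tf, df, n_docs)
        for tf in range(max_tf)
    )

def capped_score(tf, df, n_docs):
    """Hypothetical function that ignores occurrences beyond the third,
    so it violates the strict constraint above."""
    return tf_idf_score(min(tf, 3), df, n_docs)

tfidf_ok = satisfies_tf_monotonicity(tf_idf_score)
capped_ok = satisfies_tf_monotonicity(capped_score)
```

A passing spot-check does not prove the constraint holds for all inputs; it can only expose violations within the tested range.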
Probabilistic Datalog - a Logic for Powerful Retrieval Methods
 Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 1995
Cited by 68 (18 self)
Since Datalog P allows for recursive rules, it provides more powerful inference than any other (implemented) probabilistic IR model. Finally, since Datalog P is a generalization of (deterministic) Datalog, it can be used as a standard query language for both IR and database systems, and thus also for integration of these two types of systems on the logical level.
2. INFORMAL DESCRIPTION OF Datalog P
Probabilistic Datalog is an extension of stratified Datalog (see e.g. [Ullman 88], [Ceri et al. 90]). On the syntactical level, the only difference is that with ground facts, also a probabilistic weight may be given, e.g. 0.7 indterm(d1,ir). 0.8 indterm(d1,db). Informally speaking, the probabilistic weight gives...