Provenance semirings
 In PODS’07, Beijing
, 2007
Cited by 180 (27 self)
We show that relational algebra calculations for incomplete databases, probabilistic databases, bag semantics and why-provenance are particular cases of the same general algorithms involving semirings. This further suggests a comprehensive provenance representation that uses semirings of polynomials. We extend these considerations to datalog and semirings of formal power series. We give algorithms for datalog provenance calculation as well as datalog evaluation for incomplete and probabilistic databases. Finally, we show that for some semirings containment of conjunctive queries is the same as for standard set semantics.
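To make the idea of semiring-annotated relations concrete, here is a minimal sketch, not the paper's own algorithms. It assumes relations are stored as dictionaries from tuples to annotations, and that union and projection combine annotations with the semiring's addition while join uses its multiplication. The names `Semiring`, `BAG`, `BOOL`, `union`, `project`, and `join` are illustrative choices, not taken from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, add, mul, zero, one). Hypothetical helper type."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Natural numbers with (+, *): annotations count duplicates (bag semantics).
BAG = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Booleans with (or, and): annotations record membership (set semantics).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)

Relation = Dict[Tuple, Any]  # tuple -> annotation in K


def union(k: Semiring, r: Relation, s: Relation) -> Relation:
    """Union adds the annotations of tuples present in both inputs."""
    out = dict(r)
    for t, a in s.items():
        out[t] = k.add(out.get(t, k.zero), a)
    return out


def project(k: Semiring, r: Relation, cols: Tuple[int, ...]) -> Relation:
    """Projection adds the annotations of all tuples that collapse together."""
    out: Relation = {}
    for t, a in r.items():
        key = tuple(t[c] for c in cols)
        out[key] = k.add(out.get(key, k.zero), a)
    return out


def join(k: Semiring, r: Relation, s: Relation, rcol: int, scol: int) -> Relation:
    """Equi-join multiplies the annotations of the two joined tuples."""
    out: Relation = {}
    for t, a in r.items():
        for u, b in s.items():
            if t[rcol] == u[scol]:
                key = t + u
                out[key] = k.add(out.get(key, k.zero), k.mul(a, b))
    return out


# The same query, evaluated under two different semirings.
R = {("a", "b"): 2, ("a", "c"): 3}
S = {("b", "x"): 1, ("c", "x"): 4}
bag_result = project(BAG, join(BAG, R, S, 1, 0), (0, 3))

R_bool = {t: True for t in R}
S_bool = {t: True for t in S}
bool_result = project(BOOL, join(BOOL, R_bool, S_bool, 1, 0), (0, 3))

print(bag_result)   # {('a', 'x'): 14}
print(bool_result)  # {('a', 'x'): True}
```

Only the `Semiring` instance changes between the two runs; the query code is shared, which is the point the abstract makes about a single general algorithm.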
HySpirit - a Probabilistic Inference Engine for Hypermedia Retrieval in Large Databases
 Proceedings of the 6th International Conference on Extending Database Technology (EDBT)
, 1998
Cited by 41 (9 self)
HySpirit is a retrieval engine for hypermedia retrieval integrating concepts from information retrieval (IR) and deductive databases. The logical view on IR models retrieval as uncertain inference, for which we use probabilistic reasoning. Since the expressiveness of classical IR models is not sufficient for hypermedia retrieval, HySpirit is based on a probabilistic version of Datalog. In hypermedia retrieval, different nodes may contain contradictory information; thus, we introduce probabilistic four-valued Datalog. In order to support fact queries as well as content-based retrieval, HySpirit is based on an open world assumption, but allows for predicate-specific closed world assumptions. For performing efficient retrieval on large databases, our system provides access to external data. We demonstrate the application of HySpirit by giving examples for retrieval on images, structured documents and large databases. 1 Introduction Due to the advances in hardware, processing of multimed...
Answering Queries from Statistics and Probabilistic Views
, 2005
Cited by 38 (2 self)
this paper, require complex correlations between tuples, for which the query semantics has not been previously studied
Probability kinematics in information retrieval
 ACM Transactions on Information Systems
, 1995
Cited by 37 (6 self)
We analyse the kinematics of probabilistic term weights at retrieval time for different Information Retrieval models. We present four models based on different notions of probabilistic retrieval. Two of these models are based on classical probability theory and can be considered as prototypes of models long in use in Information Retrieval, like the Vector Space Model and the Probabilistic Model. The two other models are based on a logical technique of evaluating the probability of a conditional called imaging; one is a generalisation of the other. We analyse the transfer of probabilities occurring in the term space at retrieval time for these four models, compare their retrieval performance using classical test collections, and discuss the results. We believe that our results provide useful suggestions on how to improve existing probabilistic models of Information Retrieval by taking into consideration term-term similarity.
A Relevance Terminological Logic for Information Retrieval
 In Proceedings of SIGIR96, 19th International Conference on Research and Development in Information Retrieval
, 1996
Cited by 36 (9 self)
A Terminological Logic is presented as an information retrieval model, with a four-valued semantics that gives to its inference relation the flavour of relevance, that is, a strict connection in meaning between the premises and the conclusion of the arguments licensed by the logic. The logic also permits the expression of meta-knowledge enforcing a closed-world reading of the knowledge concerning specified individuals and primitive concepts. A Gentzen-style, sound and complete calculus for reasoning in the logic is given, thus establishing the basis for an information retrieval engine.
Range Search on Multidimensional Uncertain Data
Cited by 34 (7 self)
In an uncertain database, every object o is associated with a probability density function, which describes the likelihood that o appears at each position in a multidimensional workspace. This article studies two types of range retrieval fundamental to many analytical tasks. Specifically, a nonfuzzy query returns all the objects that appear in a search region r_q with at least a certain probability t_q. On the other hand, given an uncertain object q, fuzzy search retrieves the set of objects that are within distance ε_q from q with no less than probability t_q. The core of our methodology is a novel concept of “probabilistically constrained rectangle”, which permits effective pruning/validation of nonqualifying/qualifying data. We develop a new index structure called the U-tree for minimizing the query overhead. Our algorithmic findings are accompanied with a thorough theoretical analysis, which reveals valuable insight into the problem characteristics, and mathematically confirms the efficiency of our solutions. We verify the effectiveness of the proposed techniques with extensive
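As a rough illustration of the nonfuzzy query semantics described above, the following sketch scans every object and computes its appearance probability directly. It is a brute-force baseline under simplifying assumptions, not the article's index or pruning method: each object's density is assumed uniform over a 2-D axis-aligned rectangle of positive area, so the probability of appearing in the search region is the overlap area divided by the object's own area. The names `Rect`, `appearance_probability`, and `nonfuzzy_range_query` are hypothetical.

```python
from typing import Dict, List, Tuple

# Axis-aligned rectangle as (x_lo, y_lo, x_hi, y_hi). Illustrative type alias.
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; 0.0 if it is empty (hi < lo on either axis)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def intersection(a: Rect, b: Rect) -> Rect:
    """Overlap of two rectangles; may be an empty rectangle with hi < lo."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def appearance_probability(region: Rect, rq: Rect) -> float:
    """P(object lies in rq), assuming a uniform density over `region`.

    Assumes `region` has positive area.
    """
    return area(intersection(region, rq)) / area(region)


def nonfuzzy_range_query(objects: Dict[str, Rect], rq: Rect, tq: float) -> List[str]:
    """Return ids of objects appearing in rq with probability at least tq."""
    return sorted(
        name
        for name, region in objects.items()
        if appearance_probability(region, rq) >= tq
    )


objects = {
    "o1": (0.0, 0.0, 2.0, 2.0),      # entirely inside the search region
    "o2": (1.0, 1.0, 5.0, 5.0),      # a quarter of its area overlaps
    "o3": (10.0, 10.0, 11.0, 11.0),  # disjoint from the search region
}
rq = (0.0, 0.0, 3.0, 3.0)

print(nonfuzzy_range_query(objects, rq, 0.5))   # ['o1']
print(nonfuzzy_range_query(objects, rq, 0.25))  # ['o1', 'o2']
```

A linear scan like this evaluates every object's probability; the abstract's pruning and validation machinery exists precisely to avoid that cost.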
Retrieval of Complex Objects Using a Four-Valued Logic
 Proceedings of the 19th International ACM SIGIR Conference on Research and Development in Information Retrieval
, 1996
Cited by 24 (6 self)
The aggregated structure of documents plays a key role in full-text, multimedia, and network Information Retrieval (IR). Considering aggregation provides new querying facilities and improves retrieval effectiveness. We present a knowledge representation for IR purposes which pays special attention to this aggregated structure of objects. In addition, further features of objects can be described. Thus, the structure of full-text documents, the heterogeneity and the spatial and temporal relationships of objects typical for multimedia IR, and meta information for network IR are representable within one integrated framework. The model we propose allows for querying on the content of documents (objects) as well as on other features. The query result may contain objects having different types. Instead of retrieving only whole documents, the retrieval process determines the least aggregated entities that imply the query. 1 Motivation and Background New IR applications like full-text, multime...
Using a Belief Revision Operator for Document Ranking in Extended Boolean Models
 In Proc. of SIGIR99, the 22nd ACM Conference on Research and Development in Information Retrieval
, 1999
Cited by 22 (12 self)
This paper claims that Belief Revision can be seen as a theoretical framework for document ranking in Extended Boolean Models. For a model of Information Retrieval based on propositional logic, we propose a similarity measure which is equivalent to a P-Norm case. Therefore it shares the P-Norm's good properties and behaviour. Besides, it is theoretically ensured that this measure follows the notion of proximity between the documents and the query. The logical model can naturally deal with incomplete descriptions of documents and the similarity values are also obtained for this case. 1 Introduction Logical approaches have been proposed to model Information Retrieval (IR) in a formal framework. Van Rijsbergen was the pioneer in thinking that logic could help in the retrieval of relevant documents [21]. Moreover, he proposed logic as a new theoretical framework for investigating IR. Given d, a logical representation of a document, and q, a logical representation of a query, retrieval is si...
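For background on the P-Norm measure that the abstract refers to, here is a small sketch of the standard p-norm similarity from the extended Boolean model, with unweighted query terms. It does not reproduce the paper's belief revision operator; it only shows the kind of measure the proposed similarity is said to be equivalent to. The function names `pnorm_or` and `pnorm_and` are illustrative, and the inputs are assumed to be document term weights in [0, 1].

```python
from typing import Sequence


def pnorm_or(weights: Sequence[float], p: float) -> float:
    """P-norm similarity of a document to the query (t1 OR ... OR tn).

    `weights` are the document's weights for the query terms, each in [0, 1].
    """
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights: Sequence[float], p: float) -> float:
    """P-norm similarity of a document to the query (t1 AND ... AND tn)."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# A document containing only the first of two query terms.
doc = [1.0, 0.0]

# With p = 1 the AND and OR scores coincide (plain averaging).
print(pnorm_or(doc, 1), pnorm_and(doc, 1))

# Larger p moves the scores toward strict Boolean behaviour:
# OR rises toward 1, AND falls toward 0.
print(pnorm_or(doc, 2), pnorm_and(doc, 2))
```

The parameter p interpolates between vector-space-like averaging at p = 1 and strict Boolean matching as p grows, which is why a partially matching document still receives a graded score.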
Solving The Word Mismatch Problem Through Automatic Text Analysis
, 1997
Cited by 21 (0 self)
Information Retrieval (IR) is concerned with locating documents that are relevant for a user's information need or query from a large collection of documents. A fundamental problem for information retrieval is word mismatch. A query is usually a short and incomplete description of the underlying information need. The users of IR systems and the authors of the documents often use different words to refer to the same concepts. This thesis addresses the word mismatch problem through automatic text analysis. We investigate two text analysis techniques, corpus analysis and local context analysis, and apply them in two domains of word mismatch, stemming and general query expansion. Experimental results show that these techniques ca...