Results 1 - 10 of 16
A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems
ACM Transactions on Information Systems, 1994
Cited by 211 (34 self)
We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression always conform to the underlying probabilistic model. We also show for which expressions extensional semantics yields the same results. Furthermore, we discuss complexity issues and indicate possibilities for optimization. With regard to databases, the approach allows for representing imprecise attribute values, whereas for information retrieval, probabilistic document indexing and probabilistic search term weighting can be modelled. As an important extension, we introduce the concept of vague predicates which yields a probabilistic weight instead of a Boolean value, thus allowing for queries with vague selection conditions. So PRA implements uncertainty and vagueness in combination with the...
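As a rough illustration of the idea in this abstract, the sketch below treats a probabilistic relation as a mapping from tuples to membership probabilities. It is not the paper's algebra: it uses simple extensional rules under an assumed independence of tuples, which the abstract says match the intensional semantics only for some expressions. The relation names, the helper functions `select`, `project`, `join`, and the vague predicate `about_year` are all hypothetical.

```python
import math

# A probabilistic relation: tuple -> probability that the tuple belongs to it.
# All data below is made up for illustration.
DOC_TERM = {("d1", "ir"): 0.9, ("d1", "db"): 0.5, ("d2", "ir"): 0.4}
TERM_TOPIC = {("ir", "retrieval"): 0.8}
BOOKS = {("d1", 1994): 1.0, ("d2", 1996): 0.8, ("d3", 2005): 1.0}


def select(rel, predicate):
    """Selection. The predicate may be vague: it returns a weight in [0, 1]
    (a Boolean counts as 0 or 1), which is multiplied into the tuple weight."""
    out = {}
    for tup, p in rel.items():
        w = float(predicate(tup))
        if w > 0:
            out[tup] = p * w
    return out


def project(rel, idxs):
    """Projection. Tuples that collapse to the same key are combined as a
    disjunction of independent events: 1 - prod(1 - p_i)."""
    groups = {}
    for tup, p in rel.items():
        key = tuple(tup[i] for i in idxs)
        groups.setdefault(key, []).append(p)
    return {k: 1 - math.prod(1 - p for p in ps) for k, ps in groups.items()}


def join(r, s, ri, si):
    """Equi-join on r[ri] == s[si]; weights multiply (independence assumed)."""
    return {
        tr + ts: pr * ps
        for tr, pr in r.items()
        for ts, ps in s.items()
        if tr[ri] == ts[si]
    }


def about_year(target, spread):
    """Hypothetical vague predicate: weight falls off linearly with distance."""
    return lambda tup: max(0.0, 1 - abs(tup[1] - target) / spread)


docs = project(DOC_TERM, [0])               # which documents have any term
topical = join(DOC_TERM, TERM_TOPIC, 1, 0)  # documents linked to a topic
recent = select(BOOKS, about_year(1994, 5)) # "published around 1994"
```

Here `project` assigns d1 a weight of 1 - (1 - 0.9)(1 - 0.5) = 0.95, and the vague selection keeps d2 with weight 0.8 * 0.6 = 0.48 while dropping d3 entirely.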
Integrating Structured Data and Text: A relational approach
Journal of the American Society for Information Science, 1997
Cited by 59 (27 self)
We integrate structured data and text using the unchanged, standard relational model. We started with the premise that a relational system could be used to implement an Information Retrieval (IR) system. After implementing a prototype to verify that premise, we then began to investigate the performance of a parallel relational database system for this application. We also tested the effect of query reduction on accuracy and found that queries can be reduced prior to their implementation without incurring a significant loss in precision/recall. This reduction also serves to improve runtime performance. After comparing our results to a special purpose IR system, we conclude that the relational model offers scalable performance and includes the ability to integrate structured data and text in a portable fashion. 1 Introduction Increasingly, applications integrate structured and unstructured data, responding to requests such as "Find articles containing vehicle and sales published in jou...
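To make the premise of this abstract concrete, here is a minimal sketch of keyword retrieval expressed in plain SQL over ordinary tables, combined with a structured condition. It is only an assumed illustration, not the authors' prototype: the schema (`docs`, `postings`), the sample rows, and the `search` helper are hypothetical, and Python's built-in `sqlite3` module stands in for the relational system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical schema:
    docs holds structured attributes, postings holds one row per (doc, term)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, source TEXT, year INTEGER);
        CREATE TABLE postings (doc_id INTEGER, term TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        [(1, "journal", 1996), (2, "journal", 1994), (3, "newsletter", 1996)],
    )
    conn.executemany(
        "INSERT INTO postings VALUES (?, ?)",
        [(1, "vehicle"), (1, "sales"), (2, "vehicle"), (3, "vehicle"), (3, "sales")],
    )
    return conn


def search(conn, terms, source):
    """Return ids of documents from the given source that contain ALL terms."""
    terms = sorted(set(terms))  # duplicates would break the HAVING count
    placeholders = ",".join("?" for _ in terms)
    sql = f"""
        SELECT d.doc_id
        FROM docs d JOIN postings p ON p.doc_id = d.doc_id
        WHERE p.term IN ({placeholders}) AND d.source = ?
        GROUP BY d.doc_id
        HAVING COUNT(DISTINCT p.term) = ?
        ORDER BY d.doc_id
    """
    rows = conn.execute(sql, [*terms, source, len(terms)]).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
both_terms = search(conn, ["vehicle", "sales"], "journal")
one_term = search(conn, ["vehicle"], "journal")
```

With the sample rows, `both_terms` is `[1]` and `one_term` is `[1, 2]`. The text condition and the structured condition sit in the same query, which is the kind of integration the abstract describes.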
A Probabilistic Relational Model for the Integration of IR and Databases
In Proceedings of ACM SIGIR, 1993
Cited by 23 (1 self)
In this paper, a probabilistic relational model is presented which combines relational algebra with probabilistic retrieval. Based on certain independence assumptions, the operators of the relational algebra are redefined such that the probabilistic algebra is a generalization of the standard relational algebra. Furthermore, a special join operator implementing probabilistic retrieval is proposed. When applied to typical document databases, queries can not only ask for documents, but for any kind of object in the database. In addition, an implicit ranking of these objects is provided in case the query relates to probabilistic indexing or uses the probabilistic join operator. The proposed algebra is intended as a standard interface to combined database and IR systems, as a basis for implementing user-friendly interfaces. 1 Introduction The fields of databases (DB) and information retrieval (IR) have been coexisting for a very long time, but with little influence on each other. IR peop...
Integrating Diverse Information Management Systems: A Brief Survey
IEEE Data Engineering Bulletin, 2001
Cited by 9 (0 self)
Most current information management systems can be classified into text retrieval systems, relational/object database systems, or semistructured/XML database systems. However, in practice, many applications' data sets involve a combination of free text, structured data, and semistructured data. Hence, integration of different types of information management systems has been, and continues to be, an active research topic. In this paper, we present a short survey of prior work on integrating and interoperating between text, structured, and semistructured database systems. We classify existing literature based on the kinds of systems being integrated and the approach to integration. Based on this classification, we identify the challenges and the key themes underlying existing work in this area.
A Probabilistic NF2 Relational Algebra for Integrated Information Retrieval and Database Systems
In Proceedings of the 2nd World Conference on Integrated Design and Process Technology, 1996
Cited by 9 (1 self)
The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document is represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers as in most IR models. This effect also can be used for ...
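The ranking effect described in this abstract can be sketched with a nested (NF2-style) representation in which each document tuple carries a subrelation of weighted index terms. This is only a hedged illustration, not the paper's operators: the sample data, the field names, and the `rank` helper are hypothetical, and a conjunctive query is scored by multiplying term weights under an assumed independence.

```python
# Each document is one nested tuple: its own weight plus a probabilistic
# subrelation mapping index terms to weights. All values are made up.
DOCS = [
    {"docno": "d1", "weight": 1.0, "terms": {"ir": 0.9, "db": 0.5}},
    {"docno": "d2", "weight": 1.0, "terms": {"ir": 0.4, "db": 0.8}},
    {"docno": "d3", "weight": 1.0, "terms": {"ir": 0.7}},
]


def rank(docs, query_terms):
    """Score each document for a conjunctive query and sort by decreasing
    probability. Documents with probability 0 are left out of the answer."""
    results = []
    for doc in docs:
        p = doc["weight"]
        for term in query_terms:
            # A missing term contributes weight 0 (assumption of this sketch).
            p *= doc["terms"].get(term, 0.0)
        if p > 0:
            results.append((doc["docno"], p))
    return sorted(results, key=lambda item: item[1], reverse=True)


single = rank(DOCS, ["ir"])
conjunctive = rank(DOCS, ["ir", "db"])
```

For the query `["ir"]` the order is d1, d3, d2. For `["ir", "db"]` only d1 (0.45) and d2 (0.32) remain, in that order.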
Information retrieval on empty fields
In HLT 2007, 2007
Cited by 8 (3 self)
We explore the problem of retrieving semistructured documents from a real-world collection using a structured query. We formally develop Structured Relevance Models (SRM), a retrieval model that is based on the idea that plausible values for a given field could be inferred from the context provided by the other fields in the record. We then carry out a set of experiments using a snapshot of the National Science Digital Library (NSDL) repository, and queries that only mention fields missing from the test data. For such queries, typical field matching would retrieve no documents at all. In contrast, the SRM approach achieves a mean average precision of over twenty percent.
Integration Of Complex Objects And Transitive Relationships For Information Retrieval
Cited by 4 (3 self)
In this paper we show that in advanced information retrieval (IR) applications capabilities for data aggregation, transitive computation and NF2 (non-first normal form) relational computation are often necessary at the same time. We demonstrate that complex objects are naturally modeled as NF2 relations whereas structures like hierarchical thesauri must be modeled for transitive computation. Transitive processing cannot be supported by structurally static structures like NF2 relations. We present a truly declarative query interface, which integrates data aggregation, transitive computation and NF2 relational computation. Thus the interface supports the retrieval and structural manipulation of complex objects (e.g., documents, and bibliographic references), their retrieval through transitive relationships (e.g., thesauri, citations) and data aggregation based on their components (e.g., citation counts, author productivity). Most importantly, users can formulate queries on a high abstraction level without mastering actual programming or database techniques.
The process strategy for the NF2 relational FRC-interface
Information & Software Technology, 1996
Cited by 2 (2 self)
this paper is based on a different approach. In it the user describes only the structure of the result NF2 relation in a straightforward and intuitive way. This starting point of our interface affords the possibility of formulating queries in a compact and truly declarative manner, also in those cases which require considerable restructuring among data. In this paper we consider the Prolog-based implementation of the query processing strategy of our interface. Special attention is paid to those principles and techniques in terms of which the representation and manipulation of complex structural relationships of NF2 relations can be managed in Prolog
A User-Oriented Interface for Generalized Informetric Analysis Based on Applying Advanced Data Modelling Techniques
Journal of Documentation, 2000
Cited by 2 (1 self)
This article presents a novel user-oriented interface for generalized informetric analysis and demonstrates how informetric calculations can easily and declaratively be specified through advanced data modeling techniques. The interface is declarative and at a high level. Therefore it is easy to use, flexible, and extensible. It enables end users to perform basic informetric ad hoc calculations easily and often with much less effort than in the contemporary online retrieval systems. It also provides several fruitful generalizations of typical informetric measurements like impact factors. These are based on substituting traditional loci of analysis, for instance journals, by other object types, such as authors, organizations, or countries. In the interface, bibliographic data are modeled as complex objects (non-first normal form relations) and terminological and citation networks involving transitive relationships are modeled as binary relations for deductive processing. The interface is flexible, because it makes it trivial to switch focus between various object types for informetric calculations, e.g. from authors to institutions.
Inhaltsverzeichnis
University of Dortmund, 1993
this document also contains fact data of different sorts (e.g. dates, names of persons and institutions). So users may want to ask for this information, too. That is, besides documents, they want to search for other types of objects, too. There may be two different goals behind such a need: