Results 1 - 8 of 8
Efficient Query Evaluation on Probabilistic Databases, 2004
Cited by 275 (36 self)
We describe a system that supports arbitrarily complex SQL queries with "uncertain" predicates. The query semantics is based on a probabilistic model, and the results are ranked, much like in Information Retrieval. Our main focus is efficient query evaluation, a problem that has not received attention in the past. We describe an optimization algorithm that can efficiently compute most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm.
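As a rough illustration of the Monte-Carlo idea mentioned in this abstract, the sketch below estimates the probability of a Boolean query by sampling possible worlds. It is not the algorithm from the paper: it assumes a tuple-independent database (each tuple is present independently with a given probability), and the names `exact_probability`, `monte_carlo_probability`, `some_r_joins_s`, and the sample data are hypothetical.

```python
import random

def exact_probability(tuples, query):
    """Sum the weight of every possible world in which the query holds.

    Assumes tuple independence. Enumerates 2^n worlds, so it is only
    usable as a reference on tiny inputs.
    """
    items = list(tuples.items())
    total = 0.0
    for mask in range(1 << len(items)):
        world = set()
        weight = 1.0
        for i, (t, p) in enumerate(items):
            if (mask >> i) & 1:
                world.add(t)
                weight *= p
            else:
                weight *= 1.0 - p
        if query(world):
            total += weight
    return total

def monte_carlo_probability(tuples, query, samples, rng):
    """Estimate the query probability as the fraction of sampled worlds
    in which the query holds."""
    hits = 0
    for _ in range(samples):
        world = {t for t, p in tuples.items() if rng.random() < p}
        if query(world):
            hits += 1
    return hits / samples

def some_r_joins_s(world):
    """Boolean query: is there an R(x) with a matching S(x, y)?"""
    r_values = {t[1] for t in world if t[0] == "R"}
    return any(t[0] == "S" and t[1] in r_values for t in world)

# Hypothetical data: each tuple maps to its marginal probability.
DB = {("R", "a"): 0.5, ("S", "a", "b"): 0.4, ("S", "a", "c"): 0.5}

if __name__ == "__main__":
    # Exact value by hand: 0.5 * (1 - 0.6 * 0.5) = 0.35
    print(exact_probability(DB, some_r_joins_s))
    print(monte_carlo_probability(DB, some_r_joins_s, 20000, random.Random(0)))
```

The sampling estimate avoids the exponential enumeration, at the cost of an error that shrinks roughly with the square root of the number of samples.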
Modeling Uncertainty In Deductive Databases
- Proc. Int. Conf. on Database Expert Systems and Applications (DEXA'94), 1994
Cited by 30 (2 self)
The Information Source Tracking (IST) method has been developed recently for the modeling and manipulation of uncertain and inaccurate data in relational databases. In this paper we extend the IST method to deductive databases. We show that positive uncertain databases, i.e. IST-based deductive databases with only positive literals in the heads and the bodies of the rules, enjoy a least model/least fixpoint semantics. Query processing in this model is studied next. We extend the top-down and bottom-up evaluation techniques of logic programming and deductive databases to our model. Finally, we study negation for uncertain databases, concentrating on stratified uncertain databases.
1 Introduction
Database systems are evolving into knowledge-base systems, and are increasingly used in applications where handling inaccurate data is essential. In a recent study, uncertainty management was listed as one of the important future challenges in database research. "Further research [in un...
Query Evaluation with Soft-Key Constraints
Cited by 3 (0 self)
Key violations often occur in real-life datasets, especially in those integrated from different sources. Enforcing constraints strictly on these datasets is not feasible. In this paper we formalize the notion of soft-key constraints on probabilistic databases, which allow for violation of key constraints by penalizing every violating world by a quantity proportional to the violation. To represent our probabilistic database with constraints, we define a class of Markov networks, where we can do query evaluation in PTIME. We also study the evaluation of conjunctive queries on relations with soft keys and present a dichotomy that separates this set into those in PTIME and the rest which are #P-hard.
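One possible reading of "penalizing every violating world" can be sketched by brute force over possible worlds. This is an illustration only, not the paper's Markov network construction or its PTIME algorithm. The assumptions are: tuples start out independent, each extra tuple that shares a key value multiplies the world's weight by a factor `penalty` in [0, 1], and the weights are then normalized. The names `soft_key_distribution`, `key_of`, and `penalty` are hypothetical.

```python
from collections import Counter
from itertools import product

def soft_key_distribution(tuples, key_of, penalty):
    """Return {world: probability} under an assumed soft-key model.

    tuples  -- dict mapping each tuple to its independent marginal probability
    key_of  -- function extracting the key value from a tuple
    penalty -- multiplicative factor applied once per key violation
    """
    items = list(tuples.items())
    weights = {}
    for choice in product([False, True], repeat=len(items)):
        world = frozenset(t for (t, _), keep in zip(items, choice) if keep)
        weight = 1.0
        for (_, p), keep in zip(items, choice):
            weight *= p if keep else 1.0 - p
        # A key value shared by n tuples counts as n - 1 violations.
        key_counts = Counter(key_of(t) for t in world)
        violations = sum(n - 1 for n in key_counts.values())
        weights[world] = weight * penalty ** violations
    total = sum(weights.values())
    return {world: w / total for world, w in weights.items()}

# Hypothetical data: two conflicting cities for the same person.
PEOPLE = {("alice", "NYC"): 0.5, ("alice", "LA"): 0.5}

if __name__ == "__main__":
    dist = soft_key_distribution(PEOPLE, lambda t: t[0], 0.5)
    for world, prob in dist.items():
        print(sorted(world), prob)
```

In this sketch, `penalty = 1` leaves the independent distribution unchanged, and `penalty = 0` behaves like a hard key, since every violating world gets weight zero.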
Recognizing Credible Experts in Inaccurate Databases, 1994
Cited by 2 (1 self)
While the problem of incomplete data in databases has been extensively studied, a relatively unexplored form of uncertainty in databases, called inaccurate data, demands due attention. Inaccurate data results when data are contributed by various information agents with associated credibility. Though the data itself is total or complete, the reliability of the data now depends on the agents' credibility. Several issues of this form of data reliability have been reported recently where the credibility of agents was assumed to be known, static and uniform throughout the database. In this paper we address the issue of credibility maintenance of information agents and take the view that the agent credibility is dynamic and is a function of the database knowledge, the agent's performance relative to other agents, and the agent's expertise. We present a method to identify agents' fields of expertise (called contexts) and use agents' context-dependent credibility to calculate ...
Schema Design for Uncertain Databases
Cited by 1 (0 self)
We address schema design in uncertain databases. Since uncertain data is relational in nature, decomposition becomes a key issue in design. Decomposition relies on dependency theory, and primarily on functional dependencies. We study the theory of functional dependencies (FDs) for uncertain relations. We define several kinds of horizontal FDs and vertical FDs, each of which is consistent with conventional FDs when an uncertain relation doesn’t contain any uncertainty. In addition to standard forms of decompositions allowed by ordinary relations, our FDs allow more complex decompositions specific to uncertain data. We show how our theory of FDs can be used for lossless decomposition of uncertain relations. We then present algorithms and complexity results for three fundamental problems with respect to FDs over ordinary and uncertain relations: (1) Testing whether a relation instance satisfies an FD; (2) Finding all FDs satisfied by a relation instance; and (3) Inferring all FDs that hold in the result of a query over uncertain relations with FDs. We also give a sound and complete axiomatization of horizontal and vertical FDs. We look at keys as a special case of FDs. Finally, we briefly consider uncertain data that contains confidence values.
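Problem (1) above, testing whether a relation instance satisfies an FD, can be sketched for the ordinary (certain) case as follows. This covers only conventional FDs; the horizontal and vertical FDs for uncertain relations are defined in the paper and are not reproduced here. The function name `satisfies_fd` and the sample rows are hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in an ordinary relation.

    rows -- iterable of dicts mapping attribute name to value
    lhs  -- list of determining attribute names
    rhs  -- list of determined attribute names

    The FD holds when no two rows agree on lhs but differ on rhs.
    Runs in a single pass using a hash map from lhs values to rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        # and returns the stored one on later visits.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical relation instance.
EMPLOYEES = [
    {"emp": "ann", "dept": "db", "city": "Seattle"},
    {"emp": "bob", "dept": "db", "city": "Seattle"},
    {"emp": "cy", "dept": "ir", "city": "Boston"},
]

if __name__ == "__main__":
    print(satisfies_fd(EMPLOYEES, ["dept"], ["city"]))  # dept -> city
    print(satisfies_fd(EMPLOYEES, ["city"], ["emp"]))   # city -> emp
```

Here `dept -> city` holds because each department appears with a single city, while `city -> emp` fails because Seattle appears with two different employees.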
Trusting an Information Agent
- In Proceedings of the International Workshop on Rough Sets and Knowledge Discovery, 1993
While the common kinds of uncertainties in databases (e.g., null values, disjunction, corrupt/missing data, domain mismatch, etc.) have been extensively studied, a relatively unexplored form of uncertainty in databases, called inaccurate data, demands due attention. Inaccurate data results when data are contributed by various information agents with some known reliability. Though the data itself is total or complete, the reliability of the data now depends on the agent's reliability. Several issues of this form of data reliability have been reported recently where the reliability of agents was assumed to be known and static. In this paper we address the issue of reliability maintenance of information agents and take the view that the agent reliability is dynamic and is a function of the database knowledge and the agent evidences (facts that are observed to be true or false). We propose a method of quantifying the level of trust (or the agent reliability) that the datab...
Query Evaluation with Soft Keys, 2011
Key violations often occur in real-life datasets, especially in those integrated from different sources. Enforcing constraints strictly on these datasets is not feasible. In this paper we formalize the notion of soft-key constraints on probabilistic databases, which allow for violation of key constraints by penalizing every violating world by a quantity proportional to the violation. To represent our probabilistic database with constraints, we define a class of Markov networks, where we can do query evaluation in PTIME. We also study the evaluation of conjunctive queries on relations with soft keys and present a dichotomy that separates this set into those in PTIME and the rest which are #P-hard.

