Tractable reasoning and efficient query answering in description logics: The DL-Lite family
- J. of Automated Reasoning
- Cited by 147 (49 self)
We propose a new family of Description Logics (DLs), called DL-Lite, specifically tailored to capture basic ontology languages, while keeping low complexity of reasoning. Reasoning here means not only computing subsumption between concepts, and checking satisfiability of the whole knowledge base, but also answering complex queries (in particular, unions of conjunctive queries) over the instance level (ABox) of the DL knowledge base. We show that, for the DLs of the DL-Lite family, the usual DL reasoning tasks are polynomial in the size of the TBox, and query answering is LogSpace in the size of the ABox (i.e., in data complexity). To the best of our knowledge, this is the first result of polynomial time data complexity for query answering over DL knowledge bases. Notably, our logics allow for a separation between TBox and ABox reasoning during query evaluation: the part of the process requiring TBox reasoning is independent of the ABox, and the part of the process requiring access to the ABox can be carried out by an SQL engine, thus taking advantage of the query optimization strategies provided by current Data Base Management Systems. Since it can be shown that even slight extensions to the logics of the DL-Lite family make query answering at least NLogSpace in data complexity, thus ruling out the possibility of using off-the-shelf relational technology for query processing, we can conclude that the logics of the DL-Lite family are the maximal DLs supporting efficient query answering over large amounts of instances.
DL-Lite: Tractable description logics for ontologies
- In Proc. of AAAI 2005
- Cited by 142 (45 self)
We propose a new Description Logic, called DL-Lite, specifically tailored to capture basic ontology languages, while keeping low complexity of reasoning. Reasoning here means not only computing subsumption between concepts, and checking satisfiability of the whole knowledge base, but also answering complex queries (in particular, conjunctive queries) over the set of instances maintained in secondary storage. We show that in DL-Lite the usual DL reasoning tasks are polynomial in the size of the TBox, and query answering is polynomial in the size of the ABox (i.e., in data complexity). To the best of our knowledge, this is the first result of polynomial data complexity for query answering over DL knowledge bases. A notable feature of our logic is to allow for a separation between TBox and ABox reasoning during query evaluation: the part of the process requiring TBox reasoning is independent of the ABox, and the part of the process requiring access to the ABox can be carried out by an SQL engine, thus taking advantage of the query optimization strategies provided by current DBMSs.
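The separation between TBox and ABox reasoning described above can be illustrated with a deliberately tiny sketch. It assumes a TBox made only of atomic concept inclusions (far less than DL-Lite allows) and a single-atom query; the names `rewrite`, `answer`, `tbox`, and `abox` are hypothetical and not taken from the paper.

```python
def rewrite(concept, tbox):
    """TBox-only step: collect every concept subsumed by `concept`.

    `tbox` is a list of (sub, sup) pairs, each read as "sub is included in sup".
    The ABox is never consulted here.
    """
    result, frontier = {concept}, [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in tbox:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def answer(concept, tbox, abox):
    """Data-only step: evaluate the union of rewritten atoms over the ABox."""
    union_of_atoms = rewrite(concept, tbox)
    return set().union(*(abox.get(c, set()) for c in union_of_atoms))


# Hypothetical knowledge base used only for illustration.
tbox = [("Professor", "Teacher"), ("AssistantProfessor", "Professor")]
abox = {"Teacher": {"carol"}, "AssistantProfessor": {"dave"}}
```

In a real setting the second step would be a union of SQL queries handed to a relational engine; the in-memory dictionary stands in for that here.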
Logical foundations of peer-to-peer data integration
- In Proc. of the 23rd ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS 2004)
- Cited by 77 (12 self)
In peer-to-peer data integration, each peer exports data in terms of its own schema, and data interoperation is achieved by means of mappings among the peer schemas. Peers are autonomous systems and mappings are dynamically created and changed. One of the challenges in these systems is answering queries posed to one peer taking into account the mappings. Obviously, query answering strongly depends on the semantics of the overall system. In this paper, we compare the commonly adopted approach of interpreting peer-to-peer systems using a first-order semantics, with an alternative approach based on epistemic logic. We consider several central properties of peer-to-peer systems: modularity, generality, and decidability. We argue that the approach based on epistemic logic is superior with respect to all the above properties. In particular, we show that, in systems in which peers have decidable schemas and conjunctive mappings, but are arbitrarily interconnected, the first-order approach may lead to undecidability of query answering, while the epistemic approach always preserves decidability. This is a fundamental property, since the actual interconnections among peers are not under the control of any actor in the system.
Minimal-Change Integrity Maintenance Using Tuple Deletions
- Information and Computation, 2005
- Cited by 67 (8 self)
We address the problem of minimal-change integrity maintenance in the context of integrity constraints in relational databases. We assume that integrity-restoration actions are limited to tuple deletions. We focus on two basic computational issues: repair checking (is a database instance a repair of a given database?) and consistent query answers [3] (is a tuple an answer to a given query in every repair of a given database?). We study the computational complexity of both problems, delineating the boundary between the tractable and the intractable cases. We consider denial constraints, general functional and inclusion dependencies, as well as key and foreign key constraints. Our results shed light on the computational feasibility of minimal-change integrity maintenance. The tractable cases should lead to practical implementations. The intractability results highlight the inherent limitations of any integrity enforcement mechanism, e.g., triggers or referential constraint actions, as a way of performing minimal-change integrity maintenance.
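The two notions named above, repairs obtained by tuple deletions and consistent query answers, can be made concrete with a brute-force sketch. It assumes a single key constraint on the first attribute and enumerates all subsets, so it is exponential and only meant for toy instances; the function names and the `emp` relation are hypothetical.

```python
from itertools import combinations


def violates_key(instance, key_index=0):
    """True if two distinct tuples agree on the key attribute."""
    keys = [t[key_index] for t in instance]
    return len(keys) != len(set(keys))


def repairs(instance):
    """All maximal subsets of `instance` that satisfy the key constraint."""
    tuples = sorted(instance)
    consistent = [
        frozenset(subset)
        for size in range(len(tuples) + 1)
        for subset in combinations(tuples, size)
        if not violates_key(subset)
    ]
    # Keep only subsets that are not strictly contained in another consistent one.
    return [s for s in consistent if not any(s < other for other in consistent)]


def consistent_answers(instance, query):
    """Tuples that `query` returns in every repair of `instance`."""
    answer_sets = [set(query(r)) for r in repairs(instance)]
    return set.intersection(*answer_sets)


# Hypothetical inconsistent relation: 'ann' appears with two salaries.
emp = {("ann", 1000), ("ann", 2000), ("bob", 1500)}


def names(db):
    return {name for name, _ in db}
```

Here `emp` has two repairs, one for each choice of salary for `ann`. The name `ann` is a consistent answer to `names`, but neither of her full tuples is a consistent answer to the identity query.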
Query rewriting and answering under constraints in data integration systems
- In Proc. of the 18th Int. Joint Conf. on Artificial Intelligence (IJCAI 2003)
- Cited by 60 (22 self)
In this paper we address the problem of query answering and rewriting in global-as-view data integration systems, when key and inclusion dependencies are expressed on the global integration schema. In the case of sound views, we provide sound and complete rewriting techniques for a maximal class of constraints for which decidability holds. Then, we introduce a semantics which is able to cope with violations of constraints, and present a sound and complete rewriting technique for the same decidable class of constraints. Finally, we consider the decision problem of query answering and give decidability and complexity results.
Clean answers over dirty databases: A probabilistic approach
- In Proc. ICDE, 2006
- Cited by 49 (2 self)
The detection of duplicate tuples, corresponding to the same real-world entity, is an important task in data integration and cleaning. While many techniques exist to identify such tuples, the merging or elimination of duplicates can be a difficult task that relies on ad-hoc and often manual solutions. We propose a complementary approach that permits declarative query answering over duplicated data, where each duplicate is associated with a probability of being in the clean database. We rewrite queries over a database containing duplicates to return each answer with the probability that the answer is in the clean database. Our rewritten queries are sensitive to the semantics of duplication and help a user understand which query answers are most likely to be present in the clean database. The semantics that we adopt is independent of the way the probabilities are produced, but is able to effectively exploit them during query answering. In the absence of external knowledge that associates each database tuple with a probability, we offer a technique, based on tuple summaries, that automates this task. We experimentally study the performance of our rewritten queries. Our studies show that the rewriting does not introduce a significant overhead in query execution time. This work is done in the context of the ConQuer project at the University of Toronto, which focuses on the efficient management of inconsistent and dirty databases.
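As a rough illustration of returning each answer with a probability, the sketch below assumes that duplicates of one entity share a `cluster_id`, that exactly one tuple per cluster survives in the clean database, and that the probabilities within a cluster sum to 1. Under those assumptions, the probability that an entity satisfies a simple selection is the sum of the probabilities of its qualifying duplicates. The table, column names, and query are hypothetical and are not the rewriting defined in the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cluster_id TEXT, city TEXT, prob REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        ("c1", "Toronto", 0.5),   # three duplicates of entity c1
        ("c1", "Toronto", 0.25),
        ("c1", "Ottawa", 0.25),
        ("c2", "Ottawa", 1.0),    # entity c2 has no duplicates
    ],
)

# Selection "entities located in Toronto", rewritten to aggregate probabilities
# over the mutually exclusive duplicates of each entity.
REWRITTEN = """
SELECT cluster_id, SUM(prob) AS probability
FROM customer
WHERE city = 'Toronto'
GROUP BY cluster_id
ORDER BY cluster_id
"""


def clean_answer_probabilities():
    return conn.execute(REWRITTEN).fetchall()
```

Queries with joins would need the probabilities of tuples from different clusters to be combined as well, which this single-table sketch does not attempt.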
10^(10^6) Worlds and Beyond: Efficient Representation and Processing of Incomplete Information
- 2006
- Cited by 46 (6 self)
Current systems and formalisms for representing incomplete information generally suffer from at least one of two weaknesses. Either they are not strong enough for representing results of simple queries, or the handling and processing of the data, e.g. for query evaluation, is intractable. In this paper, we present a decomposition-based approach to addressing this problem. We introduce world-set decompositions (WSDs), a space-efficient formalism for representing any finite set of possible worlds over relational databases. WSDs are therefore a strong representation system for any relational query language. We study the problem of efficiently evaluating relational algebra queries on sets of worlds represented by WSDs. We also evaluate our technique experimentally in a large census data scenario and show that it is both scalable and efficient.
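The idea behind a decomposition-based representation can be sketched as follows: a set of worlds is stored as a list of independent components, and each world is obtained by picking one alternative from every component. Storage then grows with the sum of the component sizes, while the number of worlds is their product. The field names and values below are hypothetical, and the sketch does not cover query evaluation on the decomposition.

```python
from itertools import product
from math import prod

# Each component lists alternative assignments for the fields it covers.
# Fields inside one component are correlated; components are independent.
components = [
    [{"t1.ssn": 185, "t2.ssn": 186}, {"t1.ssn": 785, "t2.ssn": 185}],
    [{"t1.status": "single"}, {"t1.status": "married"}],
    [{"t2.status": "widowed"}],
]


def world_count(components):
    """Number of represented worlds: the product of the component sizes."""
    return prod(len(component) for component in components)


def worlds(components):
    """Enumerate the worlds by choosing one alternative per component."""
    for choice in product(*components):
        world = {}
        for local_assignment in choice:
            world.update(local_assignment)
        yield world
```

Enumerating worlds is shown only to make the semantics visible; the point of such a representation is to avoid that enumeration.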
First-Order Query Rewriting for Inconsistent Databases
- In Proc. International Conference on Database Theory (ICDT 05), Springer LNCS 3363, 2005
- Cited by 45 (1 self)
We consider the problem of retrieving consistent answers over databases that might be inconsistent with respect to a set of integrity constraints. In particular, we concentrate on sets of constraints that consist of key dependencies, and we give an algorithm that computes the consistent answers for a large and practical class of conjunctive queries. Given a query q, the algorithm returns a first-order query Q (called a query rewriting) such that for every (potentially inconsistent) database I, the consistent answers for q can be obtained by evaluating Q directly on I. © 2006 Published by Elsevier Inc.
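A small worked case may help show what such a rewriting looks like. The sketch assumes a relation `emp(name, salary)` whose intended key `name` is not enforced, and the query "names with salary above 1000". The rewritten SQL was written by hand for this one query and is not produced by the algorithm the abstract refers to; table and variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The key on 'name' is intentionally NOT declared, so the data can violate it.
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("ann", 900), ("ann", 2000), ("bob", 1500)],
)

ORIGINAL = "SELECT DISTINCT name FROM emp WHERE salary > 1000"

# A name is a consistent answer only if every tuple sharing its key value
# satisfies the selection, i.e. no conflicting tuple falsifies it.
REWRITTEN = """
SELECT DISTINCT e.name
FROM emp AS e
WHERE e.salary > 1000
  AND NOT EXISTS (
      SELECT 1 FROM emp AS e2
      WHERE e2.name = e.name AND e2.salary <= 1000
  )
"""


def run(sql):
    return sorted(row[0] for row in conn.execute(sql))
```

Evaluated directly on the inconsistent table, `run(ORIGINAL)` includes `ann`, while `run(REWRITTEN)` keeps only `bob`, because one repair retains the tuple `("ann", 900)`.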
A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification
- In ACM SIGMOD International Conference on Management of Data, 2005
- Cited by 42 (4 self)
Data integrated from multiple sources may contain inconsistencies that violate integrity constraints. The constraint repair problem attempts to find “low cost” changes that, when applied, will cause the constraints to be satisfied. While in most previous work repair cost is stated in terms of tuple insertions and deletions, we follow recent work to define a database repair as a set of value modifications. In this context, we introduce a novel cost framework that allows for the application of techniques from record-linkage to the search for good repairs. We prove that finding minimal-cost repairs in this model is NP-complete in the size of the database, and introduce an approach to heuristic repair-construction based on equivalence classes of attribute values. Following this approach, we define two greedy algorithms. While these simple algorithms take time cubic in the size of the database, we develop optimizations inspired by algorithms for duplicate-record detection that greatly improve scalability. We evaluate our framework and algorithms on synthetic and real data, and show that our proposed optimizations greatly improve performance at little or no cost in repair quality.
Efficient Evaluation of Logic Programs for Querying Data Integration Systems
- 2003
- Cited by 37 (5 self)
Many data integration systems provide transparent access to heterogeneous data sources through a unified view of all data in terms of a global schema, which may be equipped with integrity constraints on the data. Since these constraints might be violated by the data retrieved from the sources, methods for handling such a situation are needed. To this end, recent approaches model query answering in data integration systems in terms of nonmonotonic logic programs.

