Data Integration: A Theoretical Perspective
- Symposium on Principles of Database Systems
, 2002
Cited by 585 (35 self)
Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues that are interesting from a theoretical point of view. This document presents an overview of the material to be presented in a tutorial on data integration. The tutorial is focused on some of the theoretical issues that are relevant for data integration. Special attention will be devoted to the following aspects: modeling a data integration application, processing queries in data integration, dealing with inconsistent data sources, and reasoning on queries.
Complexity of Answering Queries Using Materialized Views
- In PODS
, 1998
Cited by 248 (5 self)
We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressive view definition languages. The languages we consider for view definitions and user queries are: conjunctive queries with inequality, positive queries, datalog, and first-order logic. We show that the complexity of the problem depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it. Finally, we apply the results to the view consistency and view self-maintainability problems which arise in data warehousing. 1 Introduction The notion of materialized view is essential in databases [34] and is attracting more and more attention with the popularity of data warehouses [28]. The problem of answering queries using materialized views [24...
On the Decidability of Query Containment under Constraints
- In Proc. of the 17th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS’98)
, 1998
Cited by 222 (56 self)
Query containment under constraints is the problem of checking whether for every database satisfying a given set of constraints, the result of one query is a subset of the result of another query. Recent research points out that this is a central problem in several database applications, and we address it within a setting where constraints are specified in the form of special inclusion dependencies over complex expressions, built by using intersection and difference of relations, special forms of quantification, regular expressions over binary relations, and cardinality constraints. These types of constraints capture a great variety of data models, including the relational, the entity-relational, and the object-oriented model. We study the problem of checking whether q is contained in q′ with respect to the constraints specified in a schema S, where q and q′ are nonrecursive Datalog programs whose atoms are complex expressions. We present the following results on query containme...
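As a much simpler illustration of the base notion in the abstract above, the following sketch checks containment of plain conjunctive queries with no constraints at all, using the classical homomorphism test (q1 is contained in q2 exactly when q2 maps homomorphically onto the frozen body of q1 while preserving the head). It does not reproduce the paper's method for constraints or nonrecursive Datalog. The query encoding, the `?` prefix for variables, and the names `is_var` and `contained_in` are assumptions made for this example only.

```python
from itertools import product

def is_var(term):
    # Assumption of this sketch: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def contained_in(q1, q2):
    """Return True if conjunctive query q1 is contained in q2 (no constraints).

    A query is a pair (head, body): head is a tuple of terms and body is a
    list of atoms (relation_name, (term, ...)).
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False
    # Freeze q1: its atoms become the facts of a canonical database.
    facts = set(body1)
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1), key=repr)
    vars2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                   | {t for t in head2 if is_var(t)})
    # Brute-force search for a homomorphism from q2 into the frozen q1.
    for image in product(terms1, repeat=len(vars2)):
        h = dict(zip(vars2, image))
        def apply(t):
            return h.get(t, t)  # constants of q2 map to themselves
        if tuple(apply(t) for t in head2) != tuple(head1):
            continue
        if all((rel, tuple(apply(t) for t in args)) in facts for rel, args in body2):
            return True
    return False

# ans(x) :- R(x, y), R(y, x)
q_sym = (("?x",), [("R", ("?x", "?y")), ("R", ("?y", "?x"))])
# ans(a) :- R(a, b)
q_any = (("?a",), [("R", ("?a", "?b"))])
```

Here `contained_in(q_sym, q_any)` holds and the converse does not. The search is exponential in the number of variables of `q2`, which is expected since the problem is NP-complete for conjunctive queries.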
Query optimization in database systems
- ACM Computing Surveys
, 1984
Cited by 194 (0 self)
Efficient methods of processing unanticipated queries are a crucial prerequisite for the success of generalized database management systems. A wide variety of approaches to improve the performance of query evaluation algorithms have been proposed: logic-based and semantic transformations, fast implementations of basic operations, and combinatorial or heuristic algorithms for generating alternative access plans and choosing among them. These methods are presented in the framework of a general query evaluation procedure using the relational calculus representation of queries. In addition, nonstandard query optimization issues such as higher level query evaluation, query optimization in distributed databases, and use of database machines are addressed. The focus, however, is on query optimization in centralized database systems.
Tractable reasoning and efficient query answering in description logics: The DL-Lite family
- J. of Automated Reasoning
Cited by 147 (49 self)
We propose a new family of Description Logics (DLs), called DL-Lite, specifically tailored to capture basic ontology languages, while keeping low complexity of reasoning. Reasoning here means not only computing subsumption between concepts, and checking satisfiability of the whole knowledge base, but also answering complex queries (in particular, unions of conjunctive queries) over the instance level (ABox) of the DL knowledge base. We show that, for the DLs of the DL-Lite family, the usual DL reasoning tasks are polynomial in the size of the TBox, and query answering is LogSpace in the size of the ABox (i.e., in data complexity). To the best of our knowledge, this is the first result of polynomial time data complexity for query answering over DL knowledge bases. Notably, our logics allow for a separation between TBox and ABox reasoning during query evaluation: the part of the process requiring TBox reasoning is independent of the ABox, and the part of the process requiring access to the ABox can be carried out by an SQL engine, thus taking advantage of the query optimization strategies provided by current Data Base Management Systems. Since it can be shown that even slight extensions to the logics of the DL-Lite family make query answering at least NLogSpace in data complexity, thus ruling out the possibility of using off-the-shelf relational technology for query processing, we can conclude that the logics of the DL-Lite family are the maximal DLs supporting efficient query answering over large amounts of instances.
On the decidability and complexity of query answering over inconsistent and incomplete databases
- In Proc. of PODS 2003
, 2003
Cited by 96 (24 self)
In databases with integrity constraints, data may not satisfy the constraints. In this paper, we address the problem of obtaining consistent answers in such a setting, when key and inclusion dependencies are expressed on the database schema. We establish decidability and complexity results for query answering under different assumptions on data (soundness and/or completeness). In particular, after showing that the problem is in general undecidable, we identify the maximal class of inclusion dependencies under which query answering is decidable in the presence of key dependencies. Although obtained in a single database context, such results are directly applicable to data integration, where multiple information sources may provide data that are inconsistent with respect to the global view of the sources.
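To make the idea of consistent answers concrete, here is a minimal sketch restricted to a single key dependency on one relation. It uses the repair-based reading common in the consistent-query-answering literature: a repair keeps exactly one tuple per key value, and an answer is consistent if it is returned in every repair. This is an assumption chosen for illustration. It does not cover inclusion dependencies or the soundness and completeness assumptions studied in the paper, and the names `repairs` and `consistent_answers` are hypothetical.

```python
from itertools import product

def repairs(rows, key):
    """Yield every repair of `rows` under the key computed by `key(row)`.

    Under a key dependency, the maximal consistent subsets are obtained by
    keeping exactly one tuple from each group of tuples sharing a key value.
    """
    groups = {}
    for row in set(rows):
        groups.setdefault(key(row), []).append(row)
    for choice in product(*groups.values()):
        yield set(choice)

def consistent_answers(rows, key, query):
    """Return the answers that `query` produces in every repair."""
    answers = None
    for repair in repairs(rows, key):
        result = set(query(repair))
        answers = result if answers is None else answers & result
    return answers if answers is not None else set()

# Hypothetical relation employee(name, dept) with key `name`;
# "ann" violates the key because she appears with two departments.
employee = [("ann", "sales"), ("ann", "hr"), ("bob", "it")]

names = consistent_answers(employee, lambda r: r[0],
                           lambda db: {name for name, _ in db})
full_rows = consistent_answers(employee, lambda r: r[0], lambda db: db)
```

With this data, `names` contains both `ann` and `bob`, because every repair keeps some tuple for each of them, while `full_rows` contains only `("bob", "it")`. Enumerating repairs is exponential in the number of conflicting key groups, so this is a definition made executable rather than a practical algorithm.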
ILOG: Declarative Creation and Manipulation of Object Identifiers
, 1991
Cited by 84 (1 self)
This paper introduces ILOG¬, a declarative language in the style of (stratified) datalog¬, which can be used for querying, schema translation, and schema augmentation in the context of object-based data models. The semantics of ILOG is based on the use of Skolem functors, and is closely related to semantics for object-based data manipulation languages which provide mechanisms for explicit creation of object identifiers (OIDs). A normal form is presented for ILOG¬ programs not involving recursion through OID creation, which identifies a precise correspondence between OIDs created in the target, and values and OIDs in the source. The expressive power of various sublanguages of ILOG¬ is shown to range from a natural generalization of the conjunctive queries to the object-based context, to a language which can specify all computable database translations (up to duplicate copies). The issue of testing validity of ILOG programs translating one semantic schema to another is studied: cases are presented for which several validity issues (e.g., functional and/or subset relationships in the
Obtaining Complete Answers from Incomplete Databases
- In Proc. of the 22nd Int. Conf. on Very Large Data Bases (VLDB'96)
, 1996
Cited by 76 (7 self)
We consider the problem of answering queries from databases that may be incomplete. A database is incomplete if some tuples may be missing from some relations, and only a part of each relation is known to be complete. This problem arises in several contexts. For example, systems that provide access to multiple heterogeneous information sources often encounter incomplete sources. The question we address is to determine whether the answer to a specific given query is complete even when the database is incomplete. We present a novel sound and complete algorithm for the answer-completeness problem by relating it to the problem of independence of queries from updates. We also show an important case of the independence problem (and therefore of the answer-completeness problem) that can be decided in polynomial time, whereas the best known algorithm for this case is exponential. This case involves updates that are described using a conjunction of comparison predicates. We also describe an alg...
Query rewriting and answering under constraints in data integration systems
- In Proc. of the 18th Int. Joint Conf. on Artificial Intelligence (IJCAI 2003)
, 2003
Cited by 60 (22 self)
In this paper we address the problem of query answering and rewriting in global-as-view data integration systems, when key and inclusion dependencies are expressed on the global integration schema. In the case of sound views, we provide sound and complete rewriting techniques for a maximal class of constraints for which decidability holds. Then, we introduce a semantics which is able to cope with violations of constraints, and present a sound and complete rewriting technique for the same decidable class of constraints. Finally, we consider the decision problem of query answering and give decidability and complexity results.
How to decide query containment under constraints using a description logic
- In Proc. of the 7th Int. Conf. on Logic for Programming and Automated Reasoning (LPAR 2000), Lecture Notes in Artificial Intelligence
Cited by 50 (20 self)
Query containment under constraints is the problem of determining whether the result of one query is contained in the result of another query for every database satisfying a given set of constraints (derived, for example, from a schema). This problem is of particular importance in information integration (see [7]) and data warehousing

