Data Integration: A Theoretical Perspective
- Symposium on Principles of Database Systems
, 2002
Cited by 585 (35 self)
Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues that are interesting from a theoretical point of view. This document presents an overview of the material to be presented in a tutorial on data integration. The tutorial is focused on some of the theoretical issues that are relevant for data integration. Special attention will be devoted to the following aspects: modeling a data integration application, processing queries in data integration, dealing with inconsistent data sources, and reasoning on queries.
Constraint Query Languages
, 1992
Cited by 318 (35 self)
We investigate the relationship between programming with constraints and database query languages. We show that efficient, declarative database programming can be combined with efficient constraint solving. The key intuition is that the generalization of a ground fact, or tuple, is a conjunction of constraints over a small number of variables. We describe the basic Constraint Query Language design principles and illustrate them with four classes of constraints: real polynomial inequalities, dense linear order inequalities, equalities over an infinite domain, and boolean equalities. For the analysis, we use quantifier elimination techniques from logic and the concept of data complexity from database theory. This framework is applicable to managing spatial data and can be combined with existing multidimensional searching algorithms and data structures.
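The idea of a tuple generalized to a conjunction of constraints can be illustrated with a minimal sketch. It assumes a deliberately restricted constraint class (one closed interval per variable over the reals), which is much weaker than the four classes named in the abstract; the names `conjoin`, `intersect`, and `contains` are hypothetical and not taken from the paper.

```python
import math

# Assumption: a generalized tuple is a dict mapping each variable to a closed
# interval (lo, hi), read as the conjunction of "lo <= var <= hi" constraints.
# A generalized relation is a list of such tuples, read as their disjunction.
UNBOUNDED = (-math.inf, math.inf)


def conjoin(t1, t2):
    """Conjoin two generalized tuples; return None if the result is unsatisfiable."""
    out = {}
    for var in set(t1) | set(t2):
        lo1, hi1 = t1.get(var, UNBOUNDED)
        lo2, hi2 = t2.get(var, UNBOUNDED)
        lo, hi = max(lo1, lo2), min(hi1, hi2)
        if lo > hi:  # empty interval, so the conjunction has no solution
            return None
        out[var] = (lo, hi)
    return out


def intersect(r1, r2):
    """Intersect two generalized relations by conjoining their tuples pairwise."""
    result = []
    for a in r1:
        for b in r2:
            c = conjoin(a, b)
            if c is not None:
                result.append(c)
    return result


def contains(relation, point):
    """Check whether a concrete point satisfies some generalized tuple."""
    return any(
        all(lo <= point[var] <= hi for var, (lo, hi) in t.items())
        for t in relation
    )


# Two rectangles, each stored as one generalized tuple (an infinite point set).
rect_a = [{"x": (0, 4), "y": (0, 2)}]
rect_b = [{"x": (3, 6), "y": (1, 5)}]
overlap = intersect(rect_a, rect_b)
print(overlap == [{"x": (3, 4), "y": (1, 2)}])  # True
```

Each rectangle stands for infinitely many points but is stored as one finite constraint tuple, which is the sense in which the framework applies to spatial data.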
Complexity of Answering Queries Using Materialized Views
- In PODS
, 1998
Cited by 248 (5 self)
We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressive view definition languages. The languages we consider for view definitions and user queries are: conjunctive queries with inequality, positive queries, datalog, and first-order logic. We show that the complexity of the problem depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it. Finally, we apply the results to the view consistency and view self-maintainability problems which arise in data warehousing.
1 Introduction
The notion of materialized view is essential in databases [34] and is attracting more and more attention with the popularity of data warehouses [28]. The problem of answering queries using materialized views [24...
On the Decidability of Query Containment under Constraints
- In Proc. of the 17th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS’98)
, 1998
Cited by 222 (56 self)
Query containment under constraints is the problem of checking whether for every database satisfying a given set of constraints, the result of one query is a subset of the result of another query. Recent research points out that this is a central problem in several database applications, and we address it within a setting where constraints are specified in the form of special inclusion dependencies over complex expressions, built by using intersection and difference of relations, special forms of quantification, regular expressions over binary relations, and cardinality constraints. These types of constraints capture a great variety of data models, including the relational, the entity-relational, and the object-oriented model. We study the problem of checking whether q is contained in q′ with respect to the constraints specified in a schema S, where q and q′ are nonrecursive Datalog programs whose atoms are complex expressions. We present the following results on query containme...
Data Exchange: Semantics and Query Answering
- In ICDT
, 2003
Cited by 220 (28 self)
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution.
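A minimal sketch of the idea of computing a canonical universal solution and reading certain answers off it. It assumes the simplest setting only: source-to-target dependencies with no target constraints, all dependency arguments being variables, and conjunctive queries. The function names and the data layout are hypothetical and are not the paper's algorithm as published.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Null:
    """Labeled null standing for an unknown value invented during the chase."""
    id: int


def match(atoms, facts, binding=None):
    """Yield every variable binding that maps all atoms onto facts."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(binding)
        # setdefault binds a fresh variable or checks an existing binding
        if all(new.setdefault(a, v) == v for a, v in zip(args, fargs)):
            yield from match(rest, facts, new)


def chase(source, dependencies):
    """Naive chase: fire each dependency once per match of its body."""
    fresh = count(1)
    target = set()
    for body, head, existentials in dependencies:
        for b in match(body, sorted(source)):
            for var in existentials:
                b[var] = Null(next(fresh))  # one new null per firing
            for rel, args in head:
                target.add((rel, tuple(b[a] for a in args)))
    return target


def certain_answers(query_body, answer_vars, solution):
    """Evaluate a conjunctive query on the solution; keep null-free rows."""
    rows = set()
    for b in match(query_body, list(solution)):
        row = tuple(b[v] for v in answer_vars)
        if not any(isinstance(v, Null) for v in row):
            rows.add(row)
    return rows


# Hypothetical mapping: every employee has some (unknown) manager.
source = {("Emp", ("alice",)), ("Emp", ("bob",))}
deps = [([("Emp", ("e",))], [("Mgr", ("e", "m"))], ["m"])]
solution = chase(source, deps)

print(sorted(certain_answers([("Mgr", ("e", "m"))], ["e"], solution)))
# [('alice',), ('bob',)]
print(certain_answers([("Mgr", ("e", "m"))], ["m"], solution))
# set()
```

The second query returns nothing because the managers exist only as labeled nulls: no concrete manager is the same in every possible solution.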
Temporal Query Languages: a Survey
, 1995
Cited by 97 (11 self)
We define formal notions of temporal domain and temporal database, and use them to survey a wide spectrum of temporal query languages. We distinguish between an abstract temporal database and its concrete representations, and accordingly between abstract and concrete temporal query languages. We also address the issue of incomplete temporal information.
1 Introduction
A temporal database is a repository of temporal information. A temporal query language is any query language for temporal databases. In this paper we propose a formal notion of temporal database and use this notion in surveying a wide spectrum of temporal query languages. The need to store temporal information arises in many computer applications. Consider, for example, records of various kinds: financial [37], personnel, medical [98], or judicial. Also, monitoring data, e.g., in telecommunications network management [4] or process control, often has a temporal dimension. There has been a lot of research in temporal dat...
Constraint Programming and Database Query Languages
- In Proc. 2nd Conference on Theoretical Aspects of Computer Software (TACS)
, 1994
Cited by 61 (4 self)
The declarative programming paradigms used in constraint languages can lead to powerful extensions of Codd's relational data model. The development of constraint database query languages from logical database query languages has many similarities with the development of constraint logic programming from logic programming, but with the additional requirements of data-efficient, set-at-a-time, and bottom-up evaluation. In this overview of constraint query languages (CQLs) we first present the framework of [41]. The principal idea is that: "the k-tuple (or record) data type can be generalized by a conjunction of quantifier-free constraints over k variables". The generalization must preserve various language properties of the relational data model, e.g., the calculus/algebra equivalence, and have time complexity polynomial in the size of the data. We next present an algebra for dense order constraints that is simpler to evaluate than the calculus described in [41], and we sharpen some of...
The Complexity of Querying Indefinite Data about Linearly Ordered Domains
- In The Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
, 1992
Cited by 39 (2 self)
In applications dealing with ordered domains, the available data is frequently indefinite. While the domain is actually linearly ordered, only some of the order relations holding between points in the data are known. Thus, the data provides only a partial order, and query answering involves determining what holds under all the compatible linear orders. In this paper we study the complexity of evaluating queries in logical databases containing such indefinite information. We show that in this context queries are intractable even under the data complexity measure, but identify a number of PTIME sub-problems. Data complexity in the case of monadic predicates is one of these PTIME cases, but for disjunctive queries the proof is non-constructive, using well-quasi-order techniques. We also show that the query problem we study is equivalent to the problem of containment of conjunctive relational database queries containing inequalities. One of our results implies that the latter is $\Pi^p_2$ ...
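The notion of "what holds under all the compatible linear orders" can be made concrete with a brute-force sketch. It assumes that distinct names denote distinct points and that a query is given as a Python predicate over positions; both are simplifications introduced here, and the function names are hypothetical. The enumeration is exponential in the number of points, in keeping with the intractability result stated in the abstract.

```python
from itertools import permutations


def linear_extensions(points, known):
    """Yield each linear order (as a point -> position map) compatible with
    the known pairs (a, b), each meaning "a is before b"."""
    for perm in permutations(points):
        pos = {p: i for i, p in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in known):
            yield pos


def certainly(points, known, query):
    """True when the query holds in every compatible linear order.
    Vacuously true if the known pairs are inconsistent."""
    return all(query(pos) for pos in linear_extensions(points, known))


points = ["a", "b", "c"]
known = [("a", "b"), ("a", "c")]  # nothing is known about b versus c

# "b before c" holds in some compatible orders but not in all of them.
print(certainly(points, known, lambda pos: pos["b"] < pos["c"]))  # False

# The disjunction "b before c or c before b" holds in every one.
print(certainly(points, known,
                lambda pos: pos["b"] < pos["c"] or pos["c"] < pos["b"]))  # True
```

The disjunctive query is certain even though neither disjunct is, which is why disjunctive queries over indefinite order data need separate treatment.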
Relational Queries over Interpreted Structures
- Journal of the ACM
Cited by 21 (11 self)
We rework parts of the classical relational theory when the underlying domain is a structure with some interpreted operations that can be used in queries. We identify parts of the classical theory that go through 'as before' when interpreted structure is present, parts that go through only for classes of nicely-behaved structures, and parts that only arise in the interpreted case. The first category includes a number of results on language equivalence and expressive power characterizations for the active-domain semantics for a variety of logics. Under this semantics, quantifiers range over elements of a relational database. The main kind of results we prove here are generic collapse results: for generic queries, adding operations beyond order does not give us extra power. The second category includes results on the natural semantics, under which quantifiers range over the entire interpreted structure. We prove, for a variety of structures, natural-active collapse results, s...
On Containment of Conjunctive Queries with Arithmetic Comparisons
- Advances in Database Technology - EDBT
, 2004
Cited by 15 (6 self)
We study the following problem: how to test if Q2 is contained in Q1, where Q1 and Q2 are conjunctive queries with arithmetic comparisons? This problem is fundamental in a large variety of database applications. Existing algorithms first normalize the queries, then test a logical implication using multiple containment mappings from Q1 to Q2. We are interested in cases where the containment can be tested more efficiently. This work aims to (a) reduce the problem complexity from $\Pi^p_2$-completeness to NP-completeness in these cases; (b) utilize the advantages of the homomorphism property (i.e., the containment test is based on a single containment mapping) in applications such as those of answering queries using views; and (c) observe that many real queries have the homomorphism property. The following are our results. (1) We show several cases where the normalization step is not needed, thus reducing the size of the queries and the number of containment mappings. (2) We develop an algorithm for checking various syntactic conditions on queries, under which the homomorphism property holds. (3) We further reduce the conditions of these classes using practical domain knowledge that is easily obtainable. (4) We conducted experiments on real queries, and show that most of the queries pass this test.
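For orientation, a single containment mapping can be searched for as in the sketch below. It covers only plain conjunctive queries without arithmetic comparisons, where the classical result is that Q2 is contained in Q1 exactly when a containment mapping from Q1 to Q2 exists; it does not implement the paper's normalization or its syntactic conditions. The query encoding (a head tuple plus a list of atoms, with capitalized strings as variables) and the function names are assumptions made for illustration.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def extend(mapping, src, dst):
    """Try to extend a mapping so that the term list src maps onto dst."""
    if len(src) != len(dst):
        return None
    mapping = dict(mapping)
    for s, d in zip(src, dst):
        if is_var(s):
            if mapping.setdefault(s, d) != d:  # conflicts with an earlier choice
                return None
        elif s != d:  # constants must map to themselves
            return None
    return mapping


def containment_mapping(q1, q2):
    """Return a mapping from q1's terms to q2's terms that sends head to head
    and every atom of q1 to some atom of q2, or None if there is none."""
    (head1, body1), (head2, body2) = q1, q2

    def search(mapping, atoms):
        if not atoms:
            return mapping
        (rel, args), rest = atoms[0], atoms[1:]
        for rel2, args2 in body2:
            if rel2 != rel:
                continue
            candidate = extend(mapping, args, args2)
            if candidate is not None:
                found = search(candidate, rest)
                if found is not None:
                    return found
        return None  # backtrack

    start = extend({}, head1, head2)
    return None if start is None else search(start, body1)


def contained_in(q2, q1):
    """Test Q2 contained in Q1 for comparison-free conjunctive queries."""
    return containment_mapping(q1, q2) is not None


# Q1: q(X) :- E(X, Y)            nodes with an outgoing edge
# Q2: q(X) :- E(X, Y), E(Y, Z)   nodes that start a path of length two
q1 = (("X",), [("E", ("X", "Y"))])
q2 = (("X",), [("E", ("X", "Y")), ("E", ("Y", "Z"))])
print(contained_in(q2, q1))  # True
print(contained_in(q1, q2))  # False
```

When comparisons such as X < Y are present, one mapping is in general no longer enough, which is the case the abstract addresses through the homomorphism property.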

