Results 1 - 10
of
65
Data Integration: A Theoretical Perspective
- Symposium on Principles of Database Systems
, 2002
"... Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues that are interestin ..."
Abstract
-
Cited by 585 (35 self)
- Add to MetaCart
Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues that are interesting from a theoretical point of view. This document presents on overview of the material to be presented in a tutorial on data integration. The tutorial is focused on some of the theoretical issues that are relevant for data integration. Special attention will be devoted to the following aspects: modeling a data integration application, processing queries in data integration, dealing with inconsistent data sources, and reasoning on queries.
Answering Queries Using Views: A Survey
, 2000
"... The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a w ..."
Abstract
-
Cited by 395 (27 self)
- Add to MetaCart
The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a wide variety of data management problems. In query optimization, finding a rewriting of a query using a set of materialized views can yield a more efficient query execution plan. To support the separation of the logical and physical views of data, a storage schema can be described using views over the logical schema. As a result, finding a query execution plan that accesses the storage amounts to solving the problem of answering queries using views. Finally, the problem arises in data integration systems, where data sources can be described as precomputed views over a mediated schema. This article surveys the state of the art on the problem of answering queries using views, and synthesizes the disparate works into a coherent framework. We describe the different applications of the problem, the algorithms proposed to solve it and the relevant theoretical results.
Complexity of Answering Queries Using Materialized Views
- In PODS
, 1998
"... We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressi ..."
Abstract
-
Cited by 248 (5 self)
- Add to MetaCart
We study the complexity of the problem of answering queries using materialized views. This problem has attracted a lot of attention recently because of its relevance in data integration. Previous work considered only conjunctive view definitions. We examine the consequences of allowing more expressive view definition languages. The languageswe consider for view definitions and user queries are: conjunctive queries with inequality, positive queries, datalog, and first-order logic. We show that the complexity of the problem depends on whether views are assumed to store all the tuples that satisfy the view definition, or only a subset of it. Finally, we apply the results to the view consistency and view self-maintainability problems which arise in data warehousing. 1 Introduction The notion of materialized view is essential in databases [34] and is attracting more and more attention with the popularity of data warehouses [28]. The problem of answering queries using materialized views [24...
CARIN: A Representation Language Combining Horn Rules and Description Logics
, 1996
"... . We describe CARIN, a novel family of representation languages, which integrate the expressive power of Horn rules and of description logics. We address the key issue in designing such a language, namely, providing a sound and complete inference procedure. We identify existential entailment as a c ..."
Abstract
-
Cited by 94 (1 self)
- Add to MetaCart
. We describe CARIN, a novel family of representation languages, which integrate the expressive power of Horn rules and of description logics. We address the key issue in designing such a language, namely, providing a sound and complete inference procedure. We identify existential entailment as a core problem in reasoning in CARIN, and describe an existential entailment algorithm for CARIN languages whose description logic component is ALCNR. This algorithm entails several important results for reasoning in CARIN, most notably: (1) a sound and complete inference procedure for non recursive CARIN-ALCNR, and (2) an algorithm for determining rule subsumption over ALCNR. 1 Introduction Horn rule languages have formed the basis for many Artificial Intelligence application languages because their expressive power is sufficient for many applications, and they have good computational properties. One of the significant limitations of Horn rules is that they are not expressive enough to mod...
Logic-Based Techniques In Data Integration
, 1999
"... The data integration problem is to provide uniform access to multiple heterogeneous information sources available online (e.g., databases on the WWW). This problem has recently received considerable attention from researchers in the fields of Artificial Intelligence and Database Systems. The data in ..."
Abstract
-
Cited by 87 (0 self)
- Add to MetaCart
The data integration problem is to provide uniform access to multiple heterogeneous information sources available online (e.g., databases on the WWW). This problem has recently received considerable attention from researchers in the fields of Artificial Intelligence and Database Systems. The data integration problem is complicated by the facts that (1) sources contain closely related and overlapping data, (2) data is stored in multiple data models and schemas, and (3) data sources have differing query processing capabilities. A key element in a data integration system is the language used to describe the contents and capabilities of the data sources. While such a language needs to be as expressive as possible, it should also enable to efficiently address the main inference problem that arises in this context: to translate a user query that is formulated over a mediated schema into a query on the local schemas. This paper describes several lanaguages for describing contents of data sources, ...
The Use of CARIN Language And Algorithms For Information Integration: The PICSEL System
- INTELLIGENT INFORMATION INTEGRATION WORKSHOP ASSOCIATED WITH ECAI’98 CONFERENCE
, 1999
"... ... In this paper, we describe the way the expressive power of the Carin language is exploited in the Picsel information integration system, while maintaining the decidability of query answering. We illustrate it on examples coming from the tourism domain, which is the first real case that we have t ..."
Abstract
-
Cited by 68 (4 self)
- Add to MetaCart
... In this paper, we describe the way the expressive power of the Carin language is exploited in the Picsel information integration system, while maintaining the decidability of query answering. We illustrate it on examples coming from the tourism domain, which is the first real case that we have to consider in Picsel, in collaboration with the travel agency Degriftour.
Rewriting of Regular Expressions and Regular Path Queries
, 2002
"... Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a regular expression. Also, in semi-structured data, as well as in da ..."
Abstract
-
Cited by 66 (23 self)
- Add to MetaCart
Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a regular expression. Also, in semi-structured data, as well as in data integration, data warehousing, and query optimization, the problem of view-based query rewriting is receiving much attention: Given a query and a collection of views, generate a new query which uses the views and provides the answer to the original one. In this paper we address the problem of view-based query rewriting in the context of semi-structured data. We present a method for computing the rewriting of a regular expression E in terms of other regular expressions. The method computes the exact rewriting (the one that defines the same regular language as E) if it exists, or the rewriting that defines the maximal language contained in the one defined by E, otherwise. We present a complexity analysis of both the problem and the method, showing that the latter is essentially optimal. Finally, we illustrate how to exploit the method for view-based rewriting of regular path queries in semi-structured data. The complexity results established for the rewriting of regular expressions apply also to the case of regular path queries.
Theory of Answering Queries Using Views
- SIGMOD Record
, 2000
"... The problem of answering queries using views is to nd ecient methods of answering a query using a set of previously materialized views over the database, rather than accessing the database relations. The problem has recently received signicant attention because of its relevance to a wide variety of ..."
Abstract
-
Cited by 62 (0 self)
- Add to MetaCart
The problem of answering queries using views is to nd ecient methods of answering a query using a set of previously materialized views over the database, rather than accessing the database relations. The problem has recently received signicant attention because of its relevance to a wide variety of data management problems, such as query optimization, the maintenance of physical data independence, data integration and data warehousing. This article surveys the theoretical issues concerning the problem of answering queries using views. 1
DATA INTEGRATION IN DATA WAREHOUSING
, 2001
"... Information integration is one of the most important aspects of a Data Warehouse. When data passes from the sources of the application-oriented operational environment to the Data Warehouse, possible inconsistencies and redundancies should be resolved, so that the warehouse is able to provide an int ..."
Abstract
-
Cited by 61 (19 self)
- Add to MetaCart
Information integration is one of the most important aspects of a Data Warehouse. When data passes from the sources of the application-oriented operational environment to the Data Warehouse, possible inconsistencies and redundancies should be resolved, so that the warehouse is able to provide an integrated and reconciled view of data of the organization. We describe a novel approach to data integration in Data Warehousing. Our approach is based on a conceptual representation of the Data Warehouse application domain, and follows the so-called local-as-view paradigm: both source and Data Warehouse relations are defined as views over the conceptual model. We propose a technique for declaratively specifying suitable reconciliation correspondences to be used in order to solve conflicts among data in different sources. The main goal of the method is to support the design of mediators that materialize the data in the Data Warehouse relations. Starting from the specification of one such relation as a query over the conceptual model, a rewriting algorithm reformulates the query in terms of both the source relations and the reconciliation correspondences, thus obtaining a correct specification of how to load the data in the materialized view.

