Query optimization in the presence of limited access patterns
- In Proc. of ACM SIGMOD Conf. on Management of Data
, 1999
Cited by 101 (9 self)
We consider the problem of query optimization in the presence of limitations on access patterns to the data (i.e., when one must provide values for one of the attributes of a relation in order to obtain tuples). We show that in the presence of limited access patterns we must search a space of annotated query plans, where the annotations describe the inputs that must be given to the plan. We describe a theoretical and experimental analysis of the resulting search space and a novel query optimization algorithm that is designed to perform well under the different conditions that may arise. The algorithm searches the set of annotated query plans, pruning invalid and non-viable plans as early as possible in the search space, and it also uses a best-first search strategy in order to produce a first complete plan early in the search. We describe experiments to illustrate the performance of our algorithm.
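To make the notion of a limited access pattern concrete, here is a minimal Python sketch of checking whether a left-to-right ordering of subgoals is executable. It is not the paper's optimization algorithm: the `ACCESS` catalog, the relation names, and the `bound` argument (standing in for a plan's input annotation) are all hypothetical, and the sketch enumerates every ordering instead of pruning or searching best-first.

```python
from itertools import permutations

# Hypothetical catalog: one letter per attribute of each relation.
# 'b' = a value must be supplied for that attribute, 'f' = free.
ACCESS = {"Cities": "f", "Flights": "bf", "Hotels": "bf"}


def executable(order, bound=frozenset()):
    """Return True if the subgoals in `order` can be run left to right.

    `bound` holds the variables supplied as inputs to the plan.
    """
    known = set(bound)
    for relation, args in order:
        pattern = ACCESS[relation]
        needed = {arg for arg, p in zip(args, pattern) if p == "b"}
        if not needed <= known:
            return False  # a required input is not yet available
        known.update(args)  # every variable of the subgoal is now bound
    return True


def valid_orders(subgoals, bound=frozenset()):
    """Enumerate all executable orderings (exhaustive, for illustration)."""
    return [o for o in permutations(subgoals) if executable(o, bound)]


# Hypothetical query: Cities(c), Flights(c, f), Hotels(c, h)
query = [("Cities", ("c",)), ("Flights", ("c", "f")), ("Hotels", ("c", "h"))]
print(len(valid_orders(query)))               # 2: Cities must come first
print(len(valid_orders(query, bound={"c"})))  # 6: c is given as plan input
```

With no inputs, only the orderings that start with `Cities` are valid, because `Flights` and `Hotels` both need a value for `c`. Supplying `c` as an input makes every ordering executable, which is the kind of difference an input annotation on a plan is meant to capture.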
WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web
- In SIGMOD
, 2000
Cited by 77 (5 self)
We present WSQ/DSQ (pronounced "wisk-disk"), a new approach for combining the query facilities of traditional databases with existing search engines on the Web. WSQ, for Web-Supported (Database) Queries, leverages results from Web searches to enhance SQL queries over a relational database. DSQ, for Database-Supported (Web) Queries, uses information stored in the database to enhance and explain Web searches. This paper focuses primarily on WSQ, describing a simple, low-overhead way to support WSQ in a relational DBMS, and demonstrating the utility of WSQ with a number of interesting queries and results. The queries supported by WSQ are enabled by two virtual tables, whose tuples represent Web search results generated dynamically during query execution. WSQ query execution may involve many high-latency calls to one or more search engines, during which the query processor is idle. We present a lightweight technique called asynchronous iteration that can be integrated easily into a standard sequential query processor to enable concurrency between query processing and multiple Web search requests. Asynchronous iteration has broader applications than WSQ alone, and it opens up many interesting query optimization issues. We have developed a prototype implementation of WSQ by extending a DBMS with virtual tables and asynchronous iteration; performance results are reported.
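The idea of overlapping query processing with many slow search calls can be illustrated with a short Python sketch. This is not the paper's operator design: `web_count` is a hypothetical stand-in for a search-engine call, and a thread pool with futures is used here only to show requests being issued first and their results filled in later.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def web_count(term):
    """Hypothetical stand-in for a high-latency search-engine call."""
    time.sleep(0.05)  # simulated network latency
    return len(term)  # dummy "hit count" so the example is deterministic


def join_with_search(rows, max_workers=8):
    """Pair each row with its search result, overlapping the slow calls."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Issue every request up front; each future acts as a placeholder.
        pending = [(row, pool.submit(web_count, row)) for row in rows]
        # Resolve placeholders in the original row order.
        return [(row, future.result()) for row, future in pending]


print(join_with_search(["Utah", "Ohio", "Texas"]))
# [('Utah', 4), ('Ohio', 4), ('Texas', 5)]
```

Because all requests are in flight at once, the total wait is close to one call's latency instead of the sum of all of them, provided `max_workers` is at least the number of rows.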
Optimizing Recursive Information Gathering Plans
, 1999
Cited by 51 (10 self)
In this paper we describe two optimization techniques that are specially tailored for information gathering. The first is a greedy minimization algorithm that minimizes an information gathering plan by removing redundant and overlapping information sources without loss of completeness. We then discuss a set of...
Description logics for information integration
- COMPUTATIONAL LOGIC: LOGIC PROGRAMMING AND BEYOND. LNCS
, 2002
Cited by 39 (4 self)
Information integration is the problem of combining the data residing at different, heterogeneous sources, and providing the user with a unified view of these data, called mediated schema. The mediated schema is therefore a reconciled view of the information, which can be queried by the user. It is the task of the system to free the user from the knowledge on where data are, and how data are structured at the sources. In this chapter, we discuss data integration in general, and describe a logic-based approach to data integration. A logic of the Description Logics family is used to model the information managed by the integration system, to formulate queries posed to the system, and to perform several types of automated reasoning supporting both the modeling, and the query answering process. We focus, in particular, on a specific Description Logic, called DLR, specifically designed for database applications. In the chapter, we illustrate how DLR is used to model a mediated schema of an integration system, to specify the semantics of the data sources, and finally to support the query answering process by means of the associated reasoning methods.
Query Planning with Limited Source Capabilities
- International Conference on Data Engineering (ICDE)
, 1999
Cited by 35 (8 self)
In information-integration systems, sources may have diverse and limited query capabilities. In this paper we show that because sources have restrictions on retrieving their information, sources not mentioned in a query can contribute to the query result by providing useful bindings. In some cases we can access sources repeatedly to retrieve bindings to answer a query, and query planning thus becomes considerably more challenging. We find all the obtainable answers to a query by translating the query and source descriptions to a simple recursive Datalog program, and evaluating the program on the source relations. This program often accesses sources that are not in the query. Some of these accesses are essential, as they provide bindings that let us query sources, which we could not do otherwise. However, some of these accesses can be proven not to add anything to the query's answer. We show in which cases these off-query accesses are useless, and prove that in these cases we can comput...
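The role of off-query sources in supplying bindings can be sketched as a fixpoint computation in Python. This is only an illustration under simplifying assumptions, not the paper's Datalog translation: the `SOURCES` data and their names are hypothetical, every source requires a value for its first attribute, and all values share one untyped pool.

```python
# Hypothetical sources; each can only be queried by its first attribute.
SOURCES = {
    "sings": {("a1", "s1"), ("a2", "s2")},                 # (artist, song)
    "covers": {("s1", "a2")},                              # (song, artist)
    "on_cd": {("s1", "c1"), ("s2", "c2"), ("s3", "c3")},   # (song, cd)
}


def access(name, value):
    """Simulate a legal access: bind the first attribute to `value`."""
    return {t for t in SOURCES[name] if t[0] == value}


def obtainable(seed_values):
    """Repeatedly query every source with every known value until no new
    tuple appears; return the tuples retrieved from each source."""
    domain = set(seed_values)
    retrieved = {name: set() for name in SOURCES}
    changed = True
    while changed:
        changed = False
        for name in SOURCES:
            for value in list(domain):  # copy: domain grows inside the loop
                for tup in access(name, value):
                    if tup not in retrieved[name]:
                        retrieved[name].add(tup)
                        domain.update(tup)  # new values become bindings
                        changed = True
    return retrieved


result = obtainable({"a1"})
print(sorted(result["on_cd"]))  # [('s1', 'c1'), ('s2', 'c2')]
```

For a query that mentions only `on_cd`, the sources `sings` and `covers` are off-query, yet they supply the bindings `s1` and `s2` that make `on_cd` reachable from the seed `a1`. The tuple `('s3', 'c3')` is never retrieved because no legal sequence of accesses produces the value `s3`.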
Computing Complete Answers to Queries in the Presence of Limited Access Patterns
- Journal of VLDB
, 1999
Cited by 34 (3 self)
In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons.
On Answering Queries in the Presence of Limited Access Patterns
- In Proc. of ICDT 2001
, 2001
Cited by 30 (1 self)
In information-integration systems, source relations often have limitations on access patterns to their data; i.e., when one must provide values for certain attributes of a relation in order to retrieve its tuples. In this paper we consider the following fundamental problem: can we compute the complete answer to a query by accessing the relations with legal patterns? The complete answer to a query is the answer that we could compute if we could retrieve all the tuples from the relations. We give algorithms for solving the problem for various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We prove the problem is undecidable for datalog queries. If the complete answer to a query cannot be computed, we often need to compute its maximal answer. The second problem we study is, given two conjunctive queries on relations with limited access patterns, how to test whether the maximal answer to...
Answering Queries with Useful Bindings
- ACM Transactions on Database Systems (TODS)
, 2001
Cited by 21 (1 self)
In this paper, we propose a query-planning framework to answer queries in the presence of limited access patterns. In the framework, a query and source descriptions are translated to a recursive datalog program. We then solve optimization problems in this framework, including how to decide whether accessing off-query sources is necessary, how to choose useful sources for a query, and how to test query containment. We develop algorithms to solve these problems, and thus construct an efficient program to answer a query.
Joint Optimization of Cost and Coverage of Query Plans in Data Integration
, 2001
Cited by 19 (12 self)
Existing approaches for optimizing queries in data integration use decoupled strategies--attempting to optimize coverage and cost in two separate phases. Since sources tend to have a variety of access limitations, such phased optimization of cost and coverage can unfortunately lead to expensive planning as well as highly inefficient plans. In this paper we present techniques for joint optimization of cost and coverage of the query plans. Our algorithms search in the space of parallel query plans that support multiple sources for each subgoal conjunct. The refinement of the partial plans takes into account the potential parallelism between source calls, and the binding compatibilities between the sources included in the plan. We start by introducing and motivating our query plan representation. We then briefly review how to compute the cost and coverage of a parallel plan. Next, we provide both a System-R style query optimization algorithm as well as a greedy local search algorithm for searching in the space of such query plans. Finally we present a simulation study that demonstrates that the plans generated by our approach will be significantly better, both in terms of planning cost, and in terms of plan execution cost, compared to the existing approaches.
Survey on Methods for Query Rewriting and Query Answering Using Views
, 2001
Cited by 18 (0 self)
A Data Integration System is constituted by three main components: source schemas, a global schema and a mapping between the two. There exist two main approaches for specifying the mapping: in the local-as-view (LAV) approach the source structures are defined as views over the global schema; on the contrary in the global-as-view (GAV) approach each global concept is defined in terms of a view over the source schemas. The problem of query processing is to find efficient methods for answering queries posed to the global schema on the basis of the data stored at sources. In LAV there exist two approaches to query processing: by query rewriting, in which one tries to compute a rewriting of the query in terms of the views and then evaluates such a rewriting, and by query answering, in which one aims at directly answering the query based on the view extensions. In GAV, existing systems deal with query processing by simply unfolding each global concept in the query with its definition in terms of the sources. In this paper, we survey the most important query processing algorithms proposed in the literature for LAV, and we describe the principal GAV data integration systems and the form of query processing they adopt.