Query optimization in the presence of limited access patterns
- In Proc. of ACM SIGMOD Conf. on Management of Data
, 1999
Cited by 77 (10 self)
We consider the problem of query optimization in the presence of limitations on access patterns to the data (i.e., when one must provide values for one of the attributes of a relation in order to obtain tuples). We show that in the presence of limited access patterns we must search a space of annotated query plans, where the annotations describe the inputs that must be given to the plan. We describe a theoretical and experimental analysis of the resulting search space and a novel query optimization algorithm that is designed to perform well under the different conditions that may arise. The algorithm searches the set of annotated query plans, pruning invalid and non-viable plans as early as possible in the search space, and it also uses a best-first search strategy in order to produce a first complete plan early in the search. We describe experiments to illustrate the performance of our algorithm.
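To make the notion of an invalid plan concrete, here is a minimal sketch of a validity check under access-pattern limitations. It is not the paper's optimizer (no annotations, cost model, or best-first search); it only tests whether some ordering of the query's relations lets every required input be bound before the relation is accessed. The function `executable_order`, the `Atom` encoding, and the `Book`/`Review` relations are all hypothetical names chosen for illustration.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical encoding of a subgoal:
# (relation name, its variables, positions that must be bound before access).
Atom = Tuple[str, Tuple[str, ...], Set[int]]


def executable_order(atoms: List[Atom], bound: Set[str]) -> Optional[List[str]]:
    """Return an access order in which every required input is already bound,
    or None if no such order exists."""
    bound = set(bound)          # variables whose values are known so far
    remaining = list(atoms)
    order: List[str] = []
    while remaining:
        for atom in remaining:
            name, variables, inputs = atom
            if all(variables[i] in bound for i in inputs):
                order.append(name)
                bound.update(variables)   # outputs become bindings for later atoms
                remaining.remove(atom)
                break
        else:
            return None         # nothing left can be accessed: the plan is invalid
    return order


# Book(author, isbn) needs an author; Review(isbn, rating) needs an ISBN.
atoms = [
    ("Review", ("isbn", "rating"), {0}),
    ("Book", ("author", "isbn"), {0}),
]
print(executable_order(atoms, {"author"}))
print(executable_order(atoms, set()))
```

A greedy pass suffices for this yes/no question because the set of bound variables only grows, so accessing a relation early never blocks a later one. Choosing the cheapest valid order is the harder problem the abstract addresses.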
M.: Description logics for information integration
- Computational Logic: Logic Programming and Beyond. LNCS
, 2002
Cited by 31 (4 self)
Information integration is the problem of combining the data residing at different, heterogeneous sources, and providing the user with a unified view of these data, called mediated schema. The mediated schema is therefore a reconciled view of the information, which can be queried by the user. It is the task of the system to free the user from the knowledge on where data are, and how data are structured at the sources. In this chapter, we discuss data integration in general, and describe a logic-based approach to data integration. A logic of the Description Logics family is used to model the information managed by the integration system, to formulate queries posed to the system, and to perform several types of automated reasoning supporting both the modeling, and the query answering process. We focus, in particular, on a specific Description Logic, called DLR, specifically designed for database applications. In the chapter, we illustrate how DLR is used to model a mediated schema of an integration system, to specify the semantics of the data sources, and finally to support the query answering process by means of the associated reasoning methods.
Query Planning with Limited Source Capabilities
- International Conference on Data Engineering (ICDE)
, 1999
Cited by 29 (9 self)
In information-integration systems, sources may have diverse and limited query capabilities. In this paper we show that because sources have restrictions on retrieving their information, sources not mentioned in a query can contribute to the query result by providing useful bindings. In some cases we can access sources repeatedly to retrieve bindings to answer a query, and query planning thus becomes considerably more challenging. We find all the obtainable answers to a query by translating the query and source descriptions to a simple recursive Datalog program, and evaluating the program on the source relations. This program often accesses sources that are not in the query. Some of these accesses are essential, as they provide bindings that let us query sources, which we could not do otherwise. However, some of these accesses can be proven not to add anything to the query's answer. We show in which cases these off-query accesses are useless, and prove that in these cases we can comput...
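The idea that sources outside the query can supply useful bindings can be sketched as a fixpoint computation. The paper translates the query and source descriptions into a recursive Datalog program; the Python below is only a simplified stand-in under loud assumptions: every source has exactly one input position, all attributes share a single untyped domain, and the names `obtainable_tuples`, `v1`, `v2`, and `v3` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Row = Tuple[str, ...]


def obtainable_tuples(
    sources: Dict[str, Tuple[int, List[Row]]],
    seeds: Set[str],
) -> Dict[str, Set[Row]]:
    """Repeatedly access sources with known values until nothing new appears.

    sources maps a source name to (input position, stored rows); a row can be
    retrieved only when the value at its input position is already known.
    """
    known = set(seeds)
    fetched: Dict[str, Set[Row]] = {name: set() for name in sources}
    changed = True
    while changed:
        changed = False
        for name, (input_pos, rows) in sources.items():
            for row in rows:
                if row[input_pos] in known and row not in fetched[name]:
                    fetched[name].add(row)
                    known.update(row)   # every returned value is a new binding
                    changed = True
    return fetched


sources = {
    "v1": (0, [("s1", "c1"), ("s2", "c2")]),   # song -> cd
    "v2": (0, [("c1", "a1")]),                 # cd -> artist
    "v3": (0, [("a1", "s2")]),                 # artist -> song
}
result = obtainable_tuples(sources, {"s1"})
print(sorted(result["v1"]))
```

In this toy instance a query over `v1` alone, starting from `s1`, would retrieve only one tuple; the second `v1` tuple becomes reachable only through the off-query sources `v2` and `v3`.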
On Answering Queries in the Presence of Limited Access Patterns
- In Proc. of ICDT 2001
, 2001
Cited by 26 (2 self)
In information-integration systems, source relations often have limitations on access patterns to their data; i.e., when one must provide values for certain attributes of a relation in order to retrieve its tuples. In this paper we consider the following fundamental problem: can we compute the complete answer to a query by accessing the relations with legal patterns? The complete answer to a query is the answer that we could compute if we could retrieve all the tuples from the relations. We give algorithms for solving the problem for various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We prove the problem is undecidable for datalog queries. If the complete answer to a query cannot be computed, we often need to compute its maximal answer. The second problem we study is, given two conjunctive queries on relations with limited access patterns, how to test whether the maximal answer to...
Computing Complete Answers to Queries in the Presence of Limited Access Patterns
- Journal of VLDB
, 1999
Cited by 15 (3 self)
In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons.
Mediators over Ontology-based Information Sources
- In Second International Conference on Web Information Systems Engineering, WISE 2001
, 2001
Cited by 14 (11 self)
We propose a model for providing integrated and unified access to multiple information sources. Each information source comprises two parts: (a) an ontology i.e. a set of terms structured by a subsumption relation, and (b) a database that stores objects under the terms of the ontology. We assume that the objects of interest belong to an underlying domain that is common to all sources (e.g. a set of web pages of interest), and that different sources may use different ontologies with terms that correspond to different natural languages or to different levels of granularity. Information integration is obtained through a mediator comprising two parts: (a) an ontology, and (b) a set of articulations to the information sources. Here, by articulation to a source we mean a set of relationships between terms of the mediator and terms of that source. Information requests (queries) are addressed to the mediator whose task is to analyze each query into sub-queries, translate them into queries to the appropriate sources, then merge the results to answer the original query. We study the querying and answering process in such a model and present algorithms for handling the main tasks of the mediator, namely, query translation between the mediator and the sources, source selection and result merging to produce the final answer.
Survey on Methods for Query Rewriting and Query Answering Using Views
, 2001
Cited by 14 (0 self)
A Data Integration System is constituted by three main components: source schemas, a global schema and a mapping between the two. There exist two main approaches for specifying the mapping: in the local-as-view (LAV) approach the source structures are defined as views over the global schema; on the contrary in the global-as-view (GAV) approach each global concept is defined in terms of a view over the source schemas. The problem of query processing is to find efficient methods for answering queries posed to the global schema on the basis of the data stored at sources. In LAV there exist two approaches to query processing: by query rewriting, in which one tries to compute a rewriting of the query in terms of the views and then evaluates such a rewriting, and by query answering, in which one aims at directly answering the query based on the view extensions. In GAV, existing systems deal with query processing by simply unfolding each global concept in the query with its definition in terms of the sources. In this paper, we survey the most important query processing algorithms proposed in the literature for LAV, and we describe the principal GAV data integration systems and the form of query processing they adopt.
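GAV unfolding, as described in the abstract, amounts to substituting each global atom with its definition over the sources. The following is a minimal sketch, assuming each global concept has a single conjunctive definition (no unions) and that atoms are encoded as `(predicate, arguments)` pairs; `unfold`, `Movie`, `S1`, and `S2` are hypothetical names, not taken from any system the survey covers.

```python
from itertools import count
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]
# A GAV definition: (head variables of the global concept, body over sources).
View = Tuple[Tuple[str, ...], List[Atom]]


def unfold(query: List[Atom], gav: Dict[str, View]) -> List[Atom]:
    """Replace every global atom in the query by its source-level definition."""
    fresh = count()
    result: List[Atom] = []
    for pred, args in query:
        head, body = gav[pred]
        subst = dict(zip(head, args))    # head variable -> query argument
        suffix = f"_{next(fresh)}"       # keeps existential variables distinct per atom
        for src, src_args in body:
            renamed = tuple(subst.get(v, v + suffix) for v in src_args)
            result.append((src, renamed))
    return result


# Movie(t, d) is defined as S1(t, y) joined with S2(t, d).
gav = {"Movie": (("t", "d"), [("S1", ("t", "y")), ("S2", ("t", "d"))])}
print(unfold([("Movie", ("X", "Y"))], gav))
```

Variables that appear only in a definition body, such as `y` here, are renamed per unfolded atom so that two uses of the same global concept do not accidentally share them.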
Optimized Seamless Integration of Biomolecular Data
- IEEE symposium on Bio-Informatics and Biomedical Engineering (BIBE’2001), Washington DC
, 2001
Cited by 13 (2 self)
Today, scientific data is inevitably digitized, stored in a variety of heterogeneous formats, and is accessible over the Internet. Scientists need to access an integrated view of multiple remote or local heterogeneous data sources. They then integrate the results of complex queries and apply further analysis and visualization to support the task of scientific discovery. Building a digital library for scientific discovery requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data that is locally materialized in warehouses or is generated by software. We consider several tasks to provide optimized and seamless integration of biomolecular data. Challenges to be addressed include capturing and representing source capabilities; developing a methodology to acquire and represent metadata about source contents and access costs; and decision support to select sources and capabilities using cost based and semantic knowledge, and generating low cost query evaluation plans.
Answering Queries with Useful Bindings
- ACM Transactions on Database Systems (TODS)
, 2001
Cited by 10 (1 self)
In this paper, we propose a query-planning framework to answer queries in the presence of limited access patterns. In the framework, a query and source descriptions are translated to a recursive datalog program. We then solve optimization problems in this framework, including how to decide whether accessing off-query sources is necessary, how to choose useful sources for a query, and how to test query containment. We develop algorithms to solve these problems, and thus construct an efficient program to answer a query.
Modeling Interactive Web Sources for Information Mediation
- In Procs. of International Workshop on World-Wide Web and Conceptual Modeling, Lecture Notes in Computer Science
, 1999
Cited by 9 (0 self)
We propose a method for modeling complex Web sources that have active user interaction requirements. Here "active" refers to the fact that certain information in these sources is only reachable through interactions like filling out forms or clicking on image maps. Typically, the former interaction can be automated by wrapper software (e.g., using parameterized URLs or POST commands) while the latter cannot and thus requires explicit user interaction. We propose a modeling technique for such interactive Web sources and the information they export, based on so-called interaction diagrams. The nodes of an interaction diagram model sources and their exported information, whereas edges model transitions and their interactions. The paths of a diagram correspond to sequences of interactions and allow one to derive the various query capabilities of the source. Based on these, one can determine which queries are supported by a source and derive query plans with minimal user interaction....
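One way to picture "plans with minimal user interaction" is as a shortest-path problem over the interaction diagram, where automatable transitions cost 0 and transitions needing a person cost 1. This is an illustrative sketch only, not the paper's technique: the dictionary encoding of the diagram, the function `min_user_interactions`, and the page names are assumptions, and the search is a standard Dijkstra traversal.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical diagram encoding: node -> list of (next node, needs_user).
Diagram = Dict[str, List[Tuple[str, bool]]]


def min_user_interactions(edges: Diagram, start: str, goal: str) -> Optional[int]:
    """Fewest user-driven transitions on any path from start to goal."""
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue                     # stale queue entry
        for nxt, needs_user in edges.get(node, []):
            new_cost = cost + (1 if needs_user else 0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None                          # goal is not reachable


diagram = {
    "home": [("search_form", False), ("image_map", True)],
    "search_form": [("results", False)],   # form filling can be wrapped
    "image_map": [("results", False)],     # clicking a map needs a person
    "results": [("detail", True)],
}
print(min_user_interactions(diagram, "home", "results"))
print(min_user_interactions(diagram, "home", "detail"))
```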

