Outerjoin Simplification and Reordering for Query Optimization
- ACM TRANSACTIONS ON DATABASE SYSTEMS
Cited by 25 (1 self)
Conventional database optimizers take full advantage of associativity and commutativity properties of join to implement efficient and powerful optimizations on select/project/join queries. However, only limited optimization is performed on other binary operators. In this paper, we present the theory and algorithms needed to generate alternative evaluation orders for the optimization of queries containing outerjoins. Our results include both a complete set of transformation rules, suitable for new-generation, transformation-based optimizers, and a bottom-up join enumeration algorithm compatible with those used by traditional optimizers.
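To illustrate why outerjoins cannot be reordered as freely as inner joins, here is a minimal sketch. It is not the paper's algorithm or its rule set: the relations `R`, `S`, `T`, the column names, and the helper functions are all hypothetical, chosen only to show two evaluation orders of the same three relations producing different results.

```python
def inner_join(left, right, pred):
    """Inner join of two lists of dict-tuples under a join predicate."""
    return [{**l, **r} for l in left for r in right if pred(l, r)]


def left_outer_join(left, right, pred, right_cols):
    """Left outerjoin: unmatched left tuples are padded with None."""
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if pred(l, r)]
        if matches:
            out.extend(matches)
        else:
            out.append({**l, **{c: None for c in right_cols}})
    return out


# Hypothetical toy relations (column names are assumptions).
R = [{"a": 1}]
S = [{"a2": 1, "b": 2}]
T = []  # empty relation with a single column "b2"


def r_s(l, r):
    # Join predicate R.a = S.a2
    return l["a"] == r["a2"]


def s_t(l, r):
    # Join predicate S.b = T.b2; a None value never matches, as in SQL
    return l["b"] is not None and l["b"] == r["b2"]


# Order 1: (R leftouterjoin S) join T
plan_outer_first = inner_join(left_outer_join(R, S, r_s, ["a2", "b"]), T, s_t)

# Order 2: R leftouterjoin (S join T)
plan_inner_first = left_outer_join(R, inner_join(S, T, s_t), r_s, ["a2", "b", "b2"])

print(plan_outer_first)   # no tuples survive the final inner join
print(plan_inner_first)   # the R tuple is preserved, padded with None
```

Because `T` is empty, the first order returns nothing while the second keeps the `R` tuple. An optimizer therefore needs rules that say when such a reordering is valid.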
Matching Twigs in Probabilistic XML
, 2007
Cited by 15 (2 self)
Evaluation of twig queries over probabilistic XML is investigated. Projection is allowed and, in particular, a query may be Boolean. It is shown that for a well-known model of probabilistic XML, the evaluation of twigs with projection is tractable under data complexity (whereas in other probabilistic data models, projection is intractable). Under query-and-data complexity, the problem becomes intractable even without projection (and for rather simple twigs and data). In earlier work on probabilistic XML, answers are always complete. However, there is often a need to produce partial answers because XML data may have missing sub-elements and, furthermore, complete answers may be deemed irrelevant if their probabilities are too low. It is shown how to define a semantics that provides partial answers that are maximal with respect to a probability threshold, which is specified by the user. For this semantics, it is shown how to efficiently evaluate twigs, even under query-and-data complexity if there is no projection.
Querying Semantically Tagged Documents on the World-Wide Web
, 1999
Cited by 10 (4 self)
QUEST is a system for Querying Semantically Tagged documents on the World-Wide Web. The advent of new markup languages, such as xml, facilitates authoring of Web documents that contain not just html tags for instructing a browser how to view a document, but also objects that represent the semantic structure of the document. When such documents become widely available, more powerful methods to access and query information on the Web will be possible. The QUEST system was designed and implemented for querying and manipulating documents written in the markup language ohtml. ohtml combines html and objects of the oem data model. QUEST has several new features. First, QUEST can be used to query a combination of hypertext and object structures. Second, the results of queries are ohtml pages and thus of the same type as the data being queried. Third, QUEST implements a new approach for querying semistructured data that produces meaningful answers even when the input data is incomple...
Conflict Handling Strategies in an Integrated Information System
, 2006
Cited by 8 (2 self)
Integrated information systems provide users and applications with a unified view of heterogeneous data sources. To provide a single consistent result for every object represented in these data sources, data fusion is concerned with resolving data inconsistencies within and among the sources. We present a classification of conflict resolution strategies and show how these are implemented within an integrated information system, the Humboldt-Merger.
The EVE Framework: View Synchronization In Evolving Environments
, 1997
Cited by 7 (6 self)
The construction and maintenance of data warehouses (views) in large-scale environments composed of numerous distributed and evolving information sources (ISs) such as the WWW has received great attention recently. Such environments are plagued with changing information because ISs tend to continuously evolve by modifying not only their content but also their query capabilities and interface and by joining or leaving the environment at any time. We are the first to introduce and address the problem of capability (schema) changes of ISs, while previous work in this area, such as incremental view maintenance, has mainly dealt with data changes at ISs. In this paper, we outline our solution approach to this challenging new problem of how to adapt views in such evolving environments. We identify a new view adaptation problem for view evolution in the context of ISs capability changes, which we call View Synchronization. We also outline the Evolvable View Environment (EVE) approach that ...
An Incremental Algorithm for Computing Ranked Full Disjunctions
- In PODS
, 2005
Cited by 6 (3 self)
- Add to MetaCart
The full disjunction is a variation of the join operator that maximally combines tuples from connected relations, while preserving all information in the relations. The full disjunction can be seen as a natural extension of the binary outerjoin operator to an arbitrary number of relations and is a useful operator for information integration. This paper presents the algorithm IncrementalFD for computing the full disjunction of a set of relations. IncrementalFD improves upon previous algorithms for computing the full disjunction in three ways. First, it has a lower total runtime when computing the full result and a lower runtime when computing only k tuples of the result, for any constant k. Second, for a natural class of ranking functions, IncrementalFD returns tuples in ranking order. Third, IncrementalFD can be adapted to have a block-based execution, instead of a tuple-based execution.
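As a rough illustration of the binary case that the full disjunction generalizes, the sketch below computes a natural full outerjoin of two relations. This is not IncrementalFD and does not handle more than two relations. The relation names, columns, and sample tuples are hypothetical.

```python
def full_outer_join(left, right, left_cols, right_cols):
    """Natural full outerjoin of two lists of dict-tuples.

    Tuples are combined when they agree (with non-None values) on all
    shared columns; unmatched tuples from either side are kept and
    padded with None, so no input information is lost.
    """
    shared = [c for c in left_cols if c in right_cols]
    all_cols = left_cols + [c for c in right_cols if c not in left_cols]

    def pad(t):
        return {c: t.get(c) for c in all_cols}

    def joinable(l, r):
        return all(l[c] is not None and l[c] == r[c] for c in shared)

    out = []
    matched_right = set()
    for l in left:
        hit = False
        for i, r in enumerate(right):
            if joinable(l, r):
                out.append(pad({**l, **r}))
                matched_right.add(i)
                hit = True
        if not hit:
            out.append(pad(l))  # left tuple with no partner
    # right tuples that never found a partner
    out.extend(pad(r) for i, r in enumerate(right) if i not in matched_right)
    return out


# Hypothetical sample data.
climates = [
    {"country": "Canada", "climate": "diverse"},
    {"country": "UK", "climate": "temperate"},
]
hotels = [
    {"country": "Canada", "hotel": "Plaza"},
    {"country": "Bahamas", "hotel": "Hilton"},
]

result = full_outer_join(
    climates, hotels, ["country", "climate"], ["country", "hotel"]
)
for row in result:
    print(row)
```

Every input tuple appears in the output, either combined with a partner or padded with `None`. A full disjunction aims to preserve this property across an arbitrary number of connected relations, which a fixed sequence of binary outerjoins does not guarantee in general.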
View Matching for Outer-Join Views
, 2005
Cited by 6 (1 self)
Prior work on computing queries from materialized views has focused on views defined by expressions consisting of selection, projection, and inner joins, with an optional aggregation on top (SPJG views). This paper provides the first view matching algorithm for views that may also contain outer joins (SPOJG views).
An Architecture for Transparent Access to Semantically Heterogeneous Information Sources
- 1st Int. Workshop on Cooperative Information Agents - LNCS 1202
, 1997
Cited by 4 (0 self)
- Add to MetaCart
We propose an agent architecture that provides transparent access to a set of distributed, heterogeneous, and autonomous information sources. Our objectives are twofold: First, we want to support quick development of mediators by automatically deriving mediator specifications which are subsequently fed to a mediator generator. Second, we wish to supply the user with a relatively simple model of the information provided in the system so that he can formulate queries without being aware of the existing information sources or service providers and their locations. The system then uses its knowledge about available information sources to generate appropriate query execution plans.
1 Introduction
In recent years, the number of information sources offered on the net has grown tremendously. Support for access to these information sources has mostly taken the form of browsing and search tools. However, the responsibility of finding the right information sources, phrasing queries against the...
Instances navigation for querying integrated data from web-sites
- In WEBIST 2006, Proceedings of the Second International Conference on Web Information Systems and Technologies, Setubal
, 2006
Cited by 4 (4 self)
- Add to MetaCart
Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes that result from an integration process: the end user typically does not know the contents of such classes, and simply defines his queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.
Full disjunctions: Polynomial-delay iterators in action
- In VLDB
, 2006
Cited by 4 (2 self)
- Add to MetaCart
Full disjunctions are an associative extension of the outerjoin operator to an arbitrary number of relations. Their main advantage is the ability to maximally combine data from different relations while preserving all the original information. An algorithm for efficiently computing full disjunctions is presented. This algorithm is superior to previous ones in three ways. First, it is the first algorithm that computes a full disjunction with a polynomial delay between tuples. Hence, it can be implemented as an iterator that produces a stream of tuples, which is important in many cases (e.g., pipelined query processing and Web applications). Second, the total runtime is linear in the size of the output. Third, the algorithm employs a novel optimization that divides the relation schemes into biconnected components, uses a separate iterator for each component and applies outerjoins whenever possible. Efficiently combining full disjunctions with standard SQL operators is discussed. Experiments show the superiority of our algorithm over the state of the art.

