Results 1 - 10 of 30
A complete and efficient algebraic compiler for XQuery
- In ICDE, 2006
Cited by 36 (6 self)
As XQuery nears standardization, more sophisticated XQuery applications are emerging, which often exploit the entire language and are applied to non-trivial XML sources. We propose an algebra and optimization techniques that are suitable for building an XQuery compiler that is complete, correct, and efficient. We describe the compilation rules for the complete language into that algebra and present novel optimization techniques that address the needs of complex queries. These techniques include new query unnesting rewritings and specialized join algorithms that account for XQuery’s complex predicate semantics. The algebra and optimizations are implemented in the Galax XQuery engine, and yield execution plans that are up to three orders of magnitude faster than earlier versions of Galax.
Tree Logical Classes for Efficient Evaluation of XQuery
- In SIGMOD, 2004
Cited by 34 (3 self)
XML is widely praised for its flexibility in allowing repeated and missing sub-elements. However, this flexibility makes it challenging to develop a bulk algebra, which typically manipulates sets of objects with identical structure. A set of XML elements, say of type book, may have members that vary greatly in structure, e.g. in the number of author sub-elements. This kind of heterogeneity may permeate the entire document in a recursive fashion: e.g., different authors of the same or different book may in turn greatly vary in structure. Even when the document conforms to a schema, the flexible nature of schemas for XML still allows such significant variations in structure among elements in a collection. Bulk processing of such heterogeneous sets is problematic.
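The heterogeneity described in this abstract can be made concrete with a small sketch. It is only an illustration, not the paper's algebra: the function name `group_by_shape`, the sample document, and the idea of using the sequence of child tags as a "shape" are all assumptions introduced here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_by_shape(xml_text, tag):
    """Group elements named `tag` by the sequence of their child tags.

    The "shape" used here (a tuple of child tag names) is a hypothetical
    simplification chosen for illustration only.
    """
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for elem in root.iter(tag):
        shape = tuple(child.tag for child in elem)
        groups[shape].append(elem)
    return dict(groups)


# Three book elements with one, two, and zero author sub-elements.
BOOKS = (
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/><author/></book>"
    "<book><title/></book>"
    "</lib>"
)

shapes = group_by_shape(BOOKS, "book")
for shape, members in shapes.items():
    print(shape, len(members))
```

Even in this tiny document, every book falls into a different group, which is the kind of variation that makes it hard to treat a collection as a set of identically structured objects.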
Pattern Tree Algebras: Sets Or Sequences?
- 2005
Cited by 15 (2 self)
XML and XQuery semantics are very sensitive to the order of the produced output. Although pattern-tree based algebraic approaches are becoming more and more popular for evaluating XML, there is no universally accepted technique which can guarantee both a correct output order and a choice of efficient alternative plans.
Structured Materialized Views for XML Queries
- 2007
Cited by 14 (5 self)
The performance of XML database queries can be greatly enhanced by rewriting them using materialized views. We study the problem of rewriting a query using materialized views, where both the query and the views are described by a tree pattern language, appropriately extended to capture a large XQuery subset. The pattern language features optional nodes and nesting, which allows it to capture the data needs of nested XQueries. The language can also describe storage features such as structural identifiers, which enlarge the space of rewritings. We study pattern containment and equivalent rewriting under the constraints expressed in a structural summary, whose enhanced form also entails integrity constraints. Our approach is implemented in the ULoad [7] prototype and we present a performance analysis.
Path Summaries and Path Partitioning in Modern XML Databases
- World Wide Web (2008) 11:117–151
Cited by 11 (2 self)
XML path summaries are compact structures representing all the simple parent-child paths of an XML document. Such paths have also been used in many works as a basis for partitioning the document’s content in a persistent store, under the form of path indices or path tables. We revisit the notions of path summaries and path-driven storage model in the context of current-day XML databases. This context is characterized by complex queries, typically expressed in an XQuery subset, and by the presence of efficient encoding techniques such as structural node identifiers. We review a path summary’s many uses for query optimization, and give them a common basis, namely relevant paths. We discuss summary-based tree pattern minimization and present some efficient summary-based minimization heuristics. We consider relevant path computation and provide a time- and memory-efficient computation algorithm. We combine the principle of path partitioning with the presence of structural identifiers in a simple path-partitioned storage model, which allows for selective data access and efficient query plans. This model improves the efficiency of twig query processing up to two orders of magnitude over the similar
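The notion of a path summary from this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's algorithm or storage model: the function name `path_summary`, the sample document, and the choice to record an element count per path are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_summary(xml_text):
    """Map each distinct root-to-element label path to its element count.

    Only element nodes are considered; attributes and text are ignored
    in this simplified sketch.
    """
    root = ET.fromstring(xml_text)
    summary = Counter()
    # Iterative depth-first traversal carrying the label path so far.
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        summary[path] += 1
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return summary


SAMPLE = (
    "<lib>"
    "<book><title/><author/><author/></book>"
    "<book><title/></book>"
    "</lib>"
)

summary = path_summary(SAMPLE)
for path in sorted(summary):
    print(path, summary[path])
```

The document has seven elements but only four distinct paths, which shows why a summary is typically much smaller than the data it describes. Grouping elements by these paths is also the basic idea behind path-partitioned storage.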
Efficient Evaluation of n-ary Conjunctive Queries over Trees and Graphs
- In Proc. ACM Int’l. Workshop on Web Information and Data Management (WIDM), 2006
Cited by 6 (5 self)
N-ary conjunctive queries, i.e., queries with any number of answer variables, are the formal core of many Web query languages including XSLT, XQuery, SPARQL, and Xcerpt. Despite a considerable body of research on the optimization of such queries over tree-shaped XML data, little attention has been paid so far to efficient access to graph-shaped XML, RDF, or Topic Maps. We propose the first evaluation technique for n-ary conjunctive queries that applies to both tree- and graph-shaped data and retains the same complexity as the best known approaches that are restricted to tree-shaped data only. Furthermore, the approach treats tree- and graph-shaped queries uniformly without sacrificing evaluation complexity on the restricted query class. The core of the evaluation technique is based on dynamic programming using a memoization data structure, called “memoization matrix”. It can be populated and consumed in different ways. We propose two algorithms for population and three for consumption, each with its own advantages. The complexity of the algorithms is compared analytically and experimentally.
Combining temporal logics for querying XML documents
- In International Conference on Database Theory, 2006
Cited by 6 (1 self)
Close relationships between XML navigation and temporal logics have been discovered recently, in particular between logics LTL and CTL⋆ and XPath navigation, and between the µ-calculus and navigation based on regular expressions. This opened up the possibility of bringing model-checking techniques into the field of XML, as documents are naturally represented as labeled transition systems. Most known results of this kind, however, are limited to Boolean or unary queries, which are not always sufficient for complex querying tasks. Here we present a technique for combining temporal logics to capture n-ary XML queries expressible in two yardstick languages: FO and MSO. We show that by adding simple terms to the language, and combining a temporal logic for words together with a temporal logic for unary tree queries, one obtains logics that select arbitrary tuples of elements, and can thus be used as building blocks in complex query languages. We present general results on the expressiveness of such temporal logics, study their model-checking properties, and relate them to some common XML querying tasks.
XQueC: A Query-Conscious Compressed XML Database
- ACM Trans. Internet Tech
Cited by 6 (1 self)
XML compression has gained prominence recently because it counters the disadvantage of the “verbose” representation XML gives to data. In many applications, such as data exchange and data archiving, entirely compressing and decompressing a document is acceptable. In other applications, where queries must be run over compressed documents, compression may not be beneficial since the performance penalty in running the query processor over compressed data outweighs the data compression benefits. While balancing the interests of compression and query processing has received significant attention in the domain of relational databases, these results do not immediately translate to XML data. In this paper, we address the problem of embedding compression into XML databases without degrading query performance. Since the setting is rather different from relational databases, the choice of compression granularity and compression algorithms must be revisited. Query execution in the compressed domain must also be rethought in the framework of XML query processing, due to the richer structure of XML data. Indeed, a proper storage design for the compressed data plays a crucial role here. The XQueC system (standing for XQuery Processor and Compressor) covers a wide set of
Quo Vadis, Web Queries?
Cited by 5 (4 self)
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of traditional query languages for XML and RDF, focused on emerging preeminent exemplars in each field, and contrasts these languages with the field of keyword querying for XML and RDF.
The NEXT Framework for Logical XQuery Optimization
- In VLDB, 2004
Cited by 5 (0 self)
Classical logical optimization techniques rely on a logical semantics of the query language. The adaptation of these techniques to XQuery is precluded by its definition as a functional language with operational semantics. We introduce Nested XML Tableaux which enable a logical foundation for XQuery semantics and provide the logical plan optimization framework of our XQuery processor. As a proof of concept, we develop and evaluate a minimization algorithm for removing redundant navigation within and across nested subqueries. The rich XQuery features create key challenges that fundamentally extend the prior work on the problems of minimizing conjunctive and tree pattern queries.

