An Efficient XML Node Identification and Indexing Scheme (2003)

by J-M Bremer, M Gertz

Results 1 - 5 of 5

From region encoding to extended Dewey: On efficient processing of XML twig pattern matching

by Jiaheng Lu, Tok Wang Ling, Chee-Yong Chan, Ting Chen - In VLDB, 2005
Cited by 47 (10 self)
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process a twig query based on the region encoding labeling scheme. While region encoding supports efficient determination of the ancestor-descendant (or parent-child) relationship between two elements, we observe that the information within a single label is very limited. In this paper, we propose a new labeling scheme, called extended Dewey. This is a powerful labeling scheme, since from the label of an element alone, we can derive the names of all elements along the path from the root to that element. Based on extended Dewey, we design a novel holistic twig join algorithm, called TJ-Fast. Unlike all previous algorithms based on region encoding, TJ-Fast only needs to access the labels of the leaf query nodes to answer a twig query. This not only reduces disk access but also supports the efficient evaluation of queries with wildcards in branching nodes, which are very difficult to answer with algorithms based on region encoding. Finally, we report experimental results showing that our algorithms are superior to previous approaches in terms of the number of elements scanned, the size of intermediate results, and query performance.
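The contrast drawn in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Region` tuple, the `CHILD_TAGS` table, and `decode_path` are hypothetical names, and the modulo-based decoding is a simplified reading of how a Dewey-style label could carry tag names, assuming the set of child tags of each element is known from a schema.

```python
from typing import Dict, List, NamedTuple, Sequence


class Region(NamedTuple):
    """Region-encoding label: document-order start/end positions plus depth."""
    start: int
    end: int
    level: int


def is_ancestor(a: Region, d: Region) -> bool:
    # a is an ancestor of d when a's interval strictly contains d's interval.
    return a.start < d.start and d.end < a.end


def is_parent(a: Region, d: Region) -> bool:
    # Parent-child additionally requires the levels to differ by exactly one.
    return is_ancestor(a, d) and a.level + 1 == d.level


# Hypothetical schema information: the ordered child tags of each element.
CHILD_TAGS: Dict[str, List[str]] = {
    "bib": ["book", "article"],
    "book": ["title", "author", "chapter"],
    "chapter": ["title", "section"],
}


def decode_path(label: Sequence[int], root: str = "bib") -> List[str]:
    """Recover the tag path from a label alone (assumed modulo convention)."""
    path = [root]
    for component in label:
        children = CHILD_TAGS[path[-1]]
        # Assumption: component mod (number of child tags) selects the tag.
        path.append(children[component % len(children)])
    return path
```

A region label answers structural questions only about a pair of labels, whereas a call such as `decode_path([0, 5, 3])` reconstructs the whole root-to-node tag path from a single label, which is the property the abstract attributes to extended Dewey.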

On Distributing XML Repositories

by Jan-Marco Bremer, Michael Gertz, 2003
Cited by 14 (1 self)
XML is increasingly used not only for data exchange but also to represent arbitrary data sources as virtual XML repositories. In many application scenarios, fragments of such a repository are distributed over the Web. However, design and query models for distributed XML data have not yet been studied in detail.

Query optimization in XML structured-document databases

by Dunren Che, Karl Aberer, M. Tamer Özsu - The VLDB Journal, 2006
Cited by 13 (0 self)
As the volume of information published in the form of XML-compliant documents keeps growing rapidly, efficient and effective query processing and optimization for XML have become more important than ever. This article reports our recent advances in XML structured-document query optimization. We elaborate on a novel approach and the techniques developed for XML query optimization. Our approach performs heuristic-based algebraic transformations on XPath queries, represented as PAT algebraic expressions, to achieve query optimization. This article first presents a comprehensive set of general equivalences with regard to XML documents and XML queries. Based on these equivalences, we developed a large set of deterministic algebraic transformation rules for XML query optimization. Our approach is unique in that it performs exclusively deterministic transformations on queries for fast optimization. The deterministic nature of the proposed approach yields high optimization efficiency and a simple implementation. Our approach operates at the logical level and is independent of any particular storage model. Therefore, optimizers based on our approach can be easily adapted to a broad range of XML data/information servers to achieve fast query optimization. An experimental study confirms the validity and effectiveness of the proposed approach.

The BIRD numbering scheme for XML and tree databases – deciding and reconstructing tree relations using efficient arithmetic operations

by Felix Weigel, Klaus U. Schulz, Holger Meuss - In: Proc. Int’l. XML Database Symposium (XSym), Volume 3671 of LNCS, Springer-Verlag, 2005
Cited by 9 (2 self)
We introduce a family of numbering schemes for the nodes of tree databases that are based on a structural summary for the database, such as the DataGuide. Using such a scheme, given the node IDs of two database nodes and the corresponding nodes in the structural summary, we can decide the extended XPath relations Child, Child+, Child*, Following, NextSibling, NextSibling+, NextSibling* for the nodes without access to the database. Similarly, we can reconstruct the parent node and neighboring siblings of a given node. All decision and reconstruction steps are based on simple arithmetic operations. The BIRD scheme offers high expressivity and needs modest storage capacity. Compared to other identification schemes with similar expressivity, BIRD performs best in terms of both storage consumption and execution time for decision and reconstruction. A very attractive feature of the BIRD scheme is that all extended XPath relations can be decided and reconstructed in constant time, i.e., independent of the tree position and distance of the nodes involved.
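To make "simple arithmetic operations" concrete, the following Python sketch shows one way a weight-based numbering could support constant-time reconstruction and decision. It is an illustration under assumptions, not the paper's definition of BIRD: the function names, the weights, and the rule that a node's ID is a multiple of its weight with all descendants falling inside the next `weight` integers are invented for the example. The weight is assumed to come from the node's entry in the structural summary.

```python
def parent_id(node_id: int, parent_weight: int) -> int:
    """Reconstruct the parent's ID by rounding down to a multiple of its weight.

    Assumption: the caller looked up parent_weight in the structural summary.
    """
    return node_id - (node_id % parent_weight)


def is_descendant(desc_id: int, anc_id: int, anc_weight: int) -> bool:
    """Decide Child+ by checking that desc_id lies in the ancestor's ID range."""
    return anc_id < desc_id < anc_id + anc_weight


# Invented example tree: root (ID 100, weight 100) with two children
# (IDs 110 and 120, weight 10 each); node 110 has leaf children 111 and 112.
ROOT_WEIGHT = 100
CHAPTER_WEIGHT = 10
```

With these invented numbers, `parent_id(112, CHAPTER_WEIGHT)` gives 110 and `is_descendant(121, 110, CHAPTER_WEIGHT)` is false, and neither call touches the database or depends on how far apart the nodes are.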

Distributed XML Repositories: Top-down Design and Transparent Query Processing

by Michael Gertz, Jan-Marco Bremer, 2003
Cited by 3 (0 self)
XML is increasingly used not only for data exchange but also to represent arbitrary data sources as virtual XML repositories. In many application scenarios, fragments of such repositories are distributed over the Web. However, design and query processing models for distributed XML data have not yet been studied in detail. The goal of this paper is to study the design and management of distributed XML repositories. Following the well-established concepts of vertical and horizontal data fragmentation schemes for relational databases, we introduce a flexible distribution design approach for XML repositories. We provide a comprehensive data allocation model with a particular focus on storage-efficient index structures. These index structures encode global path information about XML fragment data at local sites and allow efficient, local evaluation of the most common types of global path and tree pattern queries. Finally, we describe the basic principles of a distributed query processing model based on the concept of index shipping.