Results 1 - 10
of
155
Archiving scientific data
- In ACM SIGMOD
, 2002
"... Archiving is important for scientific data, where it is necessary to record all past versions of a database in order to verify findings based upon a specific version. Much scientific data is held in a hierachical format and has a key structure that provides a canonical identification for each elemen ..."
Abstract
-
Cited by 97 (8 self)
- Add to MetaCart
Archiving is important for scientific data, where it is necessary to record all past versions of a database in order to verify findings based upon a specific version. Much scientific data is held in a hierachical format and has a key structure that provides a canonical identification for each element of the hierarchy. In this article, we exploit these properties to develop an archiving technique that is both efficient in its use of space and preserves the continuity of elements through versions of the database, something that is not provided by traditional minimum-edit-distance diff approaches. The approach also uses timestamps. All versions of the data are merged into one hierarchy where an element appearing in multiple versions is stored only once along with a timestamp. By identifying the semantic continuity of elements and merging them into one data structure, our technique is capable of providing meaningful change descriptions, the archive allows us to easily answer certain temporal queries such as retrieval of any specific version from the archive and finding the history of an element. This is in contrast with approaches that store a sequence of deltas where such operations may require undoing a large number of changes or significant reasoning with the
MonetDB/XQuery: a fast XQuery processor powered by a relational engine
- In SIGMOD
, 2006
"... Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-bas ..."
Abstract
-
Cited by 84 (22 self)
- Add to MetaCart
Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state-of-theart with a number of new technical contributions, such as looplifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system is evaluated on the XMark benchmark up to data sizes of 11 GB. The performance section also provides an extensive comparison of all major XMark results published previously, which confirm that the goal of purely relational XQuery processing, namely speed and scalability, was met. 1.
Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps
- IN PROC. OF THE 29TH INT’L CONFERENCE ON VERY LARGE DATABASES (VLDB
, 2003
"... Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made ..."
Abstract
-
Cited by 75 (23 self)
- Add to MetaCart
Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree size, intersection of paths, inclusion or disjointness of subtrees are made explicit. We propose a local change to the database kernel, the staircase join, which encapsulates the necessary tree knowledge needed to improve XPath performance. Staircase join
XQuery on SQL Hosts
- In VLDB Conf
, 2004
"... Relational database systems may be turned into efficient XML and XPath processors if the system is provided with a suitable relational tree encoding. This paper extends this relational XML processing stack and shows that an RDBMS can also serve as a highly efficient XQuery runtime environment. Our a ..."
Abstract
-
Cited by 62 (22 self)
- Add to MetaCart
Relational database systems may be turned into efficient XML and XPath processors if the system is provided with a suitable relational tree encoding. This paper extends this relational XML processing stack and shows that an RDBMS can also serve as a highly efficient XQuery runtime environment. Our approach is purely relational: XQuery expressions are compiled into SQL code which operates on the tree encoding. The core of the compilation procedure trades XQuery’s notions of variable scopes and nested iteration (FLWOR blocks) for equi-joins. The resulting relational XQuery processor closely adheres to the language semantics, e.g., it respects node identity as well as document and sequence order, and can support XQuery’s full axis feature. The system exhibits quite promising performance figures in experiments. Somewhat unexpectedly, we will also see that the XQuery compiler can make good use of SQL’s OLAP functionality. 1
The bea/xqrl streaming xquery processor
- The VLDB Journal
, 2003
"... In this paper, we describe the design, implementation, and performance characteristics of a complete, industrial-strength XQuery engine, the BEA streaming XQuery processor. The engine was designed to provide very high performance for message processing applications, i.e., for transforming XML data s ..."
Abstract
-
Cited by 59 (3 self)
- Add to MetaCart
In this paper, we describe the design, implementation, and performance characteristics of a complete, industrial-strength XQuery engine, the BEA streaming XQuery processor. The engine was designed to provide very high performance for message processing applications, i.e., for transforming XML data streams, and it is a central component of the 8.1 release of BEA’s WebLogic Integration (WLI) product. This XQuery engine is fully compliant with the August 2002 draft of the W3C XML Query Language specification. A goal of this paper is to describe how an efficient, fully compliant XQuery engine can be built from a few relatively simple components and well-understood technologies.
MARS: A System for Publishing XML from Mixed and Redundant Storage
- In VLDB
, 2003
"... We present a system for publishing as XML data from mixed (relational+XML) proprietary storage, while supporting redundancy in storage for tuning purposes. The correspondence between public and proprietary schemas is given by a combination of LAVand GAV-style views expressed in XQuery. ..."
Abstract
-
Cited by 57 (9 self)
- Add to MetaCart
We present a system for publishing as XML data from mixed (relational+XML) proprietary storage, while supporting redundancy in storage for tuning purposes. The correspondence between public and proprietary schemas is given by a combination of LAVand GAV-style views expressed in XQuery.
An XML Query Engine for Network-Bound Data
, 2001
"... XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has rem ..."
Abstract
-
Cited by 54 (7 self)
- Add to MetaCart
XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources -- namely, the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output --- while providing good performance for both batch-oriented and ad-hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query's input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.
Accelerating XPath Evaluation in any RDBMS
- ACM Transactions on Database Systems
, 2003
"... This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, et ..."
Abstract
-
Cited by 50 (8 self)
- Add to MetaCart
This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among related work on XML indexing structures which had a focus on the child and descendant axes only. The index has been designed with a close eye on the XPath semantics as well as the desire to engineer its internals so that it can be supported well by existing relational database query processing technology: the index (a) permits set-oriented (or, rather, sequence-oriented) path evaluation, and (b) can be implemented and queried using well-established relational index structures, notably B-trees and R-trees
A complete and efficient algebraic compiler for XQuery
- In ICDE
, 2006
"... As XQuery nears standardization, more sophisticated XQuery applications are emerging, which often exploit the entire language and are applied to non-trivial XML sources. We propose an algebra and optimization techniques that are suitable for building an XQuery compiler that is complete, correct, and ..."
Abstract
-
Cited by 36 (6 self)
- Add to MetaCart
As XQuery nears standardization, more sophisticated XQuery applications are emerging, which often exploit the entire language and are applied to non-trivial XML sources. We propose an algebra and optimization techniques that are suitable for building an XQuery compiler that is complete, correct, and efficient. We describe the compilation rules for the complete language into that algebra and present novel optimization techniques that address the needs of complex queries. These techniques include new query unnesting rewritings and specialized join algorithms that account for XQuery’s complex predicate semantics. The algebra and optimizations are implemented in the Galax XQuery engine, and yield execution plans that are up to three orders of magnitude faster than earlier versions of Galax. 1.
Tree Logical Classes for Efficient Evaluation of XQuery
- In SIGMOD
, 2004
"... XML is widely praised for its flexibility in allowing repeated and missing sub-elements. However, this flexibility makes it challenging to develop a bulk algebra, which typically manipulates sets of objects with identical structure. A set of XML elements, say of type book, may have members that vary ..."
Abstract
-
Cited by 34 (3 self)
- Add to MetaCart
XML is widely praised for its flexibility in allowing repeated and missing sub-elements. However, this flexibility makes it challenging to develop a bulk algebra, which typically manipulates sets of objects with identical structure. A set of XML elements, say of type book, may have members that vary greatly in structure, e.g. in the number of author sub-elements. This kind of heterogeneity may permeate the entire document in a recursive fashion: e.g., different authors of the same or different book may in turn greatly vary in structure. Even when the document conforms to a schema, the flexible nature of schemas for XML still allows such significant variations in structure among elements in a collection. Bulk processing of such heterogeneous sets is problematic.

