Results 1 - 10
of
26
Query evaluation techniques for large databases
- ACM COMPUTING SURVEYS
, 1993
"... Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On ..."
Abstract
-
Cited by 592 (7 self)
- Add to MetaCart
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate it: In order to manipulate large sets of complex objects as efficiently as today’s database systems manipulate simple records, query processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and post-relational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
, 1997
"... In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are ..."
Abstract
-
Cited by 459 (14 self)
- Add to MetaCart
In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are useful for browsing database structure, formulating queries, storing information such as statistics and sample values, and enabling query optimization. This paper presents the theoretical foundations of DataGuides along with an algorithm for their creation and an overview of incremental maintenance. We provide performance results based on our implementation of DataGuides in the Lore DBMS for semistructured data. We also describe the use of DataGuides in Lore, both in the user interface to enable structure browsing and query formulation, and as a means of guiding the query processor and optimizing query execution.
Index Structures for Path Expressions
, 1997
"... In recent years there has been an increased interest in managing data which does not conform to traditional data models, like the relational or object oriented model. The reasons for this non-conformance are diverse. One one hand, data may not conform to such models at the physical level: it may be ..."
Abstract
-
Cited by 255 (7 self)
- Add to MetaCart
In recent years there has been an increased interest in managing data which does not conform to traditional data models, like the relational or object oriented model. The reasons for this non-conformance are diverse. One one hand, data may not conform to such models at the physical level: it may be stored in data exchange formats, fetched from the Internet, or stored as structured les. One the other hand, it may not conform at the logical level: data may have missing attributes, some attributes may be of di erent types in di erent data items, there may be heterogeneous collections, or the data may be simply specified by a schema which is too complex or changes too often to be described easily as a traditional schema. The term semistructured data has been used to refer to such data. The data model proposed for this kind of data consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values. Figure 1 illustrates a semistructured database providing information about a city. Relational databases are traditionally queried with associative queries, retrieving tuples based on the value of some attributes. To answer such queries efciently, database management systems support indexes for translating attribute values into tuple ids (e.g. B-trees or hash tables). In object-oriented databases, path queries replace the simpler associative queries. Several data structures have been proposed for answering path queries e ciently: e.g., access support relations 14] and path indexes 4]. In the case of semistructured data, queries are even more complex, because they may contain generalized path expressions 1, 7, 8, 16]. The additional exibility is needed in order to traverse data whose structure is irregular, or partially unknown to the user.
A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database
, 1999
"... XML is emerging as one of the dominant data formats for data processing on the Internet. To query XML data, query languages like XQL, Lorel, XML-QL, or XML-GL have been proposed. In this paper, we study how XML data can be stored and queried using a standard relational database system. For this pur ..."
Abstract
-
Cited by 117 (1 self)
- Add to MetaCart
XML is emerging as one of the dominant data formats for data processing on the Internet. To query XML data, query languages like XQL, Lorel, XML-QL, or XML-GL have been proposed. In this paper, we study how XML data can be stored and queried using a standard relational database system. For this purpose, we present alternative mapping schemes to store XML data in a relational database and discuss how XML-QL queries can be translated into SQL queries for every mapping scheme. We present the results of comprehensive performance experiments that analyze the tradeos of the alternative mapping schemes in terms of database size, query performance and update performance. While our discussion is focussed on XML and XML-QL, the results of this paper are relevant for most semi-structured data models and most query languages for semi-structured data.
Heuristic and Randomized Optimization for the Join Ordering Problem
- VLDB Journal
, 1997
"... Recent developments in database technology, such as deductive database systems, have given rise to the demand for new, cost-effective optimization techniques for join expressions. In this paper many different algorithms that compute approximate solutions for optimizing join orders are studied since ..."
Abstract
-
Cited by 56 (2 self)
- Add to MetaCart
Recent developments in database technology, such as deductive database systems, have given rise to the demand for new, cost-effective optimization techniques for join expressions. In this paper many different algorithms that compute approximate solutions for optimizing join orders are studied since traditional dynamic programming techniques are not appropriate for complex problems. First, two possible solution spaces, the space of left-deep and bushy processing trees, respectively, are evaluated from a statistical point of view. The result is that the common limitation to leftdeep processing trees is only advisable for certain join graph types. Basically, optimizers from three classes are analysed: heuristic, randomized and genetic algorithms. Each one is extensively scrutinized with respect to its working principle and its fitness for the desired application. It turns out that randomized and genetic algorithms are well suited for optimizing join expressions. They generate solutions of...
Graph Grammar Engineering with PROGRES
, 1995
"... Graph-like data structures and rule-based systems play an important role within many branches of computer science. Nevertheless, their symbiosis in the form of graph rewriting systems or graph grammars are not yet popular among software engineers. This is a consequence of the fact that graph gram ..."
Abstract
-
Cited by 52 (11 self)
- Add to MetaCart
Graph-like data structures and rule-based systems play an important role within many branches of computer science. Nevertheless, their symbiosis in the form of graph rewriting systems or graph grammars are not yet popular among software engineers. This is a consequence of the fact that graph grammar tools were not available until recently and of the lack of knowledge about how to use graph grammars for software development purposes. "Graph grammar engineering" is a first attempt to establish a new graph and rule centered methodology for the development of information system components. Having its roots in the late 80's it gradually evolved from a "paper and pencil" specification formalism to a tool-assisted specification and rapid prototyping approach.
Indexing Semistructured Data
, 1998
"... This paper describes techniques for building and exploiting indexes on semistructured data: data that may not have a fixed schema and that may be irregular or incomplete. We first present a general framework for indexing values in the presence of automatic type coercion. Then based on Lore, a DBMS ..."
Abstract
-
Cited by 46 (2 self)
- Add to MetaCart
This paper describes techniques for building and exploiting indexes on semistructured data: data that may not have a fixed schema and that may be irregular or incomplete. We first present a general framework for indexing values in the presence of automatic type coercion. Then based on Lore, a DBMS for semistructured data, we introduce four types of indexes and illustrate how they are used during query processing. Our techniques and indexing structures are fully implemented and integrated into the Lore prototype. 1 Introduction We call data that is irregular or that exhibits type and structural heterogeneity semistructured, since it may not conform to a rigid, predefined schema. Such data arises frequently on the Web, or when integrating information from heterogeneous sources. In general, semistructured data can be neither stored nor queried in relational or object-oriented database management systems easily and efficiently. We are developing Lore 1 , a database management system d...
Adaptable Pointer Swizzling Strategies in Object Bases: Design, Realization, and Quantitative Analysis
, 1993
"... In this paper, different approaches are classified and evaluated for optimizing the access to main-memory resident persistent objects---techniques which are commonly referred to as "pointer swizzling ". To speed up the access along inter-object references, the persistent pointers in the form of uniq ..."
Abstract
-
Cited by 32 (3 self)
- Add to MetaCart
In this paper, different approaches are classified and evaluated for optimizing the access to main-memory resident persistent objects---techniques which are commonly referred to as "pointer swizzling ". To speed up the access along inter-object references, the persistent pointers in the form of unique object identifiers (OIDs) are transformed (swizzled) into main-memory pointers (addresses). Pointer swizzling techniques can be directed into two classes: (1) strategies that allow replacement of swizzled objects from the buffer before the end of an application program and (2) those that outrule the displacement of swizzled objects. Whereas the latter class of pointer swizzling methods has received much attention in recent literature, the first class---i.e., techniques that take "precautions" for the replacement of swizzled objects---has not yet been thoroughly investigated. Four different pointer swizzling techniques allowing object replacement were investigated and contrasted with the p...
Indexing XML Data with ToXin
- In Proc. of WebDB 2001
, 2001
"... Indexing schemes for semistructured data have been developed in recent years to optimize path query processing by summarizing path information. However, most of these schemes can only be applied to some query processing stages whereas others only support a limited class of queries. To overcome these ..."
Abstract
-
Cited by 30 (6 self)
- Add to MetaCart
Indexing schemes for semistructured data have been developed in recent years to optimize path query processing by summarizing path information. However, most of these schemes can only be applied to some query processing stages whereas others only support a limited class of queries. To overcome these limitations we developed ToXin, an indexing scheme for XML data that fully exploits the overall path structure of the database in all query processing stages. ToXin consists of two different types of structures: a path index that summarizes all paths in the database and can be used for both forward and backward navigation starting from any node, and a value index that supports predicates over values. ToXin synthesizes ideas from object-oriented path indexes and extends them to the semistructured realm of XML data. In this paper we present the ToXin architecture, describe its current implementation, and discuss comparative performance results.
Gras, A Graph-Oriented (software) Engineering Database System
- Information Systems
, 1995
"... Modern software systems for application areas like software engineering, CAD, or office automation are usually highly interactive and deal with rather complex object structures. For the realization of these systems a nonstandard database system is needed which is able to efficiently handle different ..."
Abstract
-
Cited by 16 (4 self)
- Add to MetaCart
Modern software systems for application areas like software engineering, CAD, or office automation are usually highly interactive and deal with rather complex object structures. For the realization of these systems a nonstandard database system is needed which is able to efficiently handle different types of coarse- and fine-grained objects (like documents and paragraphs), hierarchical and non-hierarchical relations between objects (like composition-links and cross-references), and finally attributes of rather different size (like chapter numbers and bitmaps). Furthermore, this database system should support incremental computation of derived data, undo/redo of data modifications, error recovery from system crashes, and version control mechanisms. In this paper, we describe the underlying data model and the functionality of GRAS, a database system which has been designed according to the requirements mentioned above. Furthermore, we motivate our central design decisions concerning its ...

