Results 1 - 10 of 21
The state of the art in distributed query processing
- ACM Computing Surveys
, 2000
Cited by 182 (2 self)
Distributed data processing is fast becoming a reality. Businesses want to have it for many reasons, and they often must have it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), there are a number of issues which still make distributed data processing a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated (such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment). This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intra-query parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multi-tier), and heterogeneous database systems and shows how query processing works in these systems.
Categories and subject descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: distributed databases, query processing; H.2.5 [Heterogeneous Databases]: data translation
General terms: algorithms; performance
Additional key words and phrases: query optimization; query execution; client-server databases; middleware; multi-tier architectures; database application systems; wrappers; replication; caching; economic models for query processing; dissemination-based information systems
Evaluating functional joins along nested reference sets in object-relational and object-oriented databases
- In Proc. of the Conf. on Very Large Data Bases (VLDB
, 1998
Cited by 11 (5 self)
Previous work on functional joins was constrained in two ways: (1) all approaches we know assume references being implemented as physical
An Efficient XML Node Identification and Indexing Scheme
, 2003
Cited by 11 (4 self)
Path and tree pattern queries build the core of almost all XML query languages. Current index structures that support an efficient evaluation of such queries, however, often have several deficiencies in that they (a) are limited in their support of query patterns, (b) ignore data values and readily available structural summary information about the XML data source, (c) require expensive joins for every edge in the query tree, or (d) are very space inefficient. Due to the nature of XML-...
The Vagabond Temporal OID Index: An Index Structure for OID Indexing in Temporal Object Database Systems
, 1999
Cited by 10 (6 self)
In an object database system using logical OIDs, an OID index (OIDX) is necessary to map from logical OID to the physical location of an object. In a temporal object database system (TODB), this OIDX also contains the timestamps of the object versions. We have previously studied OIDX performance using a relatively simple index. However, this study showed that OIDX maintenance can be very costly, and is likely to become the bottleneck of such a system. The main reason for this is that in a temporal ODB, the OIDX needs to be updated every time an object is updated. This has convinced us that a new index structure, particularly suitable to TODB requirements, is necessary. In this paper, we describe an OIDX for TODBs, which we call The Vagabond Temporal OID Index (VTOIDX). The main goals of the VTOIDX are 1) support for temporal data, while still having index performance close to a non-temporal (one-version) database system, 2) efficient object-relational operation, and 3) flexibl...
Optimizing OID Indexing Cost in Temporal Object-Oriented Database Systems
- In Proceedings of the 5th International Conference on Foundations of Data Organization, FODO'98
, 1998
Cited by 9 (3 self)
In object-oriented database systems (OODB) with logical OIDs, an OID index (OIDX) is needed to map from OID to the physical location of the object. In a transaction time temporal OODB, the OIDX should also index the object versions. In this case, the index entries, which we call object descriptors (OD), also include the commit timestamp of the transaction that created the object version. In this report, we develop an analytical model for OIDX access costs in temporal OODBs. The model includes the index page buffer as well as an OD cache. We use this model to study access cost and optimal use of memory for index page buffer and OD cache, with different access patterns. The results show that 1) the OIDX access cost can be high, and can easily become a bottleneck in large temporal OODBs, 2) the optimal OD cache size can be relatively large, and 3) the gain from using an optimal size is considerable, and because the access pattern in a database system can be very dynamic, the system should be ab...
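As a rough illustration of the object descriptors mentioned in the abstract above, the following Python sketch keeps, for each OID, a list of descriptors that pair a commit timestamp with a physical location. The names `TemporalOIDIndex`, `ObjectDescriptor`, and `location`, the string location format, and the in-memory dictionary are all assumptions made for illustration; the index page buffer, OD cache, and cost model from the report are not represented.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectDescriptor:
    """One index entry: which version, and where it is stored (hypothetical layout)."""
    commit_ts: int   # commit timestamp of the transaction that created this version
    location: str    # stand-in for a physical address, e.g. "page3:slot1"


class TemporalOIDIndex:
    """Toy in-memory OID index: OID -> descriptors ordered by commit timestamp."""

    def __init__(self):
        self._versions = {}

    def add_version(self, oid, commit_ts, location):
        # Every object update adds a descriptor, which is why index
        # maintenance cost grows with the update rate in a temporal system.
        ods = self._versions.setdefault(oid, [])
        if ods and commit_ts <= ods[-1].commit_ts:
            raise ValueError("commit timestamps must increase per object")
        ods.append(ObjectDescriptor(commit_ts, location))

    def lookup(self, oid, as_of=None):
        """Return the current descriptor, or the one valid at time `as_of`."""
        ods = self._versions.get(oid, [])
        if not ods:
            return None
        if as_of is None:
            return ods[-1]
        # Find the last version committed at or before `as_of`.
        i = bisect.bisect_right([od.commit_ts for od in ods], as_of)
        return ods[i - 1] if i else None
```

With such a structure, `lookup(oid)` returns the newest version and `lookup(oid, as_of=t)` returns the version that was current at time `t`, or `None` if the object did not yet exist.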
The Persistent Cache: Improving OID Indexing in Temporal Object-Oriented Database Systems
- In Proceedings of the 25th VLDB Conference
, 1999
Cited by 9 (4 self)
In a temporal OODB, an OID index (OIDX) is needed to map from OID to the physical location of the object. In a transaction time temporal OODB, the OIDX should also index the object versions. In this case, the index entries, which we call object descriptors (OD), also include the commit timestamp of the transaction that created the object version. The OIDX in a non-temporal OODB only needs to be updated when an object is created, but in a temporal OODB, the OIDX has to be updated every time an object is updated. We have shown in a previous study that this can be a potential bottleneck, and in this report, we present the Persistent Cache (PCache), a novel approach which reduces the index update and lookup costs in temporal OODBs. In this report, we develop a cost model for the PCache, and use this to show that the use of a PCache can reduce the average access cost to only a fraction of the cost when not using the PCache. Even though the primary context of this report is OID indexing in ...
Functional Join Processing
, 2000
Cited by 7 (5 self)
Inter-object references are one of the key concepts of object-relational and object-oriented database systems. In this work, we investigate alternative techniques to implement inter-object references and make the best use of them in query processing, i.e., in evaluating functional joins. We will give a comprehensive overview and performance evaluation of all known techniques for simple (single-valued) as well as multi-valued functional joins. Furthermore, we will describe special order-preserving functional-join techniques that are particularly attractive for decision support queries that require ordered results. While most of the presentation of this paper is focused on object-relational and object-oriented database systems, some of the results can also be applied to plain relational databases because index nested-loop joins along key/foreign-key relationships, as they are frequently found in relational databases, are just one particular way to execute a functional join. Key words: O...
Efficient Use of Signatures in Object-Oriented Database Systems
- In Proceedings of Advances in Databases and Information Systems, ADBIS'99
, 1999
Cited by 6 (2 self)
Signatures are bit strings, which are generated by applying some hash function on some or all of the attributes of an object. The signatures of the objects can be stored separately from the objects themselves, and can later be used to filter out candidate objects during perfect match queries. In an object-oriented database system (OODB) using logical OIDs, an object identifier index (OIDX) is needed to map from logical OID to the physical location of the object. In this report we show how the signatures can be stored in the OIDX, and used to reduce the average object access cost in a system. We also extend this approach to transaction time temporal OODBs (TOODB), where this approach is even more beneficial, because maintaining signatures comes virtually for free. We develop a cost model that we use to analyze the performance of the proposed approaches, and this analysis shows that substantial gain can be achieved. Keywords: Signatures, object-oriented database systems, temporal objec...
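The signature filtering idea in the abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say which hash function or encoding is used, so the sketch picks superimposed coding with SHA-256, a 64-bit signature width, and three bits per attribute. The function names (`attr_bits`, `signature`, `perfect_match`) are hypothetical, and storing signatures in the OIDX is not modeled; a plain dictionary stands in for it.

```python
import hashlib

SIG_BITS = 64  # assumed signature width, chosen for illustration only


def attr_bits(name, value, bits_per_attr=3):
    """Hash one attribute to a mask with up to `bits_per_attr` bits set."""
    mask = 0
    for i in range(bits_per_attr):
        digest = hashlib.sha256(f"{name}={value!r}#{i}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def signature(attrs):
    """Superimpose (bitwise OR) the masks of all given attributes."""
    sig = 0
    for name, value in attrs.items():
        sig |= attr_bits(name, value)
    return sig


def perfect_match(objects, signatures, query):
    """Return OIDs whose attributes equal every name/value pair in `query`.

    `signatures` is consulted first, so objects that cannot match are
    skipped without being fetched. A signature hit may still be a false
    positive, so each candidate is checked against the real object.
    """
    qsig = signature(query)
    candidates = [oid for oid, sig in signatures.items() if sig & qsig == qsig]
    return [
        oid for oid in candidates
        if all(objects[oid].get(k) == v for k, v in query.items())
    ]
```

Because an object's signature contains every bit of each of its attributes, the filter never rejects a true match; it only saves work by discarding objects whose signature lacks some bit of the query signature.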
An analytical study of object identifier indexing
- In Proceedings of the 9th International Conference on Database and Expert Systems Applications, DEXA’98
, 1998
Cited by 6 (2 self)
The object identifier index of an object-oriented database system is typically 20% of the size of the database itself, and for large databases, only a small part of the index fits in main memory. To avoid index retrievals becoming a bottleneck, efficient buffering strategies are needed to minimize the number of disk accesses. In this report, we develop analytical cost models which we use to find optimal sizes of index page buffer and index entry cache, for different memory sizes, index sizes, and access patterns. Because existing buffer hit estimation models are not applicable for index page buffering in the case of tree based indexes, we have also developed an analytical model for index page buffer performance. The cost gain from using the results in this report is typically in the order of 200-300%. Thus, the results should be of valuable use in optimizers and tools for configuration and tuning of object-oriented database systems.
On the Cost of Monitoring and Reorganization of Object Bases for Clustering
- SIGMOD Record
, 1996
Cited by 6 (0 self)
Clustering is one of the most effective means to enhance the performance of object base applications. Consequently, many proposals exist for algorithms computing good object placements depending on the application profile. However, in an effective object base reorganization tool the clustering algorithm is only one constituent. In this paper, we report on our object base reorganization tool that covers all stages of reorganizing the objects: the application profile is determined by a monitoring tool, the object placement is computed from the monitored access statistics utilizing a variety of clustering algorithms and, finally, the reorganization tool restructures the object base accordingly. The costs as well as the effectiveness of these tools are quantitatively evaluated on the basis of the OO1-benchmark. 1 Introduction Ever since the "early days" of database management systems, clustering has proven to be one of the most effective performance enhancement techniques. Therefore, many...

