The Gamma database machine project
IEEE Transactions on Knowledge and Data Engineering, 1990
Cited by 203 (27 self)
This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the architecture to be scaled to hundreds of processors. First, all relations are horizontally partitioned across multiple disk drives, enabling relations to be scanned in parallel. Second, novel parallel algorithms based on hashing are used to implement the complex relational operators such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques it is possible to control the execution of very complex queries with minimal coordination, a necessity for configurations involving a very large number of processors. In addition to describing the design of the Gamma software, a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is also presented. In addition to measuring the effect of relation size and indices on the response time for selection, join, aggregation, and update queries, we also analyze the performance of Gamma relative to the number of processors employed when the sizes of the input relations are kept constant (speedup) and when the sizes of the input relations are increased proportionally to the number of processors (scaleup). The speedup results obtained for both selection and join queries are linear; thus, doubling the number of processors
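The first two ideas in this abstract (horizontal partitioning and hash-based parallel join) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the function names, the dictionary row format, and the use of Python's built-in `hash` are illustrative choices, not Gamma's actual algorithms or interfaces.

```python
from collections import defaultdict

def hash_partition(rows, key, n_nodes):
    """Assign each row to one of n_nodes partitions by hashing its key."""
    partitions = [[] for _ in range(n_nodes)]
    for row in rows:
        partitions[hash(row[key]) % n_nodes].append(row)
    return partitions

def local_hash_join(left, right, key):
    """Build a hash table on `left`, then probe it with each row of `right`."""
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    return [{**l, **r} for r in right for l in table.get(r[key], [])]

def partitioned_join(left, right, key, n_nodes):
    """Simulated parallel join: each node joins only its own partition pair."""
    left_parts = hash_partition(left, key, n_nodes)
    right_parts = hash_partition(right, key, n_nodes)
    result = []
    for lp, rp in zip(left_parts, right_parts):  # one iteration per node
        result.extend(local_hash_join(lp, rp, key))
    return result
```

Because both inputs are partitioned with the same hash function, rows with equal keys always land on the same node, so the per-node joins need no coordination with each other.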
Capriccio: Scalable Threads for Internet Services
In Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003
Cited by 130 (5 self)
This paper presents Capriccio, a scalable thread package for use with high-concurrency servers. While recent work has advocated event-based systems, we believe that thread-based systems can provide a simpler programming model that achieves equivalent or superior performance.
Practical Skew Handling in Parallel Joins
In Proceedings of the 18th VLDB Conference, 1992
Cited by 85 (8 self)
We present an approach to dealing with skew in parallel joins in database systems. Our approach is easily implementable within current parallel DBMS, and performs well on skewed data without degrading the performance of the system on non-skewed data. The main idea is to use multiple algorithms, each specialized for a different degree of skew, and to use a small sample of the relations being joined to determine which algorithm is appropriate. We developed, implemented, and experimented with four new skew-handling parallel join algorithms; one, which we call virtual processor range partitioning, was the clear winner in high skew cases, while traditional hybrid hash join was the clear winner in lower skew or no skew cases. We present experimental results from an implementation of all four algorithms on the Gamma parallel database machine. To our knowledge, these are the first reported skew-handling numbers from an actual implementation.
Parallel sorting on a shared-nothing architecture using probabilistic splitting
1991
Cited by 76 (1 self)
We consider the problem of external sorting in a shared-nothing multiprocessor. A critical step in the algorithms we consider is to determine the range of sort keys to be handled by each processor. We consider two techniques for determining these ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, which uses sampling to estimate quantiles. We present analytic results showing that probabilistic splitting performs better than exact splitting. Finally, we present experimental results from an implementation of sorting via probabilistic splitting in the Gamma parallel database machine.
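The idea of probabilistic splitting described here (sample the keys, estimate quantiles, then give each processor one key range) can be sketched as follows. This is a minimal in-memory illustration, not the paper's algorithm: the function names, the default sample size, and the fixed random seed are assumptions made for the example.

```python
import random
from bisect import bisect_right

def estimate_splitters(keys, n_procs, sample_size, seed=0):
    """Estimate n_procs - 1 range boundaries from a random sample of keys."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []
    # Take evenly spaced quantiles of the sorted sample as splitters.
    return [sample[(i * len(sample)) // n_procs] for i in range(1, n_procs)]

def range_partition(keys, splitters):
    """Route each key to the bucket (processor) that owns its key range."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for k in keys:
        buckets[bisect_right(splitters, k)].append(k)
    return buckets

def parallel_sort(keys, n_procs, sample_size=100):
    """Simulated parallel sort: split by estimated ranges, sort each locally."""
    splitters = estimate_splitters(keys, n_procs, sample_size)
    buckets = range_partition(keys, splitters)
    # Buckets cover increasing key ranges, so concatenation is globally sorted.
    return [k for bucket in buckets for k in sorted(bucket)]
```

The output is correct regardless of how good the sample is; sample quality only affects how evenly the keys are spread across processors.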
The Wisconsin benchmark: Past, present, and future
The Benchmark Handbook for Database and Transaction Processing Systems, 1991
Cited by 45 (2 self)
In 1981 as we were completing the implementation of the DIRECT database machine [DEWI79, BORA82], attention turned to evaluating its performance. At that time no standard database benchmark existed. There were only a few application-specific benchmarks. While application-specific benchmarks measure which database system is best for a particular
Multiprocessor Support for Event-Driven Programs
2003
Cited by 34 (0 self)
This paper presents a new asynchronous programming library (libasync-smp) that allows event-driven applications to take advantage of multiprocessors by running code for event handlers in parallel. To control the concurrency between events, the programmer can specify a color for each event: events with the same color (the default case) are handled serially; events with different colors can be handled in parallel. The programmer can incrementally expose parallelism in existing event-driven applications by assigning different colors to computationally-intensive events that do not share mutable state. An
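The coloring rule in this abstract (same color runs serially, different colors may run in parallel) can be sketched with a thread pool. This is a hypothetical illustration and not the libasync-smp API: `run_colored` and the `(color, handler)` event format are invented for the example, and in CPython threads do not give true CPU parallelism for pure-Python handlers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def run_colored(events, max_workers=4):
    """Run (color, handler) events; returns {color: [results in order]}.

    Handlers sharing a color run one after another in submission order.
    Handlers with different colors may run concurrently.
    """
    by_color = defaultdict(list)
    for color, handler in events:
        by_color[color].append(handler)

    def run_serially(handlers):
        return [handler() for handler in handlers]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            color: pool.submit(run_serially, handlers)
            for color, handlers in by_color.items()
        }
        return {color: future.result() for color, future in futures.items()}
```

Giving every event the same color reproduces the fully serial default; assigning a fresh color to an event that shares no mutable state with the others is what exposes parallelism.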
An Overview of Real-Time Database Systems
Advances in Real-Time Systems, 1994
Cited by 31 (3 self)
Traditionally, real-time systems manage their data (e.g., chamber temperature, aircraft locations) in application-dependent structures. As real-time systems evolve, their applications become more complex and require access to more data. It thus becomes necessary to manage the data in a systematic and organized fashion. Database management systems provide tools for such organization, so in recent years there has been interest in "merging" database and real-time technology. The resulting integrated system, which provides database operations with real-time constraints, is generally called a real-time database system (RTDBS) [1]. Like a conventional database system, a RTDBS functions as a repository of data, provides efficient storage, and performs retrieval and manipulation of information. However, as a part of a real-time system, whose "tasks" are associated with time constraints, a RTDBS has the added burden of ensuring some degree of
Locking and Latching in a Memory-Resident Database System
In Proc. of the Int'l Conf. on Very Large Databases, 1992
Cited by 30 (2 self)
As part of the Starburst extensible database project developed at the IBM Almaden Research Center, we designed and implemented a memory-resident storage component that co-exists with Starburst's disk-oriented storage component. The two storage components share the same common services, such as query optimization, transaction management, etc. However, the memory-resident storage component is faster than the disk-oriented storage component and hence needs faster run-time services. This paper examines two run-time services, the lock manager and the latch mechanism, and investigates possible cost-cutting measures. We propose the use of a single latch for protecting a table, all of its indexes, and all of its related lock information, in order to reduce storage component latch costs. We then show that although a table-level latch is a large granule latch, it does not significantly restrict concurrency. We also examine traditional lock manager design and suggest a different design that is a...
Dynamic Load Balancing in Hierarchical Parallel Database Systems
1996
Cited by 18 (1 self)
We consider the execution of multi-join queries in a hierarchical parallel system, i.e., a shared-nothing system whose nodes are shared-memory multiprocessors. In this context, load balancing must be addressed at two levels, locally among the processors of each shared-memory node and globally among all nodes. In this paper, we propose a dynamic execution model that maximizes local load balancing within shared-memory nodes and minimizes the need for load sharing across nodes. This is obtained by allowing each processor to execute any operator that can be processed locally, thereby taking full advantage of inter- and intra-operator parallelism. We conducted a performance evaluation using an implementation on a 72-processor KSR1 computer. The experiments with many queries and large relations show very good speedup results, even with highly skewed data. We show that, in shared-memory, our execution model performs as well as a dedicated model and can scale up very well to deal with multi...
Main-Memory Scan Sharing For Multi-Core CPUs
Cited by 12 (0 self)
Computer architectures are increasingly based on multi-core CPUs and large memories. Memory bandwidth, which has not kept pace with the increasing number of cores, has become the primary processing bottleneck, replacing disk I/O as the limiting factor. To address this challenge, we provide novel algorithms for increasing the throughput of Business Intelligence (BI) queries, as well as for ensuring fairness and avoiding starvation among a concurrent set of such queries. To maximize throughput, we propose a novel FullSharing scheme that allows all concurrent queries, when performing base-table I/O, to share the cache belonging to a given core. We then generalize this approach to a BatchSharing scheme that avoids thrashing on "agg-tables" (hash tables that are used for aggregation processing) caused by execution of too many queries on a core. This scheme partitions queries into batches such that the working-set of agg-table entries for each batch can fit into a cache; an efficient sampling technique is used to estimate selectivities and working-set sizes for purposes of query partitioning. Finally, we use lottery-scheduling techniques to ensure fairness and impose a hard upper bound on staging time to avoid starvation. On our 8-core testbed, we were able to completely remove the memory I/O bottleneck, increasing throughput by a factor of 2 to 2.5, while also maintaining fairness and avoiding starvation.
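The combination of lottery scheduling with a hard bound on waiting time can be sketched generically as below. The abstract does not say how tickets are assigned or how the staging bound is enforced, so everything here is an assumption: the `tickets` and `arrived` fields, the `max_wait` parameter, and the rule that the longest-waiting overdue query runs first.

```python
import random

def pick_next(queries, now, max_wait, rng):
    """Choose the next query to run.

    queries: non-empty list of dicts with 'name', 'tickets', 'arrived'
             (hypothetical fields, with at least one positive ticket count).
    Any query that has waited max_wait or longer is served first, oldest
    first; otherwise a ticket-weighted random draw decides.
    """
    overdue = [q for q in queries if now - q["arrived"] >= max_wait]
    if overdue:
        # Hard bound on waiting time: skip the lottery to avoid starvation.
        return min(overdue, key=lambda q: q["arrived"])
    # Lottery: the chance of winning is proportional to tickets held.
    weights = [q["tickets"] for q in queries]
    return rng.choices(queries, weights=weights, k=1)[0]
```

A caller would pass a seeded `random.Random` instance as `rng` for reproducible runs, and remove the chosen query from the waiting list before the next draw.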

