EXECUTING PARALLEL PROGRAMS WITH SYNCHRONIZATION BOTTLENECKS EFFICIENTLY
Cited by 9 (3 self)
We propose a scheme within which parallel programs with potential synchronization bottlenecks run efficiently. In the straightforward implementations which use basic locking schemes, the execution time for the program parts with bottlenecks increases significantly when the number of processors increases. Our scheme makes the parallel performance for the bottleneck parts of programs close to the sequential performance while maintaining the efficiency with which the nonbottleneck parts run. Experiments with a 64-processor SMP and a 128-processor DSM machine confirmed that parallel programs implemented with our scheme perform much better than parallel programs implemented with other widely-used locking schemes.
Lightweight Software Transactions for Games
Cited by 4 (1 self)
To realize the performance potential of multiple cores, software developers must architect their programs for concurrency. Unfortunately, for many applications, threads and locks are difficult to use efficiently and correctly. Thus, researchers have proposed transactional memory as a simpler alternative. To investigate if and how software transactional memory (STM) can help a programmer to parallelize applications, we perform a case study on a game application called SpaceWars3D. After experiencing suboptimal performance, we depart from classic STM designs and propose a programming model that uses long-running, abort-free transactions that rely on user specifications to avoid or resolve conflicts. With this model we achieve the combined goal of competitive performance and improved programmability.
Compiler and Runtime Support for Shared Memory Parallelization of Data Mining Algorithms
2002
Cited by 2 (2 self)
Data mining techniques focus on finding novel and useful patterns or models from large datasets. Because of the volume of the data to be analyzed, the amount of computation involved, and the need for rapid or even interactive analysis, data mining applications require the use of parallel machines. We have been developing compiler and runtime support for developing scalable implementations of data mining algorithms. Our work encompasses shared memory parallelization, distributed memory parallelization, and optimizations for processing disk-resident datasets.
Achieving High Performance for Parallel Programs that Contain Unscalable Modules
2000
Cited by 1 (1 self)
This thesis is a description of a compiler and runtime technique for the efficient management of threads including their mutual exclusion. The target area for this work is parallel languages for shared-memory multiprocessors. The goal of this work is to achieve a situation in which the execution time either decreases or remains unchanged as the number of processors is increased. We call this performance model the satisfactory performance model. Existing parallel programming systems do not always perform according to this satisfactory model. This is the case when there are modules in the program such that concurrent invocations of the modules are serialized. We call these modules bottleneck modules. When bottleneck modules are present they prevent operation according to the satisfactory performance model since the overhead incurred because of bottleneck modules increases with the number of processors. This overhead includes communications with memory for the sharing of memory objects am...
Eliminating Synchronization Bottlenecks Using Adaptive Replication
2003
Cited by 1 (0 self)
This article presents a new technique, adaptive replication, for automatically eliminating synchronization bottlenecks in multithreaded programs that perform atomic operations on objects. Synchronization bottlenecks occur when multiple threads attempt to concurrently update the same object. It is often possible to eliminate synchronization bottlenecks by replicating objects. Each thread can then update its own local replica without synchronization and without interacting with other threads. When the computation needs to access the original object, it combines the replicas to produce the correct values in the original object. One potential problem is that eagerly replicating all objects may lead to performance degradation and excessive memory consumption. Adaptive
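The replication idea in this abstract can be illustrated with a minimal Python sketch. It is not the article's implementation and omits the adaptive part entirely: it only shows per-thread replicas being updated without shared locking and then combined on read. The class name `ReplicatedCounter`, the helper `run`, and the choice of a simple counter are assumptions made for illustration.

```python
import threading


class ReplicatedCounter:
    """Hypothetical example: every thread updates its own replica."""

    def __init__(self):
        self._local = threading.local()           # holds the calling thread's replica
        self._replicas = []                       # all replicas, kept for combining
        self._registry_lock = threading.Lock()    # taken once per thread, not per update

    def _replica(self):
        rep = getattr(self._local, "rep", None)
        if rep is None:
            rep = [0]                             # one-element list acts as a mutable cell
            self._local.rep = rep
            with self._registry_lock:
                self._replicas.append(rep)
        return rep

    def add(self, n):
        # Only the owning thread writes to this replica, so no lock is needed here.
        self._replica()[0] += n

    def combine(self):
        # Merge the replicas to recover the value of the original object.
        with self._registry_lock:
            return sum(rep[0] for rep in self._replicas)


def run(num_threads=4, updates=1000):
    counter = ReplicatedCounter()

    def worker():
        for _ in range(updates):
            counter.add(1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.combine()                      # combined after all workers finish
```

In this sketch every object is replicated eagerly for every thread, which is exactly the cost the abstract warns about; deciding when replication pays off is the part the article addresses and the sketch does not.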
Fusion of Concurrent Invocations
Transactions of Information Processing Society of Japan, 42(SIG 2 (PRO 9)):13-25, February 2001. In Japanese.
This paper describes a mechanism for "fusing" concurrent invocations of exclusive methods. The target of our work is object-oriented languages with concurrent extensions. In these languages, concurrent invocations of exclusive methods are serialized; only one invocation executes immediately and the others wait for their turn. The mechanism fuses multiple waiting invocations into a cheaper operation such as a single invocation.
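As a rough illustration of fusing waiting invocations, the Python sketch below batches the arguments of callers that are waiting for an exclusive `add` method and applies them as one invocation. It is a hand-written approximation, not the paper's mechanism; the names `FusingAccumulator`, `applied_calls`, and `demo` are hypothetical, and fusion here relies on addition being associative.

```python
import threading


class FusingAccumulator:
    """Hypothetical example: waiting add(n) calls are fused into one add(sum)."""

    def __init__(self):
        self.total = 0
        self.applied_calls = 0                    # times the exclusive body actually ran
        self._pending = []                        # arguments not yet applied
        self._pending_lock = threading.Lock()     # protects the pending list only
        self._exclusive = threading.Lock()        # serializes the exclusive method

    def _exclusive_add(self, n):
        self.total += n
        self.applied_calls += 1

    def add(self, n):
        # Register the argument first, so whoever runs next can pick it up.
        with self._pending_lock:
            self._pending.append(n)
        # Wait for a turn; by then another caller may already have applied our argument.
        with self._exclusive:
            with self._pending_lock:
                batch, self._pending = self._pending, []
            if batch:
                self._exclusive_add(sum(batch))   # one invocation for the whole batch


def demo(num_threads=4, calls=1000):
    acc = FusingAccumulator()

    def worker():
        for _ in range(calls):
            acc.add(1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acc.total, acc.applied_calls
```

When `add` returns, the caller's argument has been applied either by the caller itself or by an earlier lock holder that drained it, so the result matches serialized execution while `applied_calls` can be smaller than the number of invocations.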

