A Comparative Evaluation of Cache Coherence Schemes Based on Virtual Memory Support
, 1992
This paper presents an evaluation of a new class of software cache coherence schemes that use virtual memory (VM) support to maintain multiprocessor cache coherence. Traditional VM translation hardware in each processor detects memory access attempts that would violate cache coherence, and system software is used to enforce coherence. The implementation of this class of coherence schemes is extremely economical: it requires neither special multiprocessor hardware nor compiler support, and easily incorporates different consistency models. We have evaluated four consistency models for the VM-based approach: sequential consistency, single-writer release consistency, release consistency, and lazy release consistency. Our trace-driven simulation results show that the VM-based cache coherence schemes are practical for small-scale multiprocessors and that the performance of lazy release consistency for multi-threaded parallel programs is close to the snoopy-cache invalidation-based coherence ap...
An Evaluation of Multiprocessor Cache Coherence Based on Virtual Memory Support
, 1992
This paper presents an evaluation of the impact of several architectural parameters on the performance of Virtual Memory (VM) based cache coherence schemes for shared-memory multiprocessors. The VM-based cache coherence schemes use the traditional VM translation hardware on each processor to detect memory access attempts that might leave caches incoherent, and maintain coherence through VM-level system software. The implementation of this class of coherence schemes is flexible and economical: it allows different consistency models, requires no special hardware for multiprocessor cache coherence, and supports arbitrary interconnection networks. We used trace-driven simulations to evaluate the effect of different architectural parameters on the performance of the VM-based schemes. These parameters include VM page sizes, write-back and write-through caches, memory access latencies, bus and crossbar interconnections, and different cache sizes. Our results show that for appropriate parameter...
SPAM: A Multiprocessor Execution Driven Simulation Kernel
- International Journal in Computer Simulation
, 1996
Trace-driven simulation is a well-known technique for performance evaluation of single-processor computers. However, trace-driven simulation introduces distortions when used to simulate multiprocessor architectures. Execution-driven simulation is the only technique that gives accurate simulation results for multiprocessor architectures, though it is difficult to implement. This paper presents SPAM, a simulation kernel that simplifies the construction of execution-driven simulators for shared-memory multiprocessors. The kernel provides a tracing tool and a set of primitives that allow the execution, tracing, and simulation of shared-memory parallel applications on a single-processor computer. The performance of the kernel allows the simulation of real-sized parallel applications in a reasonable time. Keywords: execution-driven simulation, parallel address traces, shared-memory multiprocessors (Résumé : tsvp) email: Alain.Gefflaut@irisa.fr, Philippe.Joubert@irisa.fr Unité de re...
IRISA
- International Journal in Computer Simulation
, 1993
Trace-driven simulation is a well-known technique for performance evaluation of single-processor computers. However, trace-driven simulation introduces distortions when used to simulate multiprocessor architectures. Execution-driven simulation is the only technique that gives accurate simulation results for multiprocessor architectures, though it is difficult to implement. This paper presents SPAM, a simulation kernel that simplifies the construction of execution-driven simulators for shared-memory multiprocessors. The kernel provides a tracing tool and a set of primitives that allow the execution, tracing, and simulation of shared-memory parallel applications on a single-processor computer. The performance of the kernel allows the simulation of real-sized parallel applications in a reasonable time. Keywords: execution-driven simulation, parallel address traces, shared-memory multiprocessors (Résumé : tsvp) email: Alain.Gefflaut@irisa.fr, Philippe.Joubert@irisa.fr Centre Natio...
Characterizing the Parallel Execution Behavior of some SPLASH-2 Applications on Multiprocessors
, 1999
In order to evaluate the benefits of parallel systems, it is necessary to know how real parallel programs behave. The SPLASH-2 applications provide us with a realistic workload for such systems. We have instrumented the PARMACS macros used by SPLASH-2 applications in order to study their parallel behavior, focusing on the overhead introduced by synchronization and parallelism management. The information obtained can be used to make appropriate scheduling decisions, at both user level and system level. We have studied techniques to increase multiprocessor utilization in order to increase the performance of parallel systems. KEYWORDS: Microkernel, multithreaded, shared-memory multiprocessor, scheduling, synchronization, parallel applications. 1 Introduction In order to evaluate the benefits of parallel systems, it is necessary to know how parallel programs behave when running on such systems. Evaluation cannot rely on very simple and unrealistic test codes. It is desirable to t...
Virtual Clusters: Resource Management on Large Shared-Memory Multiprocessors
, 2000
Despite the fact that large scale shared-memory multiprocessors have been commercially available for several years, system software that fully utilizes all of their features is still not available. These machines require system software that is scalable, supports fault containment, and provides scalable resource management. Software supporting these features is currently unavailable, mostly due to the complexity and cost of making the required changes to the operating system. One proposed alternative is to partition the hardware into small units
Performance Evaluation For Multiprocessors
We present a classification of synchronization delays inherent in multiprocessor systems programmed using the monitor paradigm. This characterization is useful in relating performance of such systems to algorithmic parameters in subproblems such as domain decomposition. We apply this approach to a parallel, adaptive grid code for solving the equations of one-dimensional gas dynamics implemented on shared memory multiprocessors such as the Encore Multimax.
A Technique for Collecting Simultaneous Multithreaded Traces
, 2006
This paper presents a public tool for generating and collecting traces in multithreaded environments, which are suitable for simulating and studying Simultaneous Multithreading (SMT) cache organizations.
Reporting Computational Experiments . . .
- ORSA JOURNAL ON COMPUTING
, 1992
Accompanying the increasing availability of parallel computing technology is a corresponding growth of research into the development, implementation, and testing of parallel algorithms. This paper examines issues involved in reporting on the empirical testing of parallel mathematical programming algorithms, both optimizing and heuristic. We examine the appropriateness of various performance metrics and explore the effects of testing variability, machine influences, testing biases, and the effects of tuning parameters. Some of these difficulties were explored further in a survey sent to leading computational mathematical programming researchers for their reactions and suggestions. A summary of the survey and proposals for conscientious reporting are presented.

