Results 1 - 10 of 12
A Case for User-Level Dynamic Page Migration
2000
"... This paper presents user-level dynamic page migration, a runtime technique which transparently enables parallel pro-grams to tune their memory performance on distributed shared memory multiprocessors, with feedback obtained from dynamic monitoring of memory activity. Our technique exploits the itera ..."
Abstract
-
Cited by 18 (8 self)
- Add to MetaCart
This paper presents user-level dynamic page migration, a runtime technique which transparently enables parallel programs to tune their memory performance on distributed shared memory multiprocessors, with feedback obtained from dynamic monitoring of memory activity. Our technique exploits the iterative nature of parallel programs and information available to the program both at compile time and at runtime in order to improve the accuracy and timeliness of page migrations, as well as to better amortize the overhead, compared to page migration engines implemented in the operating system. We present an adaptive page migration algorithm based on a competitive and a predictive criterion. The competitive criterion is used to correct poor page placement decisions of the operating system, while the predictive criterion makes the algorithm responsive to scheduling events that necessitate immediate page migrations, such as preemptions and migrations of threads. We also present a new technique for preventing page ping-pong and a mechanism for monitoring the performance of page migration algorithms at runtime and tuning their sensitive parameters accordingly. Our experimental evidence on an SGI Origin2000 shows that unmodified OpenMP codes linked with our runtime system for dynamic page migration are effectively immune to the page placement strategy of the operating system and the associated problems with data locality. Furthermore, our runtime system achieves solid performance improvements compared to the IRIX 6.5.5 page migration engine, for single parallel OpenMP codes and multiprogrammed workloads.
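To make the competitive criterion above concrete, here is a minimal C sketch of such a test over hypothetical per-page, per-node reference counters (of the kind the Origin2000 memory controller maintains). The names, the node count, and the threshold are illustrative assumptions, not taken from the paper.

    /* Hedged sketch (not the authors' code): a competitive migration test over
     * hypothetical per-page, per-node reference counters. */
    #define NNODES 16   /* illustrative node count */

    struct page_counters {
        unsigned long refs[NNODES];   /* references to this page from each node */
        int home;                     /* node currently holding the page */
    };

    /* Competitive criterion: migrate only when some remote node has
     * out-referenced the home node by a tunable factor, which bounds the
     * worst-case cost of a wrong decision. */
    static int pick_migration_target(const struct page_counters *pc, double threshold)
    {
        int best = pc->home;
        unsigned long best_refs = pc->refs[pc->home];

        for (int n = 0; n < NNODES; n++)
            if (pc->refs[n] > best_refs) { best = n; best_refs = pc->refs[n]; }

        if (best != pc->home &&
            (double)best_refs > threshold * (double)(pc->refs[pc->home] + 1))
            return best;   /* migrate the page to node 'best' */
        return -1;         /* leave the page where it is */
    }

The predictive criterion described in the abstract would bypass this counter comparison after a thread preemption or migration and move the affected pages immediately.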
UPMlib: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors
- In Proc. of the 5th ACM Workshop on Languages, Compilers and Runtime Systems for Scalable Computers (LCR’2000), LNCS, 2000
"... Abstract. We present the design and implementation of UPMLIB, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable shared-memory multiprocessors with hardware cache-coherence. UPMLIB integrates information from the compile ..."
Abstract
-
Cited by 18 (9 self)
- Add to MetaCart
(Show Context)
Abstract. We present the design and implementation of UPMlib, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable shared-memory multiprocessors with hardware cache coherence. UPMlib integrates information from the compiler and the operating system to implement algorithms that perform accurate and timely page migrations. The algorithms and the associated mechanisms correlate memory reference information with the semantics of parallel programs and with scheduling events that break the association between threads and the data for which those threads have memory affinity at runtime. Our experimental evidence shows that UPMlib makes OpenMP programs immune to the page placement strategy of the operating system, thus obviating the need for introducing data placement directives in OpenMP. Furthermore, UPMlib provides solid improvements in throughput in multiprogrammed execution environments.
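As an illustration of the compiler/runtime cooperation described above, the following hedged C/OpenMP sketch places monitoring and migration hooks around the outer iterations of an iterative kernel. The upm_* names are hypothetical placeholders (stubbed out so the sketch compiles), not UPMlib's actual interface.

    #include <stddef.h>

    /* Hypothetical runtime hooks standing in for a UPMlib-style interface.
     * A real runtime would sample per-page reference counters here and move
     * poorly placed pages. */
    static void upm_monitor_region(void *addr, size_t len) { (void)addr; (void)len; }
    static void upm_migrate_hot_pages(void) { }

    void solver(double *a, const double *b, long n, int timesteps)
    {
        upm_monitor_region(a, (size_t)n * sizeof(double));

        for (int t = 0; t < timesteps; t++) {
            #pragma omp parallel for
            for (long i = 0; i < n; i++)
                a[i] = 0.5 * (a[i] + b[i]);      /* independent per-element work */

            /* End of an outer iteration: the reference pattern repeats in the
             * next one, so pages migrated here pay off over the remaining
             * iterations. */
            upm_migrate_hot_pages();
        }
    }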
Exploiting Data Locality on Scalable Shared Memory Machines with Data Parallel Programs
- In Proc. of the 6th International EuroPar Conference (EuroPar'2000), 2000
"... . OpenMP offers a high-level interface for parallel programming on scalable shared memory (SMP) architectures providing the user with simple work-sharing directives while relying on the compiler to generate parallel programs based on thread parallelism. However, the lack of language features for ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
OpenMP offers a high-level interface for parallel programming on scalable shared memory (SMP) architectures, providing the user with simple work-sharing directives while relying on the compiler to generate parallel programs based on thread parallelism. However, the lack of language features for exploiting data locality often results in poor performance, since the non-uniform memory access times on scalable SMP machines cannot be neglected. HPF, the de-facto standard for data parallel programming, offers a rich set of data distribution directives in order to exploit data locality, but has mainly been targeted towards distributed memory machines. In this paper we describe an optimized execution model for HPF programs on SMP machines that avails itself of the mechanisms provided by OpenMP for work sharing and thread parallelism, while exploiting data locality based on user-specified distribution directives. This execution model has been implemented in the ADAPTOR HPF compiler...
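One way to picture this execution model: on a NUMA machine that places pages by first touch, an HPF-style BLOCK distribution can be approximated by initializing and computing on an array under the same OpenMP static schedule, so each thread's block is placed on, and later served from, its own node. A minimal generic sketch, not ADAPTOR-generated code:

    /* First-touch emulation of a BLOCK distribution; the kernel is illustrative. */
    void block_distributed_daxpy(double *x, double *y, double alpha, long n)
    {
        /* First touch under the same static schedule used by the compute loop,
         * so each thread's contiguous block lands on its local node. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < n; i++) { x[i] = 0.0; y[i] = 0.0; }

        #pragma omp parallel for schedule(static)
        for (long i = 0; i < n; i++)
            y[i] += alpha * x[i];
    }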
Mixed mode MPI/OpenMP programming
- UK High-End Computing Technology Report, 2000
"... Shared memory architectures are gradually becoming more prominent in the HPC market, as advances in technology have allowed larger numbers of CPUs to have access to a single memory ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
(Show Context)
Shared memory architectures are gradually becoming more prominent in the HPC market, as advances in technology have allowed larger numbers of CPUs to have access to a single memory space.
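For readers unfamiliar with the mixed mode model, a minimal sketch (assuming an MPI library and an OpenMP-capable compiler; the work decomposition is purely illustrative): MPI ranks span the nodes while OpenMP threads share memory within each rank.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank, nprocs;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Coarse decomposition across MPI ranks, fine-grain parallelism
         * across OpenMP threads within each rank. */
        double local = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (int i = rank; i < 1000000; i += nprocs)
            local += 1.0 / (1.0 + (double)i);    /* arbitrary work */

        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum = %f (ranks=%d, threads/rank=%d)\n",
                   global, nprocs, omp_get_max_threads());

        MPI_Finalize();
        return 0;
    }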
A transparent runtime data distribution engine for OpenMP
2000
"... This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
(Show Context)
This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that, due to the low remote-to-local memory access latency ratio of contemporary NUMA architectures, reasonably balanced page placement schemes, such as round-robin or random distribution, incur modest performance losses. Second, the paper presents a transparent, user-level page migration engine with the ability to gain back any performance loss that stems from suboptimal placement of pages in iterative OpenMP programs. The main body of the paper describes how our OpenMP runtime environment uses page migration for implementing implicit data distribution and redistribution schemes without programmer intervention. Our experimental results verify the effectiveness of the proposed framework and provide a proof of concept that it is not necessary to introduce data distribution directives in OpenMP and compromise the simplicity or the portability of the programming model.
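The engine described above ran on IRIX; as a rough modern analogue of a single user-level page move, a runtime on Linux could use move_pages(2) from libnuma. This is a hedged sketch of that analogue, not the authors' implementation; the function name is ours.

    #include <numaif.h>     /* move_pages(); link with -lnuma */
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Try to migrate the page containing 'addr' to 'target_node'; returns the
     * node the page ended up on, or -1 on failure. */
    int migrate_page_to_node(void *addr, int target_node)
    {
        long pagesize = sysconf(_SC_PAGESIZE);
        void *pages[1]  = { (void *)((uintptr_t)addr & ~((uintptr_t)pagesize - 1)) };
        int   nodes[1]  = { target_node };
        int   status[1] = { 0 };

        if (move_pages(0 /* calling process */, 1, pages, nodes, status,
                       MPOL_MF_MOVE) != 0) {
            perror("move_pages");
            return -1;
        }
        return status[0];   /* node number, or a negative errno for this page */
    }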
Leveraging transparent data distribution in OpenMP via user-level dynamic page migration
- Lecture Notes in Computer Science, vol. 1940, 2000
"... Abstract. This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance Fortran, in OpenMP. The vehicle for implementing these facilities in OpenMP without modifying the programming ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Abstract. This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance Fortran, in OpenMP. The vehicle for implementing these facilities in OpenMP, without modifying the programming model or exporting data distribution details to the programmer, is user-level dynamic page migration [9,10]. We have implemented a runtime system called UPMlib, which allows the compiler to inject into the application a smart user-level page migration engine. The page migration engine transparently improves the locality of memory references at the page level on behalf of the application. This engine can accurately and promptly establish effective initial page placement schemes for OpenMP programs. Furthermore, it incorporates mechanisms for tuning page placement across phase changes in the application communication pattern. The effectiveness of page migration in these cases depends heavily on the overhead of page movements, the duration of phases in the application code, and architectural characteristics. In general, dynamic page migration between phases is effective if the duration of a phase is long enough to amortize the cost of page movements.
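A hedged back-of-the-envelope version of this amortization condition (the symbols are ours, not the paper's): suppose a phase makes r references to each of the P pages that would be moved, a remote reference costs an extra latency D over a local one, and moving one page costs c. Migrating at the phase boundary then pays off roughly when

    P * c < P * r * D,   i.e.   r > c / D,

and since r grows with the duration of the phase, longer phases amortize the cost of page movements, as stated above.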
Development of mixed mode MPI / OpenMP applications
- In WOMPAT 2000
"... This paper discusses the implementation, development and performance of mixed mode MPI / OpenMP applications ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
This paper discusses the implementation, development and performance of mixed mode MPI / OpenMP applications.
unknown title
"... 1. INTRODUCTION The current trends in parallel computing indicate that shared memory multiprocessor architectures converge to a common model in which multiple single-processor or symmetric multiprocessor (SMP) nodes are interconnected via a ..."
Abstract
- Add to MetaCart
(Show Context)
The current trends in parallel computing indicate that shared memory multiprocessor architectures converge to a common model in which multiple single-processor or symmetric multiprocessor (SMP) nodes are interconnected via a ...
CS497 Report, First Semester 2003-2004
"... Automatic parallelization of sequential code for a cluster of multiprocessors ..."
Abstract
- Add to MetaCart
(Show Context)
Automatic parallelization of sequential code for a cluster of multiprocessors
The protective effects of ginsenoside Rg1 against hypertension target-organ damage in spontaneously hypertensive rats