Program and Data Transformations for Efficient Execution on Distributed Memory Architectures
, 1993
Cited by 20 (6 self)
This report is concerned with the efficient execution of array computation on Distributed Memory Architectures by applying compiler-directed program and data transformations. By translating a sub-set of a single-assignment language, Sisal, into a linear algebraic framework it is possible to transform a program so as to reduce load imbalance and non-local memory access. A new test is presented which allows the construction of transformations to reduce load imbalance. By a new expression of data alignment, transformations to reduce non-local access are derived. A new pre-fetching procedure, which prevents redundant non-local accesses, is presented and forms the basis of a new data partitioning methodology. By applying these transformations in a straightforward manner to some well known scientific programs, it is shown that this approach is competitive with hand-crafted methods. Preface The author graduated from Aston University in 1987 with an upper second B.Sc.(Hons.) in Computationa...
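To make the notion of non-local access concrete, here is a minimal Python sketch. It is not the report's method: it assumes a one-dimensional array block-distributed over `p` processors, and the names `owner` and `nonlocal_reads` are hypothetical. It counts how many reads of `b[i + shift]` cross a processor boundary when computing `a[i]`.

```python
def owner(i, n, p):
    """Processor that owns element i of an n-element array
    block-distributed over p processors (assumed layout)."""
    block = -(-n // p)  # ceil(n / p)
    return i // block


def nonlocal_reads(n, p, shift):
    """Count reads of b[i + shift] that land on a different processor
    than the one computing a[i]. Out-of-range reads are skipped."""
    count = 0
    for i in range(n):
        j = i + shift
        if 0 <= j < n and owner(j, n, p) != owner(i, n, p):
            count += 1
    return count
```

With 16 elements on 4 processors, a shift of 1 costs one non-local read per block boundary, while a shift equal to the block size makes every in-range read non-local. Aligning `b` so that the element needed by `a[i]` lives with `a[i]` corresponds to `shift = 0`, which needs no non-local reads.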
Code Generations, Evaluations, and Optimizations in Multithreaded Executions
, 1995
Cited by 10 (0 self)
Efficient large-scale parallel processing can result only from proper handling of latency. Latency arises either from remote memory accesses or synchronizations. Multithreading is an execution model that can effectively deal with latency by switching among a set of ready threads. This model has been proposed in a variety of forms: a unit of storage can be based on either a collection of threads or a single thread, threads can be either blocking or non-blocking, and synchronization can be either implicit or explicit. This dissertation describes research in the evaluation and optimization of various issues in multithreading. Issues of particular interest are the development of a multithreaded execution model to be used as a test-bed and a hybrid code generation scheme where threads are generated in a top-down manner and then optimized in a bottom-up fashion. Various forms of locality are also ide...
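The idea of hiding latency by switching among ready threads can be sketched in a few lines of Python. This is an illustration only, not the dissertation's execution model: threads are modeled as generators, a `yield` stands in for a remote access in flight, and the names `worker` and `run` are hypothetical.

```python
from collections import deque


def worker(name, remote_reads):
    """A thread that issues `remote_reads` remote accesses and
    suspends (yields) at each one instead of stalling."""
    for k in range(remote_reads):
        yield f"{name}: issued remote read {k}"


def run(threads):
    """Round-robin scheduler: take the next ready thread, run it until
    it suspends, then move it to the back of the ready queue."""
    ready = deque(threads)
    trace = []
    while ready:
        t = ready.popleft()
        try:
            trace.append(next(t))
            ready.append(t)
        except StopIteration:
            pass  # thread finished; drop it from the queue
    return trace
```

Calling `run([worker("A", 2), worker("B", 1)])` interleaves the two threads, so B makes progress while A's first read is outstanding.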
Software challenges in extreme scale systems
- Journal of Physics: Conference Series
, 2009
Cited by 7 (2 self)
Computer systems anticipated in the 2015–2020 timeframe are referred to as Extreme Scale because they will be built using massive multi-core processors with hundreds of cores per chip. The largest capability Extreme Scale system is expected to deliver Exascale performance of the order of 10^18 operations per second. These systems pose new critical challenges for software in the areas of concurrency, energy efficiency and resiliency. In this paper, we discuss the implications of the concurrency and energy efficiency challenges on future software for Extreme Scale Systems. From an application viewpoint, the concurrency and energy challenges boil down to the ability to express and manage parallelism and locality by exploring a range of strong scaling and new-era weak scaling techniques. For expressing parallelism and locality, the key challenges are the ability to expose all of the intrinsic parallelism and locality in a programming model, while ensuring that this expression of parallelism and locality is portable across a range of systems. For managing parallelism and locality, the OS-related challenges include parallel scalability, spatial partitioning of OS and application functionality, direct hardware access for inter-processor communication, and asynchronous rather than interrupt-driven events, which are accompanied by runtime system challenges for scheduling, synchronization, memory management, communication, performance monitoring, and power management. We conclude by discussing the importance of software-hardware codesign in addressing the fundamental challenges for application enablement on Extreme Scale systems.
Sassy: A Language and Optimizing Compiler for Image Processing on Reconfigurable Computing Systems
- in International Conference on Vision Systems. 1999. Las Palmas de Gran Canaria
, 1999
Cited by 6 (4 self)
This paper presents Sassy, a single-assignment variant of the C programming language developed in concert with Khoral Inc. and designed to exploit both coarse-grain and fine-grain parallelism in image processing applications. Sassy programs are written in the Khoros software development environment, and can be manipulated inside Cantata (the Khoros GUI). The Sassy language supports image processing with true multidimensional arrays, sophisticated array access and windowing mechanisms, and built-in reduction operators (e.g. histogram). At the same time, Sassy restricts C so as to enable compiler optimizations for parallel execution environments, with the goal of reducing data traffic, code size and execution time. In particular, the Sassy language and its optimizing compiler target reconfigurable systems, which are fine-grain parallel processors. Reconfigurable systems consist of field-programmable gate arrays (FPGAs), memories and interconnection hardware, and can be used as inexpensive co-processors with conventional workstations or PCs. The compiler optimizations needed to generate highly optimal host, FPGA, and communication code are discussed. The massive parallelism and high throughput of reconfigurable systems make them well-suited to image processing tasks, but they have not previously been used in this context because they are typically programmed in hardware description languages such as VHDL. Sassy was developed as part of the Cameron project, with the goal of elevating the programming level for reconfigurable systems from hardware circuits to programming language.
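The two language features the abstract names, windowed array access and a histogram reduction, can be illustrated with a short Python sketch. This is not Sassy syntax, and the function names `windows`, `window_sums` and `histogram` are hypothetical; the sketch assumes a small image stored as a list of rows with integer pixel values.

```python
def windows(image, h, w):
    """Yield every h-by-w window of a 2-D image in row-major order."""
    rows, cols = len(image), len(image[0])
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            yield [row[c:c + w] for row in image[r:r + h]]


def window_sums(image, h, w):
    """Reduce each window to the sum of its pixels."""
    return [sum(sum(row) for row in win) for win in windows(image, h, w)]


def histogram(values, bins):
    """Reduce integers in the range [0, bins) to a list of counts."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts
```

Each window and each histogram is computed from its inputs without modifying them, so the computations are independent of one another. That independence is the property a single-assignment language exposes to a parallelizing compiler.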
A Formal Semantics and an Interactive Environment for Sisal
, 1995
Cited by 3 (3 self)
We present a formal definition of the dynamic semantics of a significant part of the language Sisal 2.0 in the structural operational style of Natural Semantics, using Typol inference rules within the Centaur system, a generic specification environment. Sisal is a strongly typed, applicative, single assignment language in use on a variety of parallel processors, including conventional multiprocessors, vector machines and data-flow machines. The motivations of our work are, with a formal semantic description of Sisal, to provide a firm foundation for understanding and evaluating language design issues, aid the elimination of ambiguities in the language, and provide a valuable reference for both implementors and programmers. At the same time, Centaur specifications automatically yield a structure editor and an interpreter for Sisal, which can be developed into an interactive environment for Sisal programming. Besides classical dynamic semantic aspects of functional programming languages such as the absence of side-effects and aliasing, the notion of referential transparency, and higher-order functions, we have characterized specific semantic aspects of the Sisal language such as arrays, infinite streams, sequential and parallel loops. From this semantic definition, we intend to formally define program transformations, particularly parallelization techniques and algorithms for Sisal compilation, and to incorporate such techniques into a program development and visualization environment for Sisal programming.
D-OSC: A SISAL Compiler for Distributed-Memory Machines
- In Proceedings of the 2nd Parallel Computation and Scheduling Workshop
, 1997
Cited by 2 (0 self)
In this paper we present the results of the implementation of D-OSC: a prototype SISAL compiler for distributed-memory machines. The compiler is an extension of the Optimizing SISAL Compiler (OSC) version 12.9.1; it generates C code with calls to the message-passing library MPI. The main goals of this implementation are to obtain a SISAL compiler that generates code for distributed-memory machines and to identify the compiler optimizations that can improve the performance of the code generated for the target architecture. 1 Introduction Two issues involved in writing programs for distributed-memory machines are to identify the available parallelism in the program and to implement the program on the target architecture such that the parallelism is exploited. During the past decade the programmer was responsible for identifying the opportunities for parallelism and deciding on the best implementation that exploited such parallelism. Parallel programming is not an easy task, since it usu...
From a Formal Dynamic Semantics of Sisal to a Sisal Environment
- IN 28TH HAWAII INTERNATIONAL CONFERENCE OF SYSTEM SCIENCES (HICSS), MAUI
, 1995
Cited by 1 (0 self)
We present a formal definition of the dynamic semantics of a significant part of the language Sisal 2.0 in the structural operational style of Natural Semantics, using Typol inference rules within the Centaur system, a generic specification environment. Sisal is a strongly typed, applicative, single assignment language in use on a variety of parallel processors, including conventional multiprocessors, vector machines and data-flow machines. The motivations of our work are, with a formal semantic description of Sisal, to provide a firm foundation for understanding and evaluating language design issues, aid the elimination of ambiguities in the language, provide a valuable reference for both implementors and programmers, and facilitate comparison of Sisal with other parallel functional languages. At the same time, Centaur specifications automatically yield a structure editor and an interpreter for Sisal, which can be developed into an interactive environment for Sisal programmi...
Distributed Runtime Support For Task And Data Management
, 1993
Cited by 1 (0 self)
High-performance computer architectures are evolving into larger and faster systems and, in particular, distributed memory multiprocessors represent the most powerful class of computers built today. Their available resources provide a programmer with the potential for exploiting massive amounts of parallelism in an application, and yet support for high-level programming languages on these machines is sparse. Thus the need is great for software systems that can free the programmer from the implementation details of an architecture. This dissertation focuses on the design of software support for a high-level functional language on conventional distributed memory multiprocessors. Specifically, we present the design and implementation of a runtime system that provides implicit support for both thread management and data management, and study the effects of latency avoidance and latency tolerance on a set of sampl...
A Formal Semantics for Sisal Arrays
Cited by 1 (0 self)
We present a formal definition of the dynamic semantics of arrays in the functional language Sisal 2.0. We adopt a logical setting: the structural operational style of Natural Semantics, using the Typol inference rules within the Centaur system, a generic programming environment. From the formal specifications, a development and visualization environment for Sisal programming is generated. This semantic definition should allow for a precise comparison of array facilities in similar languages. Moreover, this work is the basis for a formal description of program transformations (e.g. parallelizations) which are crucial in the compilation techniques of functional languages such as Sisal.
Optimizing Sisal Programs: a Formal Approach
We formally describe optimization techniques for the compilation of the language Sisal 2.0. More precisely, we translate Sisal programs into data-flow IF1 graphs and optimize these graphs. An interactive visualization environment for IF1 graphs is also provided. 1 Introduction Software engineering of parallel programming is becoming an important issue in computer science; one focus has been to design compiler schemes that will provide efficient parallel code running on various parallel architectures. Sometimes complementary, sometimes orthogonal, the increasing importance of tools and environments dedicated to parallel computing is another significant aspect (see [9, 1, 18] for recent developments). The trade-off between the programmer's and compiler's part in designing parallel programs has been debated at least for a decade. One solution to ensure portability and reusability is a well-specified language with a high level of abstraction; moreover, a compiler for such language m...

