Results 1  10
of
13
NESL: A nested dataparallel language (version 2.6
, 1993
"... The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Wright Laboratory or the U. S. Government. Keywords: Dataparallel, parallel algorithms, supe ..."
Abstract

Cited by 95 (7 self)
 Add to MetaCart
The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Wright Laboratory or the U. S. Government. Keywords: Dataparallel, parallel algorithms, supercomputers, nested parallelism, This report describes Nesl, a stronglytyped, applicative, dataparallel language. Nesl is intended to be used as a portable interface for programming a variety of parallel and vector computers, and as a basis for teaching parallel algorithms. Parallelism is supplied through a simple set of dataparallel constructs based on sequences, including a mechanism for applying any function over the elements of a sequence in parallel and a rich set of parallel functions that manipulate sequences. Nesl fully supports nested sequences and nested parallelism—the ability to take a parallel function and apply it over multiple instances in parallel. Nested parallelism is important for implementing algorithms with irregular nested loops (where the inner loop lengths depend on the outer iteration) and for divideandconquer algorithms. Nesl also provides a performance model for calculating the asymptotic performance of a program on
Static dependent costs for estimating execution time
 In Proc. of the 1994 ACM Conference on LISP and functional programming
, 1994
"... We present the first system for estimating and using datadependent expression execution times in a language with firstclass procedures and imperative constructs. Thepresence of firstclass procedures and imperative constructs makes cost estimation a global problem that can benefit from type informa ..."
Abstract

Cited by 46 (0 self)
 Add to MetaCart
We present the first system for estimating and using datadependent expression execution times in a language with firstclass procedures and imperative constructs. Thepresence of firstclass procedures and imperative constructs makes cost estimation a global problem that can benefit from type information. We estimate expression costs with the aid of an algebraic type reconstruction system that assigns every procedure atype that includes a static dependent cost. A static dependent cost describes the execution time of a procedure in terms of its inputs. In particular, a procedure’s static dependent cost can depend on the size of input data structures and the cost of input firstclass procedures. Our cost system produces symbolic cost expressions that contain free variables describing the size and cost of the procedure’s inputs. At runtime, a cost estimate is dynamically computed from the statically determined cost expression and runtime cost and size information. We present experimental results that validate our cost system onthreecompilers and architectures. We experimentally demonstrate the utility of cost estimates in making dynamic parallelization decisions. In our experience, dynamic parallelization meets or exceeds the parallel performance of any fixed number of processors. 1
LCM: Memory System Support for Parallel Language Implementation
 In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS VI
, 1994
"... Higher1evel parallel programming languages can be difficult to implement efficiently on parallel machines. This paper shows how a flexible, compilercontrolled memory system can help achieve good performance for language constructs that previously appeared too costly to be practical. Our compilerc ..."
Abstract

Cited by 38 (6 self)
 Add to MetaCart
Higher1evel parallel programming languages can be difficult to implement efficiently on parallel machines. This paper shows how a flexible, compilercontrolled memory system can help achieve good performance for language constructs that previously appeared too costly to be practical. Our compilercontrolled memory system is called Loosely Coherent Memory (LCM). It is an example of a larger class of Reconcilable Shared Memory (RSM) systems, which generalize the replication and merge policies of cachecoherent sharedmemory. RSM protocols differ in the action taken by a processor in response to a request for a location and the way in which a processor reconcdes multiple outstanding copies of a location, LCM memory becomes temporarily inconsistent to implement the semantics of C* * parallel functions efficiently. RSM provides a compiler with control over memorysystem policies, which it can use to implement a language’s semantics, improve performance, or detect errors. We illustrate the first two points with LCM and our compiler for the dataparallel language C**.
Abstract interpretation based formal methods and future challenges, invited paper
 Informatics — 10 Years Back, 10 Years Ahead, volume 2000 of Lecture Notes in Computer Science
, 2001
"... Abstract. In order to contribute to the solution of the software reliability problem, tools have been designed to analyze statically the runtime behavior of programs. Because the correctness problem is undecidable, some form of approximation is needed. The purpose of abstract interpretation is to f ..."
Abstract

Cited by 29 (6 self)
 Add to MetaCart
Abstract. In order to contribute to the solution of the software reliability problem, tools have been designed to analyze statically the runtime behavior of programs. Because the correctness problem is undecidable, some form of approximation is needed. The purpose of abstract interpretation is to formalize this idea of approximation. We illustrate informally the application of abstraction to the semantics of programming languages as well as to static program analysis. The main point is that in order to reason or compute about a complex system, some information must be lost, that is the observation of executions must be either partial or at a high level of abstraction. In the second part of the paper, we compare static program analysis with deductive methods, modelchecking and type inference. Their foundational ideas are briefly reviewed, and the shortcomings of these four methods are discussed, including when they should be combined. Alternatively, since program debugging is still the main program verification
Shape Checking of Array Programs
 In Computing: the Australasian Theory Seminar, Proceedings
, 1997
"... Shape theory provides a framework for the study of data types in which shape and data can be manipulated separately. This paper is concerned with shape checking, i.e. the detection of shape errors, such as array bound errors, without handling the data stored within. It can be seen as a form of parti ..."
Abstract

Cited by 20 (5 self)
 Add to MetaCart
Shape theory provides a framework for the study of data types in which shape and data can be manipulated separately. This paper is concerned with shape checking, i.e. the detection of shape errors, such as array bound errors, without handling the data stored within. It can be seen as a form of partial evaluation in which data computations are ignored. We construct a simplytyped lambdacalculus that supports a vector type constructor, whose iteration yields types of arrays. It is expressive enough to construct all of the usual linear algebra operations. All shape errors in a term t can be detected by evaluating its shape #t. Evaluation of #t will terminate if that of t does. Keywords shape analysis, partial evaluation, arrays, higherorder. 1 Introduction Shape theory explores the consequences of manipulating shape and data separately (Jay [14]). Shape refers to the data structure in which the data is stored. For example, the shape of a threedimensional regular array is a tuple of...
Eekelen. Polynomial size analysis of firstorder functions
, 2007
"... Abstract. We present a sizeaware type system for firstorder shapely function definitions. Here, a function definition is called shapely when the size of the result is determined exactly by a polynomial in the sizes of the arguments. Examples of shapely function definitions may be matrix multiplica ..."
Abstract

Cited by 17 (9 self)
 Add to MetaCart
Abstract. We present a sizeaware type system for firstorder shapely function definitions. Here, a function definition is called shapely when the size of the result is determined exactly by a polynomial in the sizes of the arguments. Examples of shapely function definitions may be matrix multiplication and the Cartesian product of two lists. The type checking problem for the type system is shown to be undecidable in general. We define a natural syntactic restriction such that the type checking becomes decidable, even though size polynomials are not necessarily linear or monotonic. Furthermore, a method that infers polynomial size dependencies for a nontrivial class of function definitions is suggested. 1
Efficient parallel algorithms for closest point problems
, 1994
"... This dissertation develops and studies fast algorithms for solving closest point problems. Algorithms for such problems have applications in many areas including statistical classification, crystallography, data compression, and finite element analysis. In addition to a comprehensive empirical study ..."
Abstract

Cited by 10 (1 self)
 Add to MetaCart
This dissertation develops and studies fast algorithms for solving closest point problems. Algorithms for such problems have applications in many areas including statistical classification, crystallography, data compression, and finite element analysis. In addition to a comprehensive empirical study of known sequential methods, I introduce new parallel algorithms for these problems that are both efficient and practical. I present a simple and flexible programming model for designing and analyzing parallel algorithms. Also, I describe fast parallel algorithms for nearestneighbor searching and constructing Voronoi diagrams. Finally, I demonstrate that my algorithms actually obtain good performance on a wide variety of machine architectures. The key algorithmic ideas that I examine are exploiting spatial locality, and random sampling. Spatial decomposition provides allows many concurrent threads to work independently of one another in local areas of a shared data structure. Random sampling provides a simple way to adaptively decompose irregular problems, and to balance workload among many threads. Used together, these techniques result in effective algorithms for a wide range of geometric problems. The key
A Transformational Approach which Combines Size Inference and Program Optimization
 Semantics, Applications, and Implementation of Program Generation (SAIG’01), Lecture Notes in Computer Science 2196
, 2001
"... If functional programs are to be used for highperformance computing, efficient data representations and operations must be provided. Our contribution is a calculus for the analysis of the lengths of (nested) lists and a transformation into a form which is liberated from the chain of consoperations ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
If functional programs are to be used for highperformance computing, efficient data representations and operations must be provided. Our contribution is a calculus for the analysis of the lengths of (nested) lists and a transformation into a form which is liberated from the chain of consoperations and which sometimes permits array implementations even if the length depends on runtime values. A major advantage of functional programs vs. imperative programs is that dependence analysis is much easier, due to the absence of reassignments. One severe disadvantage of functional programs as of yet is that efficient, machineoriented data structures (like the array) absolutely necessary for highperformance computing play a minor role in many language implementations since they do not harmonize with functional evaluation schemata (like graph reduction), which are at a higher level of abstraction. We propose to construct programs by composition of skeletons, i.e., functi...
A FunctionComposition Approach to Synthesize Fortran 90 Array Operations
 Journal of Parallel and Distributed Computing
, 1998
"... An increasing number of programming languages, such as Fortran 90 and APL, are providing a rich set of intrinsic array functions and array expressions. These constructs which constitute an important part of data parallel languages provide excellent opportunities for compiler optimizations. In this p ..."
Abstract

Cited by 4 (3 self)
 Add to MetaCart
An increasing number of programming languages, such as Fortran 90 and APL, are providing a rich set of intrinsic array functions and array expressions. These constructs which constitute an important part of data parallel languages provide excellent opportunities for compiler optimizations. In this paper, we present a new approach to combine consecutive array operations or array expressions into a composite access function of the source arrays. Our scheme is based on the composition of access functions, which is analogous to a composition of mathematic functions. Our new scheme can handle not only data movements of arrays with different numbers of dimensions and with multipleclause array operations but also masked array expressions and multiplesource array operations. As a result, our proposed scheme is the first synthesis scheme which can collectively synthesize Fortran 90 RESHAPE, EOSHIFT, MERGE, array reduction operations, and WHERE constructs. In addition, we also discuss the case...