Results 1 - 8 of 8
Implementation of Stack-Based Languages on Register Machines
, 1996
Cited by 13 (0 self)
Languages with programmer-visible stacks (stack-based languages) are used widely, as intermediate languages (e.g., JavaVM, FCode), and as languages for human programmers (e.g., Forth, PostScript). However, the prevalent computer architecture is the register machine. This poses the problem of efficiently implementing stack-based languages on register machines. A straightforward implementation of the stack consists of a memory area that contains the stack items, and a pointer to the top-of-stack item. The basic
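The straightforward stack layout described in this abstract can be illustrated with a minimal Python sketch. The class name `MemoryStack`, the fixed capacity, and the `add` operation are assumptions made for illustration only; they are not taken from the paper.

```python
class MemoryStack:
    """Stack kept in a memory area, with a pointer to the top-of-stack item."""

    def __init__(self, capacity=16):
        self.memory = [0] * capacity  # memory area that contains the stack items
        self.sp = -1                  # index of the top-of-stack item; -1 means empty

    def push(self, value):
        if self.sp + 1 >= len(self.memory):
            raise OverflowError("stack overflow")
        self.sp += 1                  # update the stack pointer
        self.memory[self.sp] = value  # store to memory

    def pop(self):
        if self.sp < 0:
            raise IndexError("stack underflow")
        value = self.memory[self.sp]  # load from memory
        self.sp -= 1                  # update the stack pointer
        return value

    def add(self):
        # A stack-based ADD: two loads, one store, and three pointer updates.
        right = self.pop()
        left = self.pop()
        self.push(left + right)


stack = MemoryStack()
stack.push(2)
stack.push(3)
stack.add()
```

Every operation in this sketch touches memory and the stack pointer, which is the kind of overhead an implementation on a register machine would try to reduce.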
Scheduling Expression DAGs for Minimal Register Need
, 1998
Cited by 4 (2 self)
Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now, an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial $O(n!)$ enumeration algorithm. We present a new algorithm with worst-case complexity $O(n^2 2^n)$ and very good average behaviour. Applying a dynamic programming scheme and reordering techniques, our algorithm is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors. Key words:...
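To make the contrast between factorial enumeration and dynamic programming concrete, the following Python sketch computes the minimal register need of a small DAG in both ways. It is a simplified illustration, not the algorithm of the paper. The example DAG, the function names, and the register model are assumptions: a value occupies a register from its evaluation until its last parent has been evaluated, and a result may reuse the register of an operand that dies at that instruction.

```python
from functools import lru_cache
from itertools import permutations

# Hypothetical example DAG: node -> operand nodes. Encodes (a + b) * (b - c).
EXAMPLE_DAG = {
    "a": (), "b": (), "c": (),
    "add": ("a", "b"),
    "sub": ("b", "c"),
    "mul": ("add", "sub"),
}

def parents_of(dag):
    parents = {v: set() for v in dag}
    for v, operands in dag.items():
        for u in operands:
            parents[u].add(v)
    return parents

def live_count(done, parents):
    """Number of values still held in registers once `done` is evaluated."""
    # A value is live if it is a root or has a parent not yet evaluated.
    return sum(1 for v in done if not parents[v] or not parents[v] <= done)

def need_of_schedule(order, dag, parents):
    """Register need of one schedule, or None if operands come too late."""
    done, need = set(), 0
    for v in order:
        if any(u not in done for u in dag[v]):
            return None
        done.add(v)
        need = max(need, live_count(done, parents))
    return need

def min_need_bruteforce(dag):
    """Trivial enumeration over all n! node orders."""
    parents = parents_of(dag)
    needs = (need_of_schedule(p, dag, parents) for p in permutations(dag))
    return min(n for n in needs if n is not None)

def min_need_dp(dag):
    """Dynamic programming over sets of already evaluated nodes."""
    parents = parents_of(dag)

    @lru_cache(maxsize=None)
    def best(done):
        if not done:
            return 0
        here = live_count(done, parents)
        # A node can be the last one evaluated only if no parent is in `done`.
        last = [v for v in done if not (parents[v] & done)]
        return max(here, min(best(done - {v}) for v in last))

    return best(frozenset(dag))
```

Under this model the set of live values depends only on which nodes have been evaluated, not on their order, so schedules that reach the same set can share one table entry. That sharing is what replaces the factorial search space with one indexed by subsets.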
Scheduling Vector Straight Line Code on Vector Processors
Graham (Ed.): Code Generation - Concepts, Tools, Techniques. Springer Workshops in Computing Series (WICS)
, 1992
Cited by 2 (1 self)
We present an algorithm to schedule basic blocks of vector three-address instructions. This algorithm is suited for a special class of vector processors containing a buffer (register file) which may be partitioned arbitrarily into vector registers by the user. The algorithm computes the best ratio of vector register spilling to strip mining, taking the vector length and the buffer size into consideration, as well as several machine parameters of the target architecture. We apply the algorithm to groups of vector instructions within a basic block that are quasi-scalar, i.e. all vectors occurring in the group must have one fixed length L.
1 Introduction
For scalar processors, register allocation is widely accepted as one of the most important optimizations in compiler construction. In [1] Allen and Kennedy claim that register allocation is even more important on vector processors with vector registers. They argue that by an effective use of the vector registers, the system performance can ...
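Strip mining, one of the two alternatives weighed in this abstract, can be sketched in Python as follows. The function name `strip_mined_add` and the parameter `strip_len` (standing in for the length of one vector register) are assumptions for illustration; the paper's actual cost model and scheduling algorithm are not reproduced here.

```python
def strip_mined_add(x, y, strip_len):
    """Add two vectors of one fixed length L in strips of at most strip_len."""
    if len(x) != len(y):
        raise ValueError("quasi-scalar groups require one fixed vector length")
    if strip_len <= 0:
        raise ValueError("strip_len must be positive")
    result = []
    for start in range(0, len(x), strip_len):
        xs = x[start:start + strip_len]  # load one strip of x into a vector register
        ys = y[start:start + strip_len]  # load the matching strip of y
        result.extend(a + b for a, b in zip(xs, ys))  # one vector instruction per strip
    return result


print(strip_mined_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], strip_len=2))
```

Shorter strips let more vector registers fit into the buffer at the cost of more loop iterations, which is the trade-off against spilling that the abstract refers to.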
The SPARK 2.0 system - a special purpose vector processor with a VectorPASCAL compiler
Cited by 1 (1 self)
This paper describes the architecture of the Spark 2.0 processor and introduces a compiler for VectorPascal. Features of the architecture are the flexible address generation during vector operations and the large memories closely connected to the functional units. The source language allows programs to be written with vector statements, avoiding scalar inner loops. The compiler employs several optimizing strategies to utilize the architectural benefits efficiently.
1 Introduction
Many applications in the scientific world (e.g. molecular dynamics [17]) make extensive use of index-table-driven algorithms. The computational power often required by such applications can only be achieved by parallel systems with powerful node processors. The Spark 2.0 processor reaches a very high sustained/peak performance ratio on index-addressed vector operations and seems to be a good candidate for a node (The sustained/peak performance ratio on our molecular dynamics simulation program is five time...
Generating Optimal Contiguous Evaluations for Expression DAGs
Cited by 1 (1 self)
We consider the NP-complete problem of generating contiguous evaluations for expression DAGs with a minimal number of registers. We present two algorithms that generate an optimal contiguous evaluation for a given DAG. The first is a modification of a complete search algorithm that omits the generation of redundant evaluations. The second algorithm generates only the most promising evaluations by splitting the DAG into trees with import and export nodes and evaluating the trees with a modified labeling scheme. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithms generate optimal contiguous evaluations quite fast.
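As background for the labeling scheme mentioned above, the following Python sketch shows the classical Sethi-Ullman style labeling for binary expression trees. It is not the modified scheme of the paper. The tree encoding (a leaf is a string, an inner node is a tuple `(op, left, right)`) and the assumption that every leaf is loaded into a register are illustrative choices.

```python
def label(tree):
    """Registers needed to evaluate a binary expression tree without spilling."""
    if isinstance(tree, str):
        return 1  # assumption: each leaf is loaded into one register
    _, left, right = tree
    l, r = label(left), label(right)
    # Equal needs cost one extra register; otherwise the larger need suffices.
    return max(l, r) if l != r else l + 1


def evaluation_order(tree, out=None):
    """Contiguous evaluation that visits the subtree with the larger label first."""
    out = [] if out is None else out
    if isinstance(tree, str):
        out.append(tree)
        return out
    op, left, right = tree
    first, second = (left, right) if label(left) >= label(right) else (right, left)
    evaluation_order(first, out)
    evaluation_order(second, out)
    out.append(op)
    return out


expr = ("+", "a", ("*", "b", "c"))
print(label(expr), evaluation_order(expr))
```

For trees this labeling yields an evaluation with a minimal number of registers. Shared nodes in a DAG break the tree property, which is why the abstract splits the DAG into trees with import and export nodes first.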
Efficient Register Allocation for Large Basic Blocks
Let $v_1, \ldots, v_d$ be the decision nodes of the DAG, and
(3) let $\beta = (\beta_1, \ldots, \beta_d) \in \{0,1\}^d$ be a bitvector.
(4) forall $2^d$ different $\beta \in \{0,1\}^d$ do
(5)   start dfs(root) with each $\beta$, such that for $1 \le i \le d$
(6)     if $\beta_i = 0$ in the call dfs($v_i$),
(7)     then the left son of $v_i$ is evaluated first
(8)     else the right son of $v_i$ is evaluated first fi
    od
run time: $O(n 2^d)$, with $d \le n - 2$.
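The enumeration in steps (3) to (8) can be sketched in Python as below. The example DAG, the function name `contiguous_evaluations`, and the choice to treat every binary inner node as a decision node are assumptions for illustration, since the excerpt does not include the steps that define decision nodes.

```python
from itertools import product

# Hypothetical DAG: inner node -> (left son, right son); leaves have no entry.
SMALL_DAG = {"root": ("x", "y"), "x": ("a", "b"), "y": ("b", "c")}


def contiguous_evaluations(dag, root):
    """Run one depth-first traversal per bitvector beta over the decision nodes."""
    decision_nodes = list(dag)  # assumption: all binary inner nodes are decision nodes
    orders = []
    for beta in product((0, 1), repeat=len(decision_nodes)):
        choice = dict(zip(decision_nodes, beta))
        order, visited = [], set()

        def dfs(v):
            if v in visited:  # a shared node is evaluated only once
                return
            visited.add(v)
            if v in dag:
                left, right = dag[v]
                # beta_i = 0: left son first; beta_i = 1: right son first
                first, second = (left, right) if choice[v] == 0 else (right, left)
                dfs(first)
                dfs(second)
            order.append(v)

        dfs(root)
        orders.append(order)
    return orders


for order in contiguous_evaluations(SMALL_DAG, "root"):
    print(order)
```

Each traversal visits every node once, and there are $2^d$ bitvectors, which matches the stated run time of $O(n 2^d)$.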
Register Allocation for General Purpose Architectures
, 2000
Over the years, register allocation has been recognized as one of the vital problems to solve in compiler design. Putting critical data values and/or addresses into registers, which are small but very fast memories, is essential for achieving high-speed code. This document gives an overview of the most important work in the field, detailing some breakthrough approaches while briefly mentioning many results that were derived from them.
Optimal Contiguous Expression DAG Evaluations
Generating evaluations for expression DAGs with a minimal number of registers is NP-complete. We present two algorithms that generate an optimal contiguous evaluation for a given DAG. The first is a modification of a complete search algorithm that omits redundant evaluations. The second algorithm generates only the most promising evaluations by splitting the DAG into trees with import and export nodes and evaluating the trees with a modified labeling scheme. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithms generate optimal contiguous evaluations quite fast.