Linear Scan Register Allocation
- ACM Transactions on Programming Languages and Systems
, 1999
Cited by 108 (4 self)
... this article we use depth-first order. The choice of instruction ordering does not affect the correctness of the algorithm, but it may affect the quality of allocation. We discuss alternative orderings in Section 6.
Heterogeneous Process Migration: The Tui System
- Software Practice and Experience
, 1997
Cited by 64 (0 self)
Heterogeneous Process Migration is a technique whereby an active process is moved from one machine to another. It must then continue normal execution and communication. The source and destination processors can have a different architecture, that is, different instruction sets and data formats. Because of this heterogeneity, the entire process memory image must be translated during the migration. "Tui" is a migration system that is able to translate the memory image of a program (written in ANSI-C) between four common architectures (m68000, SPARC, i486 and PowerPC). This requires detailed knowledge of all data types and variables used with the program. This is not always possible in non-type-safe (but popular) languages such as ANSI-C, Pascal and Fortran. The important features of the Tui algorithm are discussed in great detail. This includes the method by which a program's entire set of data values can be located, and eventually reconstructed on the target processor. Perfo...
Proper Tail Recursion and Space Efficiency
, 1998
Cited by 53 (1 self)
The IEEE/ANSI standard for Scheme requires implementations to be properly tail recursive. This ensures that portable code can rely upon the space efficiency of continuation-passing style and other idioms. On its face, proper tail recursion concerns the efficiency of procedure calls that occur within a tail context. When examined closely, proper tail recursion also depends upon the fact that garbage collection can be asymptotically more space-efficient than Algol-like stack allocation. Proper tail recursion is not the same as ad hoc tail call optimization in stack-based languages. Proper tail recursion often precludes stack allocation of variables, but yields a well-defined asymptotic space complexity that can be relied upon by portable programs. This paper offers a formal and implementation-independent definition of proper tail recursion for Scheme. It also shows how an entire family of reference implementations can be used to characterize related safe-for-space properties, and proves ...
Fast Module Mapping and Placement for Datapaths in FPGAs
- In ACM/SIGDA International Symposium on Field Programmable Gate Arrays
, 1998
Cited by 40 (2 self)
By tailoring a compiler tree-parsing tool for datapath module mapping, we produce good-quality results for datapath synthesis in very fast run time. Rather than flattening the design to gates, we preserve the datapath structure; this allows exploitation of specialized datapath features in FPGAs, retains regularity, and also results in a smaller problem size. To further achieve high mapping speed, we formulate the problem as tree covering and solve it efficiently with a linear-time dynamic programming algorithm. In a novel extension to the tree-covering algorithm, we perform module placement simultaneously with the mapping, still in linear time. Integrating placement has the potential to increase the quality of the result since we can optimize total delay including routing delays. To our knowledge this is the first effort to leverage a grammar-based tree covering tool for datapath module mapping. Further, it is the first work to integrate simultaneous placement with module mapping in a w...
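The tree-covering formulation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's tool: the module library `PATTERNS`, its costs, and the `Node` type are hypothetical, and the sketch covers mapping only, with no placement. Each node is solved once and the library has a fixed number of bounded-size patterns, so the work grows linearly with the size of the tree.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    kids: list = field(default_factory=list)

# Hypothetical module library: (module name, pattern, cost).
# A pattern is a nested tuple (op, child_pattern, ...); None matches any subtree.
PATTERNS = [
    ("ADD", ("add", None, None), 2),
    ("MUL", ("mul", None, None), 5),
    ("MAC", ("add", ("mul", None, None), None), 6),  # fused multiply-add module
    ("REG", ("reg",), 0),
]

def match(pattern, node, leaves):
    """Return True if pattern matches at node; collect the uncovered subtrees in leaves."""
    if pattern is None:
        leaves.append(node)
        return True
    if pattern[0] != node.op or len(pattern) - 1 != len(node.kids):
        return False
    return all(match(p, k, leaves) for p, k in zip(pattern[1:], node.kids))

def cover(node, memo):
    """Bottom-up dynamic programming: cheapest cover of the subtree rooted at node."""
    key = id(node)
    if key in memo:
        return memo[key]
    best = None
    for name, pattern, cost in PATTERNS:
        leaves = []
        if not match(pattern, node, leaves):
            continue
        total, chosen = cost, [name]
        for leaf in leaves:
            sub_cost, sub_names = cover(leaf, memo)
            total += sub_cost
            chosen = chosen + sub_names
        if best is None or total < best[0]:
            best = (total, chosen)
    if best is None:
        raise ValueError(f"no module covers operator {node.op!r}")
    memo[key] = best
    return best

# a*b + c: the fused MAC module (cost 6) beats separate MUL + ADD (cost 7).
tree = Node("add", [Node("mul", [Node("reg"), Node("reg")]), Node("reg")])
cost, modules = cover(tree, {})
```

In this example the cover chooses the fused module because its cost is lower than the sum of the two separate modules; changing the hypothetical costs changes the choice.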
Refining Data Flow Information using Infeasible Paths
, 1997
Cited by 33 (6 self)
Experimental evidence indicates that large programs exhibit a significant amount of branch correlation amenable to compile-time detection. Branch correlation gives rise to infeasible paths, which in turn make data flow information overly conservative. For example, def-use pairs that always span infeasible paths cannot be tested by any program input, preventing 100% def-use testing coverage. We present an algorithm for identifying infeasible program paths and a data flow analysis technique that improves the precision of traditional def-use pair analysis by incorporating the information about infeasible paths into the analysis. Infeasible paths are computed using branch correlation analysis, which can be performed either intra- or inter-procedurally. The efficiency of our technique is achieved through demand-driven formulation of both the infeasible paths detection and the def-use pair analysis. Our experiments indicate that even when a simple form of intraprocedural branch correlation i...
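How branch correlation removes def-use pairs can be seen on a toy example. The sketch below is a brute-force illustration in Python, not the demand-driven analysis the abstract describes: it assumes a hypothetical program in which two consecutive branches test the same unmodified predicate, and the labels in `DEFS` and `USES` are invented for the example.

```python
from itertools import product

# Hypothetical program shape:
#   if p: x = 1      else: x = 2       (first branch defines x)
#   if p: print(x)   else: y = x       (second branch uses x)
# p is not modified between the branches, so both must go the same way.
DEFS = {True: "d1: x = 1", False: "d2: x = 2"}
USES = {True: "u1: print(x)", False: "u2: y = x"}

def def_use_pairs(feasible):
    """Collect (definition, use) pairs over all paths accepted by feasible."""
    pairs = set()
    for first, second in product([True, False], repeat=2):
        if feasible(first, second):
            pairs.add((DEFS[first], USES[second]))
    return pairs

# Traditional analysis treats every path as executable.
all_pairs = def_use_pairs(lambda first, second: True)

# With branch correlation, paths where the two outcomes differ are infeasible.
refined = def_use_pairs(lambda first, second: first == second)

# These pairs span only infeasible paths and can never be exercised by a test.
untestable = all_pairs - refined
```

Here the conservative analysis reports four pairs, of which two lie only on infeasible paths; a coverage target computed from the refined set is actually reachable.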
Sequencing Run-Time Reconfigured Hardware with Software
- ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
, 1996
Cited by 31 (1 self)
Run-Time Reconfigured systems offer additional hardware resources to systems based on reconfigurable FPGAs. These systems, however, are often difficult to build and must tolerate substantial reconfiguration times. A processor-based architecture has been built to simplify the development of these systems by providing programmable control of hardware sequencing while retaining the performance of hardware. Configuration overhead of this system is reduced by "caching" hardware on the reconfigurable resource. An image processing application was developed on this system to demonstrate both the performance improvements of custom hardware and the ease of software development.
1 Introduction
The high bandwidth of data and computational load of digital signal processing algorithms generally overwhelm even the highest-performance general-purpose processors. Achieving real-time execution rates typically requires custom hardware. SRAM-based Field-Programmable Gate Arrays (FPGAs) are often used to ...
Source-Level Debugging of Scalar Optimized Code
- SIGPLAN Notices
, 1996
Cited by 25 (2 self)
Although compiler optimizations play a crucial role in the performance of modern computer systems, debugger technology has lagged behind in its support of optimizations. Yet debugging the unoptimized translation is often impossible or futile, so handling of code optimizations in the debugger is necessary. But compiler optimizations make it difficult to provide source-level debugger functionality: global optimizations can cause the runtime value of a variable to be inconsistent with the source-level value expected at a breakpoint; such variables are called endangered variables. A debugger must detect and warn the user of endangered variables; otherwise the user may draw incorrect conclusions about the program. This paper presents a new algorithm for detecting variables that are endangered due to global scalar optimizations. Our approach provides more precise classifications of variables and is still simpler than past approaches. We have implemented and evaluated our techniques in the con...
Comparison Checking: An Approach to Avoid Debugging of Optimized Code
- Proceedings of Foundation of Software Engineering
, 1999
Cited by 18 (5 self)
We present a novel approach to avoid the debugging of optimized code through comparison checking. In the technique presented, both the unoptimized and optimized versions of an application program are executed, and computed values are compared to ensure the behaviors of the two versions are the same under the given input. If the values are different, the comparison checker displays where in the application program the differences occurred and what optimizations were involved. The user can utilize this information and a conventional debugger to determine if an error is in the unoptimized code. If the error is in the optimized code, the user can turn off those offending optimizations and leave the other optimizations in place. We implemented our comparison checking scheme, which executes the unoptimized and optimized versions of C programs, and ran experiments that demonstrate the approach is effective and practical.
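The idea of comparison checking can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system from the abstract, which works on compiled C programs: the functions `unoptimized` and `optimized`, the trace format, and the deliberately planted bug are all hypothetical.

```python
def unoptimized(xs, trace):
    """Reference version: sum of squares, recording each running total."""
    total = 0
    for i, x in enumerate(xs):
        square = x * x
        total = total + square
        trace.append(("total", i, total))
    return total

def optimized(xs, trace):
    """Stand-in for an optimized version; deliberately wrong for negative inputs."""
    total = 0
    for i, x in enumerate(xs):
        total += x * abs(x)
        trace.append(("total", i, total))
    return total

def compare(reference, candidate, inputs):
    """Run both versions on the same input and return the first differing values.

    Returns None when every recorded value agrees. For brevity the sketch
    assumes both versions record the same number of values.
    """
    ref_trace, cand_trace = [], []
    reference(inputs, ref_trace)
    candidate(inputs, cand_trace)
    for expected, actual in zip(ref_trace, cand_trace):
        if expected != actual:
            return expected, actual
    return None

agree = compare(unoptimized, optimized, [1, 2, 3])
mismatch = compare(unoptimized, optimized, [1, -2])
```

On the second input the checker reports the variable and loop iteration at which the two versions first diverge, which is the kind of location information the abstract says is shown to the user.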
Reconfigurable computing: architectures and design methods
- IEE Proceedings - Computers and Digital Techniques
, 2005
Cited by 17 (1 self)
Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Altera Stratix II and Xilinx Virtex 4 FPGA devices. The authors identify major trends in general-purpose and special-purpose
A Retargetable C Compiler: Design and Implementation
- Benjamin/Cummings Publishing
, 1995
Cited by 16 (1 self)
... language design research not only because it shares many characteristics with Java, the current language of choice for such research, but also because it’s likely to see wide use. Language research needs a large investment in infrastructure, even for relatively small studies. This paper describes a new C# compiler designed specifically to provide that infrastructure. The overall design is deceptively simple. The parser is generated automatically from a possibly ambiguous grammar, accepts C# source, perhaps with new features, and produces an abstract syntax tree, or AST. Subsequent phases, dubbed visitors, traverse the AST, perhaps modifying it, annotating it or emitting output, and pass it along to the next visitor. Visitors are specified entirely at compilation time and are loaded dynamically as needed. There is no fixed set of visitors, and visitors are completely unconstrained. Some visitors perform traditional compilation phases, but the more interesting ones do code analysis, emit non-traditional data such as XML, and display data structures for debugging. Indeed, most usage to date has been for tools, not for language design experiments. Such experiments use source-to-source transformations or extend existing visitors to handle new language features. These approaches are illustrated by adding a statement that switches on a type instead of a value, which can be implemented in a few hundred lines. The compiler also exemplifies the value of dynamic loading and of type reflection.
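The visitor pipeline described in this abstract, in which each phase receives the AST and passes it along to the next, can be sketched in Python. Everything here is hypothetical and chosen for illustration: the `Ast` type, the node kinds, and the two visitors are not taken from the compiler in question, and dynamic loading is left out.

```python
from dataclasses import dataclass, field

@dataclass
class Ast:
    kind: str                  # "lit", "var", or "add" (hypothetical node kinds)
    value: object = None
    kids: list = field(default_factory=list)

def fold_constants(node):
    """Visitor that modifies the tree: folds an add whose operands are all literals."""
    node.kids = [fold_constants(k) for k in node.kids]
    if node.kind == "add" and node.kids and all(k.kind == "lit" for k in node.kids):
        return Ast("lit", sum(k.value for k in node.kids))
    return node

def show(node):
    """Render a tree as an s-expression string."""
    if node.kind in ("lit", "var"):
        return str(node.value)
    return "(" + " ".join([node.kind] + [show(k) for k in node.kids]) + ")"

def make_emitter(out):
    """Build a visitor that emits output into out and leaves the tree unchanged."""
    def emit(node):
        out.append(show(node))
        return node
    return emit

def run_visitors(ast, visitors):
    """Apply each visitor in order, passing its result to the next one."""
    for visit in visitors:
        ast = visit(ast)
    return ast

# x + (2 + 3): the folding visitor rewrites the tree, then the emitter prints it.
out = []
tree = Ast("add", kids=[Ast("var", "x"), Ast("add", kids=[Ast("lit", 2), Ast("lit", 3)])])
run_visitors(tree, [fold_constants, make_emitter(out)])
```

Because the list of visitors is just data, a tool can run analysis-only or output-only phases by changing the list, which mirrors the abstract's point that there is no fixed set of visitors.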

