Memory Forwarding: Enabling Aggressive Layout Optimizations by Guaranteeing the Safety of Data Relocation
In Proceedings of the 26th International Symposium on Computer Architecture, 1999
Cited by 27 (0 self)
By optimizing data layout at run-time, we can potentially enhance the performance of caches by actively creating spatial locality, facilitating prefetching, and avoiding cache conflicts and false sharing. Unfortunately, it is extremely difficult to guarantee that such optimizations are safe in practice on today's machines, since accurately updating all pointers to an object requires perfect alias information, which is well beyond the scope of the compiler for languages such as C. To overcome this limitation, we propose a technique called memory forwarding which effectively adds a new layer of indirection within the memory system whenever necessary to guarantee that data relocation is always safe. Because actual forwarding rarely occurs (it exists as a safety net), the mechanism can be implemented as an exception in modern superscalar processors. Our experimental results demonstrate that the aggressive layout optimizations enabled by memory forwarding can result in significant speedups--...
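The forwarding idea in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the class `ForwardingMemory`, its per-word `forwarded` flag, and the method names are hypothetical and are not taken from the paper, which describes a hardware mechanism. The model assumes the source and destination regions do not overlap.

```python
class ForwardingMemory:
    """Toy word-addressed memory; each word has a (hypothetical) forwarding flag."""

    def __init__(self, size):
        self.words = [0] * size
        self.forwarded = [False] * size

    def _resolve(self, addr):
        # Follow forwarding addresses until a word holding real data is reached.
        while self.forwarded[addr]:
            addr = self.words[addr]
        return addr

    def load(self, addr):
        return self.words[self._resolve(addr)]

    def store(self, addr, value):
        self.words[self._resolve(addr)] = value

    def relocate(self, old, new, length):
        """Copy `length` words from `old` to `new` and leave forwarding
        addresses behind, so stale pointers to the old location stay valid."""
        for i in range(length):
            src = self._resolve(old + i)
            self.words[new + i] = self.words[src]
            self.forwarded[new + i] = False
            self.words[src] = new + i      # old word now holds the new address
            self.forwarded[src] = True


mem = ForwardingMemory(16)
mem.store(2, 10)
mem.store(3, 20)
stale_pointer = 2            # a pointer the optimizer failed to update
mem.relocate(2, 8, 2)        # move the two-word object to address 8
print(mem.load(stale_pointer), mem.load(8))
```

In this model a load through an updated pointer goes straight to the new location, while a load through a stale pointer pays for one extra hop. That mirrors the abstract's point that forwarding acts as a rarely used safety net.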
Unrolling Lists
1994
Cited by 26 (1 self)
Lists are ubiquitous in functional programs, thus supporting lists efficiently is a major concern to compiler writers for functional languages. Lists are normally represented as linked cons cells, with each cons cell containing a car (the data) and a cdr (the link); this is inefficient in the use of space, because 50% of the storage is used for links. Loops and recursions on lists are slow on modern machines because of the long chains of control dependences (in checking for nil) and data dependences (in fetching cdr fields). We present a data structure for "unrolled lists," where each cell has several data items (car fields) and one link (cdr). This reduces the memory used for links, and it significantly shortens the length of control-dependence and data-dependence chains in operations on lists. We further present an efficient compile-time analysis that transforms programs written for "ordinary" lists into programs on unrolled lists. The use of our new representation requires no change...
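The unrolled-list representation described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `UnrolledCell`, `from_iterable`, `total`, and `count_cells`, and the unrolling factor `k=4`, are assumptions chosen for the example.

```python
class UnrolledCell:
    """One cell holding several data items (car fields) and a single link (cdr)."""

    def __init__(self, items, next_cell=None):
        self.items = items      # up to k data items
        self.next = next_cell   # one link per cell instead of one per item


def from_iterable(values, k=4):
    """Build an unrolled list with at most k items per cell; None is the empty list."""
    values = list(values)
    head = None
    # Build back to front so each new cell can link to the one after it.
    for start in range(((len(values) - 1) // k) * k, -1, -k):
        head = UnrolledCell(values[start:start + k], head)
    return head


def total(cell):
    """Sum all items: one nil check and one link fetch per cell, not per item."""
    acc = 0
    while cell is not None:
        for x in cell.items:
            acc += x
        cell = cell.next
    return acc


def count_cells(cell):
    n = 0
    while cell is not None:
        n += 1
        cell = cell.next
    return n


lst = from_iterable(range(10), k=4)
print(total(lst), count_cells(lst))
```

With ten items and four items per cell, the traversal follows three links instead of ten, which is the shortening of the dependence chains that the abstract refers to.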
Compiling Standard ML For Efficient Execution On Modern Machines
1994
Cited by 18 (3 self)
Many language theoreticians have taken great efforts in designing higher-level programming languages that are more elegant and more expressive than conventional languages. However, few of these new languages have been implemented very efficiently. The result is that most software engineers still prefer to use conventional languages, even though the new higher-level languages offer a better and simpler programming model. This dissertation concentrates on improving the performance of programs written in Standard ML (SML)---a statically typed functional language---on today's RISC machines. SML poses tough challenges to efficient implementations: very frequent function calls, polymorphic types, recursive data structures, higher-order functions, and first-class continuations. This dissertation presents the design and evaluation of several new compilation techniques that meet these challenges by taking advantage of some of the higher-level language features in SML. Type-directed compilation ...
Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays
In Implementation of Functional Languages, 14th International Workshop, 2002
Cited by 7 (0 self)
This paper introduces a new data structure, the VList, that is compact, thread safe and significantly faster to use than Linked Lists for nearly all list operations. Space usage can be reduced by 50% to 90% and in typical list operations speed improved by factors ranging from 4 to 20 or more. Some important operations such as indexing and length are typically changed from O(N) to O(1) and O(lgN) respectively. A language interpreter Visp, using a dialect of Common Lisp, has been implemented using VLists and the benchmark comparison with OCAML is reported. It is also shown how to adapt the structure to create variable length arrays, persistent deques and functional hash tables. The VArray requires no resize copying and has an average O(1) random access time. Comparisons are made with previous resizable one dimensional arrays: Hash Array Trees (HAT), Sitarski [1996], and Brodnik, Carlsson, Demaine, Munro, and Sedgewick [1999].
Using Compact Data Representations for Languages Based on Catamorphisms
1995
We describe a new method for improving the performance of functional programs based on catamorphisms. The method relies on using a compact vector representation for the recursive structure over which the catamorphism operates. This saves space and allows catamorphisms to be implemented in tail-recursive fashion even in cases where the standard linked structure representation requires non-tail-recursive evaluation. Preliminary experimental measurements show substantial improvements are possible with our approach.
Keywords: program transformation, compilation methods, data representations, catamorphisms.
1 Introduction
Most functional languages provide higher-order library functions that capture common computation patterns over recursive data structures. These operators allow algorithms to be expressed at a higher level of abstraction than explicitly recursive programs that manipulate the data structure "one piece at a time." Perhaps the most useful of these operators is the catamorph...
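The contrast drawn in the abstract above, between folding over a linked structure and folding over a compact vector, can be sketched for the list catamorphism (a right fold). This is an illustrative sketch only: a Python list stands in for the compact vector representation, nested `(head, tail)` tuples stand in for cons cells, and the function names are hypothetical rather than taken from the paper.

```python
def foldr_linked(cons_fn, nil_value, cell):
    """Right fold over a linked list of (head, tail) pairs, with None as nil.
    The recursive call is not in tail position: cons_fn runs after it returns."""
    if cell is None:
        return nil_value
    head, tail = cell
    return cons_fn(head, foldr_linked(cons_fn, nil_value, tail))


def foldr_vector(cons_fn, nil_value, vec):
    """The same fold over a vector: walking from the last element to the first
    turns the computation into a loop with an accumulator and no pending calls."""
    acc = nil_value
    for x in reversed(vec):
        acc = cons_fn(x, acc)
    return acc


linked = (1, (2, (3, None)))
vector = [1, 2, 3]
subtract = lambda x, acc: x - acc
print(foldr_linked(subtract, 0, linked), foldr_vector(subtract, 0, vector))
```

Both calls compute 1 - (2 - (3 - 0)). The vector version needs no stack growth because the elements can be visited in reverse order directly, which a singly linked structure does not allow.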
unknown title
Security concerns on embedded devices like cellular phones make Java an extremely attractive technology for providing third-party and user-downloadable functionality. However, garbage collectors have typically required several times the maximum live data set size (which is the minimum possible heap size) in order to run well. In addition, the size of the virtual machine (ROM) image and the size of the collector's data structures (metadata) have not been a concern for server- or workstation-oriented collectors. We have implemented two different collectors specifically designed to operate well on small embedded devices. We have also developed a number of algorithmic improvements and compression techniques that allow us to eliminate almost all of the per-object overhead that the virtual machine and the garbage collector require. We describe these optimizations and present measurements of the Java embedded benchmarks (EEMBC) of our implementations on both an IA32 laptop and an ARM-based PDA.

