Results 1 - 10
of
70
Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions
- PLDI 2000
, 2000
"... This paper presents a novel framework for the symbolic bounds analysis of pointers, array indices, and accessed memory regions. Our framework formulates each analysis problem as a system of inequality constraints between symbolic bound polynomials. It then reduces the constraint system to a linear p ..."
Abstract
-
Cited by 100 (14 self)
- Add to MetaCart
This paper presents a novel framework for the symbolic bounds analysis of pointers, array indices, and accessed memory regions. Our framework formulates each analysis problem as a system of inequality constraints between symbolic bound polynomials. It then reduces the constraint system to a linear program. The solution to the linear program provides symbolic lower and upper bounds for the values of pointer and array index variables and for the regions of memory that each statement and procedure accesses. This approach eliminates fundamental problems associated with applying standard xed-point approaches to symbolic analysis problems. Experimental results from our implemented compiler show that the analysis can solve several important problems, including static race detection, automatic parallelization, static detection of array bounds violations, elimination of array bounds checks, and reduction of the number of bits used to store computed values.
Region-Based Shape Analysis with Tracked Locations
- POPL '05
, 2005
"... This paper proposes a novel approach to shape analysis: using local reasoning about individual heap locations instead of global reasoning about entire heap abstractions. We present an inter-procedural shape analysis algorithm for languages with destructive updates. The key feature is a novel memory ..."
Abstract
-
Cited by 71 (1 self)
- Add to MetaCart
This paper proposes a novel approach to shape analysis: using local reasoning about individual heap locations instead of global reasoning about entire heap abstractions. We present an inter-procedural shape analysis algorithm for languages with destructive updates. The key feature is a novel memory abstraction that differs from traditional abstractions in two ways. First, we build the shape abstraction and analysis on top of a pointer analysis. Second, we decompose the shape abstraction into a set of independent configurations, each of which characterizes one single heap location. Our approach: 1) leads to simpler algorithm specifications, because of local reasoning about the single location; 2) leads to efficient algorithms, because of the smaller granularity of the abstraction; and 3) makes it easier to develop context-sensitive, demand-driven, and incremental shape analyses. We also show that the analysis can be used to enable the static detection of memory errors in programs with explicit deallocation. We have built a prototype tool that detects memory leaks and accesses through dangling pointers in C programs. The experiments indicate that the analysis is sufficiently precise to detect errors with low false positive rates; and is sufficiently lightweight to scale to larger programs. For a set of three popular C programs, the tool has analyzed about 70K lines of code in less than 2 minutes and has produced 97 warnings, 38 of which were actual errors.
Dynamic Management of Scratch-Pad Memory Space
- In Design Automation Conference
, 2001
"... Optimizations aimed at improving the efficiency of on-chip memories are extremely important. We propose a compiler-controlled dynamic on-chip scratch-pad memory (SPM) management framework that uses both loop and data transformations. Experimental results obtained using a generic cost model indicate ..."
Abstract
-
Cited by 58 (5 self)
- Add to MetaCart
Optimizations aimed at improving the efficiency of on-chip memories are extremely important. We propose a compiler-controlled dynamic on-chip scratch-pad memory (SPM) management framework that uses both loop and data transformations. Experimental results obtained using a generic cost model indicate significant reductions in data transfer activity between SPM and off-chip memory.
A Core Library For Robust Numeric and Geometric Computation
- In 15th ACM Symp. on Computational Geometry
, 1999
"... Nonrobustness is a well-known problem in many areas of computational science. Until now, robustness techniques and the construction of robust algorithms have been the province of experts in this field of research. We describe a new C/C++ library (Core) for robust numeric and geometric computation ba ..."
Abstract
-
Cited by 56 (8 self)
- Add to MetaCart
Nonrobustness is a well-known problem in many areas of computational science. Until now, robustness techniques and the construction of robust algorithms have been the province of experts in this field of research. We describe a new C/C++ library (Core) for robust numeric and geometric computation based on the principles of Exact Geometric Computation (EGC). Through our library, for the first time, any programmer can write robust and efficient algorithms. The Core Library is based on a novel numerical core that is powerful enough to support EGC for algebraic problems. This is coupled with a simple delivery mechanism which transparently extends conventional C/C++ programs into robust codes. We are currently addressing efficiency issues in our library: (a) at the compiler and language level, (b) at the level of incorporating EGC techniques, as well as the (c) the system integration of both (a) and (b). Pilot experimental results are described. The basic library is available at http://cs.nyu.edu...
Automatic Parallelization of Divide and Conquer Algorithms
- In Proceedings of the 7th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, 1999
"... Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory hierarchies. But these algorithms pose challenging problems for parallelizing compilers. They are usually coded as recur ..."
Abstract
-
Cited by 47 (7 self)
- Add to MetaCart
Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory hierarchies. But these algorithms pose challenging problems for parallelizing compilers. They are usually coded as recursive procedures and often use pointers into dynamically allocated memory blocks and pointer arithmetic. All of these features are incompatible with the analysis algorithms in traditional parallelizing compilers. This paper presents the design and implementation of a compiler that is designed to parallelize divide and conquer algorithms whose subproblems access disjoint regions of dynamically allocated arrays. The foundation of the compiler is a flow-sensitive, context-sensitive, and interprocedural pointer analysis algorithm. A range of symbolic analysis algorithms build on the pointer analysis information to extract symbolic bounds for the memory regions accessed by (potentially recursive) procedures that use pointers and pointer arithmetic. The symbolic bounds information allows the compiler to find procedure calls that can execute in parallel without violating the data dependences. The compiler generates code that executes these calls in parallel. We have used the compiler to parallelize several programs that use divide and conquer algorithms. Our results show that the programs perform well and exhibit good speedup. 1
OpenMP for Networks of SMPs
"... In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between multiprocessors. It is implemented via a transl ..."
Abstract
-
Cited by 45 (0 self)
- Add to MetaCart
In this paper, we present the first system that implements OpenMP on a network of shared-memory multiprocessors. This system enables the programmer to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between multiprocessors. It is implemented via a translator that converts OpenMP directives to appropriate calls to a modified version of the TreadMarks software distributed memory system (SDSM). In contrast to previous SDSM systems for SMPs, the modified TreadMarks uses POSIX threads for parallelism within an SMP node. This approach greatly simplifies the changes required to the SDSM in order to exploit the intra-node hardware shared memory. We present performance results for six applications (SPLASH-2 Barnes-Hut and Water, NAS 3D-FFT, SOR, TSP and MGS) running on an SP2 with four four-processor SMP nodes. A comparison between the threaded implementation and the original implementation of TreadMarks shows that using the hardware shared memory within an SMP node significantly reduces the amount of data and the number of messages transmitted between nodes, and consequently achieves speedups up to 30 % better than the original versions. We also compare SDSM against message passing. Overall, the speedups of multithreaded TreadMarks programs are within 7–30 % of the MPI versions.
DRAM Energy Management Using Software and Hardware Directed Power Mode Control
- IN PROC. THE 7TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTER ARCHITECTURE
, 2001
"... While there have been several studies and proposals for energy conservation for CPUs and peripherals, energy optimization techniques for selective operating mode control of DRAMs have not been fully explored. It has been shown that as much as 90% of overall system energy (excluding I/O) is consumed ..."
Abstract
-
Cited by 44 (10 self)
- Add to MetaCart
While there have been several studies and proposals for energy conservation for CPUs and peripherals, energy optimization techniques for selective operating mode control of DRAMs have not been fully explored. It has been shown that as much as 90% of overall system energy (excluding I/O) is consumed by the DRAM modules, serving as a good candidate for energy optimizations. Further, DRAM technology has also matured to provide several low energy operating modes (power modes), making it an opportunistic moment to conduct studies exploring the potential benefits of mode control techniques. This paper conducts an in-depth investigation of software and hardware techniques to avail of the DRAM mode control capabilities at a module granularity for energy savings.
Specialization tools and techniques for systematic optimization of system software
- ACM Transactions on Computer Systems
, 2001
"... Specialization has been recognized as a powerful technique for optimizing operating systems. However, specialization has not been broadly applied beyond the research community because current techniques, based on manual specialization, are time-consuming and error-prone. The goal of the work describ ..."
Abstract
-
Cited by 38 (13 self)
- Add to MetaCart
Specialization has been recognized as a powerful technique for optimizing operating systems. However, specialization has not been broadly applied beyond the research community because current techniques, based on manual specialization, are time-consuming and error-prone. The goal of the work described in this paper is to help operating system tuners perform specialization more easily. We have built a specialization toolkit that assists the major tasks of specializing operating systems. We demonstrate the effectiveness of the toolkit by applying it to three diverse operating system components. We show that using tools to assist specialization enables significant performance optimizations without errorprone manual modifications. Our experience with the toolkit suggests new ways of designing systems that combine high performance and clean structure. 1
Compiler-Directed Scratch Pad Memory Hierarchy Design and Management
- In DAC ’02: Proceedings of the 39th conference on Design automation
, 2002
"... One of the primary challenges in embedded system design is designing the memory hierarchy and restructuring the application to take advantage of it. This task is particularly important for embedded image and video processing applications that make heavy use of large multidimensional arrays of signal ..."
Abstract
-
Cited by 34 (1 self)
- Add to MetaCart
One of the primary challenges in embedded system design is designing the memory hierarchy and restructuring the application to take advantage of it. This task is particularly important for embedded image and video processing applications that make heavy use of large multidimensional arrays of signals and nested loops. In this paper, we show that a simple reuse vector/matrix abstraction can provide compiler with useful information in a concise form. Using this information, compiler can either adapt application to an existing memory hierarchy or can come up with a memory hierarchy. Our initial results indicate that the compiler is very successful in both optimizing code for a given memory hierarchy and designing a hierarchy with reasonable performance /size ratio.
Credible Compilation
- In Proceedings of CC 2001: International Conference on Compiler Construction
, 1999
"... This paper presents an approach to compiler correctness in which the compiler generates a proof that the transformed program correctly implements the input program. A simple proof checker can then verify that the program was compiled correctly. We call a compiler that produces such proofs a credible ..."
Abstract
-
Cited by 32 (7 self)
- Add to MetaCart
This paper presents an approach to compiler correctness in which the compiler generates a proof that the transformed program correctly implements the input program. A simple proof checker can then verify that the program was compiled correctly. We call a compiler that produces such proofs a credible compiler, because it produces verifiable evidence that it is operating correctly.

