Success and limitations in automatic parallelization of the Perfect Benchmarks™ programs (1992)

by William Joseph Blume

Results 1 - 7 of 7

Optimization within a Unified Transformation Framework

by Wayne Anthony Kelly, 1996
Cited by 29 (0 self)
Abstract not found

Symbolic Analysis Techniques For Effective Automatic Parallelization

by William Joseph Blume, 1995
Cited by 20 (1 self)
Abstract not found

Evaluation Of Programs And Parallelizing Compilers Using Dynamic Analysis Techniques

by Paul Marx Petersen, Ph.D., 1993
Cited by 15 (1 self)
... results for an unlimited number of processors. Upper and lower bounds on the inherent parallelism, for the case of limited processors, can be derived from the processor activity histogram, which records the number of concurrent operations during each time period. Stress analysis is a derivative of critical path analysis that determines the locations in a program that contribute most to the critical path. Inductions are computations that introduce an internal stress. A specific method is presented that measures how removing the serializing effects of inductions changes the inherent parallelism. Dependence analysis is crucial to the effective operation of parallelizing compilers. Static and dynamic evaluations of the effectiveness of compile-time data dependence analysis are presented; the evaluation compares the existing techniques against each other and against the theoretical optimal results. Special attention is paid to the dependences which serialize interproce

Reducing The Impact Of Register Pressure On Software Pipelined Loops

by Josep Llosa, Margarita Espuny, 1996
Cited by 12 (8 self)
This work deals with the problems caused by the high register requirements of software pipelined loops. The main contributions of this work are:
* Register requirements of software pipelined loops are evaluated.
* Several heuristics to perform register-constrained software pipelining are proposed.
* The effects of register requirements on performance under register constraints are evaluated.
* HRMS is proposed to perform software pipelining with resource constraints and reduced register requirements.
* Two new register file organizations are proposed to allow for a large number of registers with low area cost and fast access time.

Global Value Propagation Through Value Flow Graph and Its Use in Dependence Analysis

by Vadim Maslov
As recent studies show, state-of-the-art parallelizing compilers produce no noticeable speedup for 9 out of 12 PERFECT benchmark codes, while the speedup that was reached by manually applying certain automatable techniques ranges from 10 to 50. In this paper we introduce the Global Value Propagation algorithm that unifies several of these techniques. Global propagation is performed using a program abstraction called the Value Flow Graph (VFG). The VFG is an acyclic graph in which vertices and arcs are parametrically specified using F-relations. The distinctive features of our propagation algorithm are: (1) It propagates not only values carried by scalar variables, but also values carried by individual array elements. (2) We do not have to transform a program in order to use propagation results in program analysis. In this paper we focus on the use of the VFG and global value propagation in array dataflow analysis. F-relations are used to represent values produced by uninterpreted function symbols th...

Enhancing Array Dataflow Dependence Analysis with On-Demand Global Value Propagation

by Vadim Maslov - In Proc. International Conference on Supercomputing, 1995
As recent studies show, state-of-the-art parallelizing compilers produce no noticeable speedup for 9 out of 12 PERFECT benchmark codes, while the speedup that was reached by manually applying certain automatable constraint propagation techniques ranges from 10 to 50 times. In this paper we show how a subset of these much-desired techniques can be automated. We describe an algorithm that is a combination of exact array dataflow dependence analysis and on-demand global value propagation. By propagating values to the references that make the dependence problem non-affine, the algorithm can in many cases affinize the dependence problem. Affine dependence problems result in exact dependence information and therefore lead to new opportunities in propagation. We also present three algorithms for global value propagation and discuss their merits and applications. The propagation is performed on the acyclic parametrized value flow graph of the program represented by F-relations (also in...

Memory Latency Reduction via Data Prefetching and Data Forwarding in Shared Memory Multiprocessors

by David Kristian Poulsen, Ph.D., 1994
This dissertation considers the use of data prefetching and an alternative mechanism, data forwarding, for reducing memory latency due to interprocessor communication in cache coherent, shared memory multiprocessors. The benefits of prefetching and forwarding are considered for large, numerical application codes with loop-level and vector parallelism. Data prefetching is applied to these applications using two different multiprocessor prefetching algorithms implemented within a parallelizing compiler. Data forwarding considers array references involved in communication-related accesses between successive parallel loops, rather than within a single loop nest. A hybrid prefetching and forwarding scheme and a compiler algorithm for data forwarding are also presented.