
Machine-description driven compilers for EPIC and VLIW processors. Design Automation for Embedded Systems (1999)

by B R Rau, V Kathail, S Aditya

Results 11 - 13 of 13

Register Files Constraint Satisfaction during Scheduling of DSP Code

by Carlos A. Alba Pinto, Bart Mesman, Koen Van Eijk - In Symposium on Integrated Circuits and Systems Design, 1999
Cited by 2 (1 self)
Algorithms in digital signal processing (DSP) impose tight timing constraints that the compiler has to respect while considering the limited capacity of the available register files in a target DSP processor. Traditional code generation methods that schedule spill code to satisfy storage capacity may take many iterations and are usually not capable of satisfying the timing constraints. In this paper we present a new method to handle register file capacity constraints during scheduling. The method identifies potential bottlenecks for register binding and subsequently serializes the lifetimes of values until it can be guaranteed that all capacity constraints will be satisfied after scheduling. Experiments show that we efficiently obtain high quality instruction schedules for DSP kernels.

1. Introduction
Embedded digital signal processors offer good performance for application domains such as communication and multimedia. There are roughly two communities that attempt to design these pr...

Cluster assignment for high-performance embedded VLIW processors

by Viktor S. Lapinskii, Margarida F. Jacome, Gustavo A. De Veciana - ACM Trans. Des. Autom. Electron. Syst., 2002
Cited by 2 (0 self)
Clustering is an effective method to increase the available parallelism in VLIW datapaths without incurring severe penalties associated with a large number of register file ports. Efficient utilization of a clustered datapath requires careful binding/assignment of operations to clusters. The article proposes a binding algorithm that effectively explores trade-offs between in-cluster operation serialization and delays associated with data transfers between clusters. Extensive experimental evidence is provided showing that the algorithm generates high quality solutions for representative kernels, with up to 33% improvement over a state-of-the-art binding algorithm.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors—Code generation;

High-Level Synthesis of Nonprogrammable Hardware Accelerators

by B. Ramakrishna Rau, Vinod Kathail, Robert Schreiber, Shail Aditya, Scott Mahlke, Santosh Abraham, Greg Snider - Journal of VLSI Signal Processing, 2000
Cited by 1 (0 self)
The PICO-N system automatically synthesizes embedded nonprogrammable accelerators to be used as co-processors for functions expressed as loop nests in C. The output is synthesizable VHDL that defines the accelerator at the register transfer level (RTL). The system generates a synchronous array of customized VLIW (very-long instruction word) processors, their controller, local memory, and interfaces. The system also modifies the user's application software to make use of the generated accelerator. The user indicates the throughput to be achieved by specifying the number of processors and their initiation interval. In experimental comparisons, PICO-N designs are slightly more costly than hand-designed accelerators with the same performance.