Results 1 - 10 of 131

Thread-Sensitive Modulo Scheduling for Multicore Processors

by Lin Gao, Quan Hoang Nguyen, Lian Li, Jingling Xue, Tin-fook Ngai
"... This paper describes a generalisation of modulo scheduling to parallelise loops for SpMT processors that exploits simultaneously both instruction-level parallelism and thread-level parallelism while preserving the simplicity and effectiveness of modulo scheduling. Our generalisation is simple, drops ..."

Reuse-Aware Modulo Scheduling for Stream Processors

by Li Wang, Jingling Xue, Xuejun Yang
"... This paper presents reuse-aware modulo scheduling to maximize stream reuse and improve concurrency for stream-level loops running on stream processors. The novelty lies in the development of a new representation for an unrolled and software-pipelined stream-level loop using a set of reu ..."

Distributed Modulo Scheduling

by Marcio Merino Fernandes, Josep Llosa, Nigel Topham, 1999
"... Wide-issue ILP machines can be built using the VLIW approach as many of the hardware complexities found in superscalar processors can be transferred to the compiler. However, the scalability of VLIW architectures is still constrained by the size and number of ports of the register file required by a ..."
Cited by 34 (6 self)

Modulo Scheduling With Isomorphic Control Transformations

by Nancy Jeanne Warter, 1994
"... ... over other software pipelining techniques based on global scheduling. The ICTs are applied to Modulo Scheduling to schedule loops with conditional branches. Experimental results show that this approach allows more flexible scheduling and thus better performance than Modulo Scheduling with Hierar ..."
Cited by 29 (0 self)
with Hierarchical Reduction. Modulo Scheduling with ICTs targets processors with no or limited support for conditional execution such as superscalar processors. However, in processors that do not require instruction set compatibility, support for Predicated Execution can be used. This dissertation shows that Modulo

A unified modulo scheduling and register allocation technique for clustered processors

by Josep M. Codina, Jesús Sánchez, Antonio González - In Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques, 2001
"... This work presents a modulo scheduling framework for clustered ILP processors that integrates the cluster assignment, instruction scheduling and register allocation steps in a single phase. This unified approach is more effective than traditional approaches based on sequentially performing some (or ..."
Cited by 20 (4 self)

Unrolling-Based Optimizations for Modulo Scheduling

by Daniel M. Lavery, Wen-mei W. Hwu - In Proceedings of the 28th International Symposium on Microarchitecture, 1995
"... Modulo scheduling is a method for overlapping successive iterations of a loop in order to find sufficient instruction-level parallelism to fully utilize high-issue-rate processors. The achieved throughput of a modulo scheduled loop depends on the resource requirements, the dependence pattern, and the reg ..."
Cited by 34 (3 self)

Modulo Schedule Buffers

by Matthew C. Merten, Wen-mei W. Hwu - In Proc. of the 34th Annual International Symposium on Microarchitecture, 2001
"... As VLIW/EPIC processors are increasingly used in realtime, signal-processing, and embedded applications, the importance of minimizing code size and reducing power is growing. This paper describes a new architectural mechanism, called the Modulo Schedule Buffers, that provides an elegant interface fo ..."
Cited by 3 (0 self)

Register constrained modulo scheduling

by Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero - IEEE Trans. Parallel Distrib. Syst
"... Software pipelining is an instruction scheduling technique that exploits the instruction level parallelism (ILP) available in loops by overlapping operations from various successive loop iterations. The main drawback of aggressive software pipelining techniques is their high register requir ..."
Cited by 2 (0 self)
propose a set of heuristics to improve the spilling process and to better decide between adding spill code or directly decreasing the execution rate of iterations. The experimental evaluation, over a large number of representative loops and for a processor configuration, reports an increase in performance

Preprocessing Strategy for Effective Modulo Scheduling on Multi-Issue Digital Signal Processors

by Doosan Cho, Ravi Ayyagari, Gang-ryung Uh, Yunheung Paek
"... To achieve high resource utilization for multi-issue Digital Signal Processors (DSPs), production compilers commonly include variants of the iterative modulo scheduling algorithm. However, excessive cyclic data dependences, which exist in communication and media processing loops, often pre ..."
Cited by 2 (2 self)

A Fault-Tolerant Permutation Network Modulo Arithmetic Processor

by Ming-Bo Lin, A. Yavuz Oruç - IEEE Trans. on VLSI Systems, 1994
"... Conventional fault-tolerant modulo arithmetic processors rely on the properties of a residue number system with L redundant moduli to detect up to L/2 errors. In this paper, we propose a new scheme that combines r-out-of-s residue codes with Berger codes to concurrently detect any number ..."
Cited by 2 (0 self)