Architecture Evaluator’s Work Bench and Its Application to Microprocessor Floating Point Units (1995)

by S Fu, N Quach, M Flynn

Results 1 - 2 of 2

Design Issues in Division and Other Floating-Point Operations

by Stuart F. Oberman, Michael J. Flynn - IEEE Transactions on Computers, 1997
Cited by 19 (7 self)
Floating-point division is generally regarded as a low frequency, high latency operation in typical floating-point applications. However, in the worst case, a high latency hardware floating-point divider can contribute an additional 0.50 CPI to a system executing SPECfp92 applications. This paper presents the system performance impact of floating-point division latency for varying instruction issue rates. It also examines the performance implications of shared multiplication hardware, shared square root, on-the-fly rounding and conversion, and fused functional units. Using a system level study as a basis, it is shown how typical floating-point applications can guide the designer in making implementation decisions and trade-offs.

Time and Area Optimization in Processor Architecture

by M. J. Flynn - In Proceedings of ARCS'97, 1997
Cited by 1 (1 self)
For specified program behavior and clocking overhead, there is an optimum cycle time. This can be improved somewhat by using wave pipelining, but program unpredictability ultimately limits performance by restricting both cycle time and instruction-level parallelism. Algorithm and application implementation should be based on an understanding of program behavior, CAD tools, and technology. A system on a chip can be realized as die potential increases. This system die then consists of a collection of functional implementations on a single chip. These include a core processor, floating-point unit, signal processors, cache, message compression and encryption, etc. Functional implementations involve selecting particular algorithms so that total application execution time is minimized under the constraint of fixed die area. Underlying all improvements in processor architecture are fundamental notions of the optimum use of time and space. In silicon CMOS technologies, the notion of optimum cost-- p...