• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 2,547
Next 10 →

Complexity-effective superscalar processors

by Subbarao Palacharla, J. E. Smith, et al. - IN PROCEEDINGS OF THE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE , 1997
"... The performance tradeoff between hardware complexity and clock speed is studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, instruction window wakeup and selection logic, and operand bypassing are ana-lyzed. Each is modeled and Spice simulated for ..."
Abstract - Cited by 467 (5 self) - Add to MetaCart
The performance tradeoff between hardware complexity and clock speed is studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, instruction window wakeup and selection logic, and operand bypassing are ana-lyzed. Each is modeled and Spice simulated

Simultaneous Multithreading: Maximizing On-Chip Parallelism

by Dean M. Tullsen , Susan J. Eggers, Henry M. Levy , 1995
"... This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar’s multiple functional units in a single cycle. We present several models of simultaneous multithreading and compare them with alternative organizations: a wide s ..."
Abstract - Cited by 823 (48 self) - Add to MetaCart
superscalar, a fine-grain multithreaded processor, and single-chip, multiple-issue multiprocessing architectures. Our results show that both (single-threaded) superscalar and fine-grain multithreaded architectures are limited in their ability to utilize the resources of a wide-issue processor. Simultaneous

Adding a Vector Unit to a Superscalar Processor

by Francisca Quintana, Jesus Corbal, Roger Espasa, Mateo Valero , 1999
"... The focus of this paper is on adding a vector unit to a superscalar core, as a way to scale current state of the art superscalar processors. The proposed architecture has a vector register file that shares functional units both with the integer datapath and with the floating point datapath. A key po ..."
Abstract - Cited by 22 (11 self) - Add to MetaCart
The focus of this paper is on adding a vector unit to a superscalar core, as a way to scale current state of the art superscalar processors. The proposed architecture has a vector register file that shares functional units both with the integer datapath and with the floating point datapath. A key

The Case for a Single-Chip Multiprocessor

by Kunle Olukotun, Basem A. Nayfeh, Lance Hammond, Ken Wilson, Kunyung Chang - IEEE Computer , 1996
"... Advances in IC processing allow for more microprocessor design options. The increasing gate density and cost of wires in advanced integrated circuit technologies require that we look for new ways to use their capabilities effectively. This paper shows that in advanced technologies it is possible to ..."
Abstract - Cited by 440 (6 self) - Add to MetaCart
to implement a single-chip multiproces-sor in the same area as a wide issue superscalar processor. We find that for applications with little parallelism the performance of the two microarchitectures is comparable. For applications with large amounts of parallelism at both the fine and coarse grained levels

Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor

by Dean M. Tullsen, Susan J. Eggers, Joel S. Emer , Henry M. Levy, Jack L. Lo, Rebecca L. Stamm - IN PROCEEDINGS OF THE 23RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE , 1996
"... Simultaneous multithreading is a technique that permits multiple independent threads to issue multiple instructions each cycle. In previous work we demonstrated the performance potential of simultaneous multithreading, based on a somewhat idealized model. In this paper we show that the throughput ga ..."
Abstract - Cited by 382 (37 self) - Add to MetaCart
over an unmodified superscalar with similar hardware resources. This speedup is enhanced by an advantage of multithreading previously unexploited in other architectures: the ability to favor for fetch and issue those threads most efficiently using the processor each cycle, thereby providing the “best

The Superblock: An effective technique for VLIW and superscalar compilation

by Wen-mei W. Hwu, Scott A. Mahlke, William Y. Chen, Pohua P. Chang, Nancy J. Warter, Roger A. Bringmann, Roland G. Ouellette, Richard E. Hank, Tokuzo Kiyohara, Grant E. Haab, John G. Holm, Daniel M. Lavery - THE JOURNAL OF SUPERCOMPUTING , 1993
"... A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to eddectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP acros ..."
Abstract - Cited by 289 (28 self) - Add to MetaCart
A compiler for VLIW and superscalar processors must expose sufficient instruction-level parallelism (ILP) to eddectively utilize the parallel hardware. However, ILP within basic blocks is extremely limited for control-intensive programs. We have developed a set of techniques for exploiting ILP

Performance study of the filter data cache on a superscalar processor architecture

by Julio Sahuquillo, Salvador Petit, Ana Pont - IEEE Computer Society Technical Committee on Computer Architecture Newsletter (TCCA) News , 2001
"... The goal of cache design is to exploit data localities; however, the means to this end vary widely among the cache proposals. This paper concentrates on those cache models that split the first level data cache into two independent organizations that are accessed at the same time by the processor. In ..."
Abstract - Cited by 2 (0 self) - Add to MetaCart
other schemes that split the first level data cache according to the criterion of data locality (NTS and SSNS). In this paper, we check the performance of the Filter Data Cache on a state-of-the-art superscalar processor that achieves a better speedup and a higher number of instructions per cycle (IPC

Effective Compiler Support for Predicated Execution Using the Hyperblock

by Scott A. Mahlke, et al. , 1992
"... Predicated execution is an effective technique for dealing with conditional branches in application programs. However, there are several problems associated with conventional compiler support for predicated execution. First, all paths of control are combined into a single path regardless of their ex ..."
Abstract - Cited by 374 (24 self) - Add to MetaCart
to utilize predicated execution for both compile-time optimization and scheduling. Preliminary experimental results show that the hyperblock is highly effective for a wide range of superscalar and VLIW processors.

The Microarchitecture of Superscalar Processors

by James E. Smith, Gurindar S. Sohi , 1995
"... Superscalar processing is the latest in a long series of innovations aimed at producing ever-faster microprocessors. By exploiting instruction-level parallelism, superscalar processors are capable of executing more than one instruction in a clock cycle. This paper discusses the microarchitecture of ..."
Abstract - Cited by 99 (1 self) - Add to MetaCart
Superscalar processing is the latest in a long series of innovations aimed at producing ever-faster microprocessors. By exploiting instruction-level parallelism, superscalar processors are capable of executing more than one instruction in a clock cycle. This paper discusses the microarchitecture

Quantifying the Complexity of Superscalar Processors.

by Subbarao Palacharla , Norman P Jouppi , James E Smith , 1996
"... Abstract To characterize future performance limitations of superscalar processors, the delays of key pipeline structures in superscalar processors are studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, instruction window wakeup and selection log ..."
Abstract - Cited by 85 (0 self) - Add to MetaCart
Abstract To characterize future performance limitations of superscalar processors, the delays of key pipeline structures in superscalar processors are studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, instruction window wakeup and selection
Next 10 →
Results 1 - 10 of 2,547
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University