Results 1 - 10 of 2,781

SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling

by Roland E. Wunderlich, Thomas F. Wenisch, Babak Falsafi, James C. Hoe - In Proceedings of the 30th Annual International Symposium on Computer Architecture, 2003
"... Current software-based microarchitecture simulators are many orders of magnitude slower than the hardware they simulate. Hence, most microarchitecture design studies draw their conclusions from drastically truncated benchmark simulations that are often inaccurate and misleading. This paper presents ..."
Cited by 258 (25 self)
benchmarks, running with average speedups of 35 and 60 over detailed simulation of 8-way and 16-way out-of-order processors, respectively.

Statistical sampling of microarchitecture simulation

by Roland E. Wunderlich, Thomas F. Wenisch, Babak Falsafi, James C. Hoe - In 20th International Parallel and Distributed Processing Symposium (IPDPS), 2006
"... Current software-based microarchitecture simulators are many orders of magnitude slower than the hardware they simulate. Hence, most microarchitecture design studies draw their conclusions from drastically truncated benchmark simulations that are often inaccurate and misleading. This article present ..."
Cited by 9 (1 self)
benchmarks, running with average speedups of 35 and 60 over detailed simulation of 8-way and 16-way out-of-order processors, respectively. Categories and Subject Descriptors: C.4 [Performance of Systems]—Measurement techniques,

Out-of-order commit processors

by Adrian Cristal, Josep Llosa, Mateo Valero - In Proceedings of the 10th International Symposium on High Performance Computer Architecture, 2004
"... Modern out-of-order processors tolerate long latency memory operations by supporting a large number of inflight instructions. This is particularly useful in numerical applications where branch speculation is normally not a problem and where the cache hierarchy is not capable of delivering the data s ..."
Cited by 59 (12 self)

Caches in Out-of-order Processors

by Sheng Li, Ke Chen, Jay B. Brockman
"... Non-blocking caches are an effective technique for tolerating cache-miss latency. They can reduce miss-induced processor stalls by buffering the misses and continuing to serve other independent access requests. Previous research on the complexity and performance of non-blocking caches supporting non ..."
-ahead capability, a perfect branch predictor, fixed 16-cycle memory latency, single-cycle latency for floating point operations, and write-through and write-no-allocate caches. These assumptions are very different from today's high performance out-of-order processors such as the Intel Nehalem. Thus

Out-of-Order Superscalar Processor

by Robin Schmutz, Karine Brifault, François Bodin
"... Research report, ISSN 0249-6399. A structural model for WCET estimation of Simple ..."

Out-of-Order Vector Architectures

by Roger Espasa, Mateo Valero, James E. Smith, 1997
"... Register renaming and out-of-order instruction issue are now commonly used in superscalar processors. These techniques can also be used to significant advantage in vector processors, as this paper shows. Performance is improved and available memory bandwidth is used more effectively. Using a trace d ..."
Cited by 59 (21 self)

Runahead execution: An alternative to very large instruction windows for out-of-order processors

by Onur Mutlu, Jared Stark, Chris Wilkerson, Yale N. Patt - In HPCA-9, 2003
"... Today’s high performance processors tolerate long latency operations by means of out-of-order execution. However, as latencies increase, the size of the instruction window must increase even faster if we are to continue to tolerate these latencies. We have already reached the point where the size of ..."
Cited by 175 (22 self)
of an instruction window that can handle these latencies is prohibitively large, in terms of both design complexity and power consumption. And, the problem is getting worse. This paper proposes runahead execution as an effective way to increase memory latency tolerance in an out-of-order processor, without

Out-of-Order Execution of . . .

by Yvon Jegou, 1993
"... The superscalar execution model extracts independent instructions from a restricted window. When pipeline latencies go beyond some limit, out-of-order execution becomes necessary to fully exploit the independency of instructions. On the other hand, multithreading merges the execution of independent ..."
Cited by 2 (0 self)
of independent instruction flows. The simultaneous use of these two techniques is expected to produce a high degree of concurrent activities in future processors. But, the codes must be interruptible and restartable. Classical program state construction is incompatible with out-of-order execution

Performance of Database Workloads on Shared-Memory Systems with Out-of-Order Processors

by Parthasarathy Ranganathan, Kourosh Gharachorloo, Sarita V. Adve, Luiz Andre Barroso, 1998
"... Database applications such as online transaction processing (OLTP) and decision support systems (DSS) constitute the largest and fastest-growing segment of the market for multiprocessor servers. However, most current system designs have been optimized to perform well on scientific and engineering wo ..."
Cited by 97 (3 self)
with aggressive out-of-order processors, and considers simple optimizations that can provide further performance improvements. Our study is based on detailed simulations of the Oracle commercial database engine. The results show that the combination of out-of-order execution and multiple instruction issue

Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors

by Pierre Michaud, Andre Seznec, 2001
"... The performance of out-of-order processors increases with the instruction window size. In conventional processors, the effective instruction window cannot be larger than the issue buffer. Determining which instructions from the issue buffer can be launched to the execution units is a time-critical op ..."
Cited by 82 (0 self)