A Trace Cache Microarchitecture and Evaluation (1999)

by Eric Rotenberg, Steve Bennett, James E. Smith
Venue: IEEE Transactions on Computers
Citations: 55 (3 self)

BibTeX

@ARTICLE{Rotenberg99atrace,
    author = {Eric Rotenberg and Steve Bennett and James E. Smith},
    title = {A Trace Cache Microarchitecture and Evaluation},
    journal = {IEEE Transactions on Computers},
    year = {1999}
}


Abstract

As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. In this paper we present and evaluate a microarchitecture incorporating a trace cache. The microarchitecture provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of (1) control flow prediction and (2) instruction supply. For the SPEC95 integer benchmarks, trace-level sequencing improves performance from 15% to 35% over an otherwise equally sophisticated, but contiguous multiple-block fetch mechanism. Most of this performance improvement is due to the trace cache. However, for one benchmark whose performance is limited by branch mispredictions, the performance gain is due almost entirely to improved prediction accuracy.
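
The idea of caching traces of the dynamic instruction stream can be illustrated with a toy software model. The sketch below is not the microarchitecture evaluated in the paper: the class name `ToyTraceCache`, the limits `MAX_INSTRS` and `MAX_BRANCHES`, the `(pc, is_branch, taken)` stream format, and the prefix-matching lookup are all assumptions made for illustration. It shows only how instructions from noncontiguous addresses can be stored and fetched as one contiguous unit, indexed by a starting address plus branch outcomes.

```python
MAX_INSTRS = 16   # assumed per-trace instruction limit (illustrative only)
MAX_BRANCHES = 3  # assumed per-trace branch limit (illustrative only)


class ToyTraceCache:
    """Hypothetical model: maps (start PC, branch outcomes) to a trace of PCs."""

    def __init__(self):
        self.lines = {}

    def fill(self, stream):
        """Cut a dynamic stream of (pc, is_branch, taken) tuples into traces."""
        pcs, outcomes = [], []
        for pc, is_branch, taken in stream:
            pcs.append(pc)
            if is_branch:
                outcomes.append(taken)
            # End the trace when either assumed limit is reached.
            if len(pcs) == MAX_INSTRS or len(outcomes) == MAX_BRANCHES:
                self._store(pcs, outcomes)
                pcs, outcomes = [], []
        if pcs:
            self._store(pcs, outcomes)

    def _store(self, pcs, outcomes):
        self.lines[(pcs[0], tuple(outcomes))] = tuple(pcs)

    def fetch(self, start_pc, predicted):
        """Return the trace matching the longest prefix of the predicted
        branch outcomes, or None on a miss (a simplification)."""
        predicted = tuple(predicted)
        for k in range(len(predicted), -1, -1):
            trace = self.lines.get((start_pc, predicted[:k]))
            if trace is not None:
                return trace
        return None


# Made-up dynamic stream that jumps between noncontiguous addresses.
stream = [
    (0x100, False, False),
    (0x104, True, True),    # taken branch to 0x200
    (0x200, False, False),
    (0x204, True, False),   # not-taken branch, falls through
    (0x208, False, False),
    (0x20C, True, True),    # taken branch to 0x300; third branch ends the trace
    (0x300, False, False),
]
tc = ToyTraceCache()
tc.fill(stream)
hit = tc.fetch(0x100, [True, False, True])    # predictions match the stored path
miss = tc.fetch(0x100, [False, False, True])  # a different path was predicted
```

Here `hit` holds six instruction addresses spanning three basic blocks, supplied by a single lookup, while `miss` is `None` because no trace starting at `0x100` was recorded for that predicted path. In a real design a miss would fall back to a conventional instruction cache; that path is omitted from this sketch.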

Keyphrases

trace cache microarchitecture    trace cache    long instruction sequence    low latency    instruction supply    contiguous multiple-block fetch mechanism    prediction accuracy    contiguous cache location    instruction fetch bandwidth requirement    conventional instruction cache    trace-level sequencing improves performance    performance gain    control flow prediction    clock cycle    superscalar processors increase    branch mispredictions    high instruction fetch bandwidth    multiple basic block    instruction issue width    spec95 integer benchmark    performance improvement    dynamic instruction stream
