Memory Barriers: a Hardware View for Software Hackers (2009)

by Paul E. McKenney
Citations:9 - 0 self

BibTeX

@MISC{Mckenney09memorybarriers:,
    author = {Paul E. McKenney},
    title = {Memory Barriers: a Hardware View for Software Hackers},
    year = {2009}
}


Abstract

So what possessed CPU designers to cause them to inflict memory barriers on poor unsuspecting SMP software designers? In short, because reordering memory references allows much better performance, and so memory barriers are needed to force ordering in things like synchronization primitives whose correct operation depends on ordered memory references.

Getting a more detailed answer to this question requires a good understanding of how CPU caches work, and especially what is required to make caches really work well. The following sections:

  1. present the structure of a cache,
  2. describe how cache-coherency protocols ensure that CPUs agree on the value of each location in memory, and, finally,
  3. outline how store buffers and invalidate queues help caches and cache-coherency protocols achieve high performance.

We will see that memory barriers are a necessary evil that is required to enable good performance and scalability, an evil that stems from the fact that CPUs are orders of magnitude faster than are both the interconnects between them and the memory they are attempting to access.

1 Cache Structure

Modern CPUs are much faster than are modern memory systems. A 2006 CPU might be capable of executing ten instructions per nanosecond, but will require many tens of nanoseconds to fetch a data item from main memory. This disparity in speed, more than two orders of magnitude, has resulted in the multimegabyte caches found on modern CPUs. These caches are associated with the CPUs as shown in Figure 1, and can typically be accessed in a few cycles.
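
The abstract's claim that synchronization primitives depend on ordered memory references can be illustrated with a small message-passing sketch. This is not code from the paper: it assumes C11 atomics and POSIX threads, and the names payload, ready, producer, consumer, and run_message_passing are hypothetical, chosen only for illustration. The release store and acquire load play the role of the memory barriers the abstract describes. Without them, the write to payload and the write to ready could become visible to the other CPU in either order.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
static int payload;        /* plain data handed from producer to consumer */
static atomic_int ready;   /* flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the write to payload above is ordered before
     * this store, as seen by any thread that acquire-loads ready == 1. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *seen = arg;
    /* Acquire load: once ready == 1 is observed, the read of payload
     * below cannot be satisfied with a value older than the producer's. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;  /* spin until the flag is published */
    *seen = payload;
    return NULL;
}

/* Runs one producer/consumer pair; returns the value the consumer saw,
 * or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&c, NULL, consumer, &seen) != 0)
        return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        /* Unblock the consumer so it can be joined. */
        atomic_store_explicit(&ready, 1, memory_order_release);
        pthread_join(c, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With the release/acquire pair in place, a call to run_message_passing() returns the published value. Replacing both orderings with memory_order_relaxed would remove that guarantee on weakly ordered hardware, which is the kind of reordering the paper sets out to explain.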

Keyphrases

memory barrier    software hacker    hardware view    modern memory system    invalidate queue    cache structure modern cpu    main memory    ten instruction    store buffer    good performance    memory reference    synchronization primitive    high performance    modern cpu    good understanding    cpu designer    detailed answer    poor unsuspecting smp software designer    necessary evil    many ten    cache-coherency protocol    ordered memory reference    correct operation    data item    following section   
