U of U CS 6810 - Multiprocessors

CS6810, School of Computing, University of Utah

Multiprocessors

• Today’s topics: SMP cache coherence
  ▪ general cache coherence issues
  ▪ snooping protocols
• Improved interaction
  ▪ lots of questions
    » warning: I’m going to wait for answers
    » granted it’s an experiment
  ▪ pace will be SLOWer

SMP Review

• Characteristics
  ▪ global physical address space
    » UMA and hence “symmetric”
  ▪ each processor has its own cache
    » for now let’s just assume 1 level to simplify things
  ▪ physically shared main memory
    » easy export of shared memory programming model

Bus Based Coherence

• Cache coherence
  ▪ for shared lines: simple version
    » all copies of the cached line have the same contents
  ▪ simultaneous update is hard: complex version
    » for any read: return value of the last write
  ▪ problem: 2 processors write to the same location at the same time
    » how is order determined?
    » need a single atomic “decider” [Bush’ism ack’d]
• Bus: a single thing, so it becomes the “decider”
  ▪ limited scalability
    » even 4 cores is a stretch at today’s clock speeds
  ▪ clear broadcast win
    » all caches see whatever happens on the bus
      • bus order is the write order
      • if that is not good enough, then the programmer needs to synchronize

Private vs. Shared Data

• SMP should support both
  ▪ private
    » normal cache policies and benefits
  ▪ shared: 2 options
    » NCC-UMA
      • forces all shared data to be via main memory
        - too slow
        - forces programmer to deal with all synchronization
      • requires write-no-allocate and read-no-allocate instructions
        - otherwise caching could create a problem
        - how?
    » CC-UMA
      • today’s focus
• How to partition shared vs. private?
  ▪ variable declarations in the code
  ▪ partition by page or segment

Other Sharing Issues

• Consider conventional cache wisdom
  ▪ write-back is good (faster)
    » problems?
  ▪ large line sizes help exploit spatial locality
    » problems?
  ▪ valid and dirty tag bits
    » are they enough?
  ▪ TLB
    » what changes with page sized partitioning pvt:shared?
  ▪ bus requests
    » normally always mastered from the cache side
    » what changes?

Consistency vs. Coherence

• Terminology
  ▪ some confusion in literature
    » but it’s rare so be clear and avoid “mutt” status
  ▪ key is that they are different
• Coherence
  ▪ defines what value is returned by a read
    » e.g. value of the last write
• Consistency
  ▪ defines when things are coherent
  ▪ bigger issue as systems get bigger
  ▪ sequential consistency → value of the last write
    » as determined by the “decider”
• Both are critical for correctness
  ▪ varies as to whether consistency is exposed to programmer
    » sequential consistency doesn’t need to be exposed
      • same as usual sequential programming model

Coherence Implications

• Additional cost
  ▪ caches now need to snoop the bus
    » watch for writes, tag compare and “update” if they have a copy
      • update options?
• Ordering constraints
  ▪ reordering reads is OK
    » but not involving writes
      • same as uniprocessor world
  ▪ writes must finish in program order
    » EVEN if they are independent
      • since there may be a hidden dependency in the other processors
      • also because cache management is by line not variable
    » this can be relaxed
      • more on this later

2 SMP Protocol Options

• Write-invalidate
  ▪ writer needs exclusive copy
    » write forces other copies to be invalidated
    » next read by others is a miss and they get a fresh line
  ▪ 2 writers
    » one wins bus arbitration and the “decider” has spoken
  ▪ bus broadcast
    » doesn’t need to broadcast write value
      • only address
• Write-update
  ▪ broadcast write value & address
  ▪ if other copies exist
    » then appropriate line is updated
• What haven’t we considered so far?
  ▪ hint: LOTS

Consider All Cases

• Cross product
  ▪ (read, write) (miss, hit) (valid copy in cache, memory)
  ▪ (write invalidate, write update)
• Simple with write-through caches
  ▪ memory always has an updated copy
  ▪ new writer gets valid copy
    » either by cache to cache transfer or from memory
• Harder with write-back caches
  ▪ good idea if cache is mostly holding private data
    » but memory may not be up to date
      • force invalidate of write back to memory
        - snoop grabs latest copy
      • cache-to-cache copy and no-update of memory
        - if write update and previous owner keeps copy then must clear D bit
        - key: only 1 D-bit can exist max → single “exclusive” owner
• What happens?
  ▪ write miss, read miss

Performance Issues

• Too many to exhaustively list
• Key protocol choice issues
  ▪ multiple writes to the same line, write invalidate
    » less bus traffic
      • 1st write → bus invalidate
        - and
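
The write-invalidate option with write-back caches, as outlined under "2 SMP Protocol Options" and "Consider All Cases", can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the protocol as specified in the course: the class names `Bus` and `Cache`, the three line states (I, S, M), and the choice to have a dirty owner write back to memory when it snoops a request are all hypothetical choices made for illustration. Each cache line holds a single value, so a write replaces the whole line and never needs to fetch it first.

```python
# Toy write-invalidate snooping model on one shared bus.
# Bus, Cache and the I/S/M states are illustrative assumptions,
# not names or states taken from the lecture slides.

INVALID, SHARED, MODIFIED = "I", "S", "M"


class Bus:
    """The single 'decider': transactions are serialized in call order."""

    def __init__(self):
        self.memory = {}   # line address -> value
        self.caches = []   # every cache attached to this bus
        self.log = []      # broadcast transactions, in bus order

    def read_miss(self, requester, addr):
        self.log.append(("read", addr))
        for cache in self.caches:
            if cache is not requester:
                cache.snoop_read(addr)
        return self.memory.get(addr, 0)

    def invalidate(self, requester, addr):
        # Only the address is broadcast, never the written value.
        self.log.append(("invalidate", addr))
        for cache in self.caches:
            if cache is not requester:
                cache.snoop_invalidate(addr)


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # addr -> [state, value]
        bus.caches.append(self)

    def state(self, addr):
        return self.lines.get(addr, [INVALID, None])[0]

    def read(self, addr):
        if self.state(addr) == INVALID:
            value = self.bus.read_miss(self, addr)
            self.lines[addr] = [SHARED, value]
        return self.lines[addr][1]

    def write(self, addr, value):
        if self.state(addr) != MODIFIED:
            # Writer needs an exclusive copy: invalidate all other copies.
            self.bus.invalidate(self, addr)
        self.lines[addr] = [MODIFIED, value]

    def snoop_read(self, addr):
        if self.state(addr) == MODIFIED:
            # Dirty owner writes back so memory can supply the latest value.
            self.bus.memory[addr] = self.lines[addr][1]
            self.lines[addr][0] = SHARED

    def snoop_invalidate(self, addr):
        if self.state(addr) == MODIFIED:
            self.bus.memory[addr] = self.lines[addr][1]
        if addr in self.lines:
            self.lines[addr][0] = INVALID


# Two processors sharing one line.
bus = Bus()
p0, p1 = Cache(bus), Cache(bus)
p0.read(0x40)
p1.read(0x40)            # both caches now hold the line in state S
p0.write(0x40, 1)        # one invalidate on the bus; p1's copy becomes I
p0.write(0x40, 2)        # hit in state M: no bus transaction
latest = p1.read(0x40)   # miss; p0 writes back, then p1 reads from memory
```

In this sketch the second write by `p0` generates no bus traffic, which is the "less bus traffic" argument for write-invalidate when one processor writes the same line repeatedly. At most one cache can hold a line in state M at a time, matching the "only 1 D-bit can exist" rule.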

