Berkeley COMPSCI 152 - Lecture 21 Directory-Based Cache Protocols


CS 152 Computer Architecture and Engineering
Lecture 21: Directory-Based Cache Protocols
Krste Asanovic
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.cs.berkeley.edu/~cs152
April 20, 2010, CS152, Spring 2010

Slide titles:
Slide 1
Recap: Snoopy Cache Protocols
Slide 3
Performance of Symmetric Shared-Memory Multiprocessors
Coherency Misses
Example: True v. False Sharing v. Hit?
Slide 7
Slide 8
A Cache Coherent System Must:
Bus-based Coherence
Scalable Approach: Directories
Basic Operation of Directory
CS152 Administrivia
Directory Cache Protocol (Handout 6)
Cache States
Home directory states
Protocol Messages
Cache State Transitions (from invalid state)
Cache State Transitions (from shared state)
Cache State Transitions (from exclusive state)
Cache Transitions (from pending)
Home Directory State Transitions
Home Directory State Transitions
Home Directory State Transitions
Home Directory State Transitions
Acknowledgements

Recap: Snoopy Cache Protocols
• Use snoopy mechanism to keep all processors’ view of memory coherent
[Figure: processors M1, M2, M3, each with a snoopy cache, and a DMA engine serving the disks, all attached to physical memory over a shared memory bus]

MESI: An Enhanced MSI Protocol
Increased performance for private data.
• M: Modified Exclusive
• E: Exclusive but unmodified
• S: Shared
• I: Invalid
Each cache line has a tag: address tag plus state bits.
[Figure: MESI state-transition diagram for a cache line in processor P1. Edge labels: write miss; read miss, shared; read miss, not shared; P1 read; P1 write; P1 write or read; P1 intent to write; read by any processor; other processor reads; other processor reads, P1 writes back; other processor intent to write; other processor intent to write, P1 writes back]

Performance of Symmetric Shared-Memory Multiprocessors
Cache performance is a combination of:
1. Uniprocessor cache miss traffic
2. Traffic caused by communication
  – Results in invalidations and subsequent cache misses
• Adds 4th C: coherence miss
  – Joins Compulsory, Capacity, Conflict
  – (Sometimes called a Communication miss)

Coherency Misses
1. True sharing misses arise from the communication of data through the cache coherence mechanism
  • Invalidates due to 1st write to shared block
  • Reads by another CPU of modified block in different cache
  • Miss would still occur if block size were 1 word
2. False sharing misses arise when a block is invalidated because some word in the block, other than the one being read, is written into
  • Invalidation does not cause a new value to be communicated, but only causes an extra cache miss
  • Block is shared, but no word in block is actually shared => miss would not occur if block size were 1 word

Example: True v. False Sharing v. Hit?
• Assume x1 and x2 are in the same cache block. P1 and P2 have both read x1 and x2 before.

Time  P1        P2        True, False, Hit? Why?
1     Write x1            True miss; invalidate x1 in P2
2               Read x2   False miss; x1 irrelevant to P2
3     Write x1            False miss; x1 irrelevant to P2
4               Write x2  False miss; x1 irrelevant to P2
5     Read x2             True miss; invalidate x2 in P1

MP Performance, 4 Processors
Commercial Workload: OLTP, Decision Support (Database), Search Engine
• True sharing and false sharing unchanged going from 1 MB to 8 MB (L3 cache)
• Uniprocessor cache misses improve with cache size increase (Instruction, Capacity/Conflict, Compulsory)

MP Performance, 2 MB Cache
Commercial Workload: OLTP, Decision Support (Database), Search Engine
• True sharing and false sharing increase going from 1 to 8 CPUs

A Cache Coherent System Must:
• Provide set of states, state transition diagram, and actions
• Manage coherence protocol
  – (0) Determine when to invoke coherence protocol
  – (a) Find info about state of address in other caches to determine action
    » whether it needs to communicate with other cached copies
  – (b) Locate the other copies
  – (c) Communicate with those copies (invalidate/update)
• (0) is done the same way on all systems
  – state of the line is maintained in the cache
  – protocol is invoked if an “access fault” occurs on the line
• Different approaches are distinguished by (a) to (c)

Bus-based Coherence
• All of (a), (b), (c) done through broadcast on bus
  – faulting processor sends out a “search”
  – others respond to the search probe and take necessary action
• Could do it in scalable network too
  – broadcast to all processors, and let them respond
• Conceptually simple, but broadcast doesn’t scale with number of processors, P
  – on bus, bus bandwidth doesn’t scale
  – on scalable network, every fault leads to at least P network transactions
• Scalable coherence:
  – can have same cache states and state transition diagram
  – different mechanisms to manage protocol

Scalable Approach: Directories
• Every memory block has associated directory information
  – keeps track of copies of cached blocks and their states
  – on a miss, find directory entry, look it up, and communicate only with the nodes that have copies if necessary
  – in scalable networks, communication with directory and copies is through network transactions
• Many alternatives for organizing directory information

Basic Operation of Directory
• k processors.
• With each cache-block in memory: k presence-bits, 1 dirty-bit
• With each cache-block in cache: 1 valid bit, and 1 dirty (owner) bit
• Read from main memory by processor i:
  • If dirty-bit OFF then { read from main memory; turn p[i] ON; }
  • If dirty-bit ON then { recall line from dirty proc (downgrade cache state to shared); update memory; turn dirty-bit OFF; turn p[i] ON; supply recalled data to i; }
• Write to main memory by processor i:
  • If dirty-bit OFF then { send invalidations to all caches that have the block; turn dirty-bit ON; supply data to i; turn p[i] ON; ... }

CS152 Administrivia
• Final quiz, Thursday April 29
  – Multiprocessors, Memory models, Cache coherence
  – Lectures 19-21, PS 5, Lab 5
• Next lecture, “Virtual Machines”, Thursday April 22
• Last lecture, “Putting it all Together”, Tuesday April 27
  – Summary of the course
  – Case Study: Intel Nehalem
  – HKN Course Survey

Directory Cache Protocol (Handout 6)
• Assumptions: Reliable network, FIFO message delivery between any given
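
The read and write handling listed under "Basic Operation of Directory" can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the protocol from Handout 6: the names `DirectoryEntry`, `read_miss`, `write_miss`, and the `log` list are hypothetical, a dirty block is assumed to have exactly one holder, and clearing the presence bits of invalidated caches is an assumption (the slide elides the rest of that step with "..."). The write case with the dirty-bit ON does not appear in the preview, so the sketch deliberately leaves it unimplemented.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Directory state for one memory block: k presence bits, 1 dirty bit."""
    k: int
    presence: list = field(default_factory=list)
    dirty: bool = False

    def __post_init__(self):
        # Start with no cached copies anywhere.
        if not self.presence:
            self.presence = [False] * self.k


def read_miss(entry, i, log):
    """Read from main memory by processor i."""
    if entry.dirty:
        # Assumption: a dirty block has exactly one holder, the owner.
        owner = entry.presence.index(True)
        log.append(f"recall from P{owner} (downgrade to shared)")
        log.append("update memory")
        entry.dirty = False
        entry.presence[i] = True
        log.append(f"supply recalled data to P{i}")
    else:
        log.append("read from main memory")
        entry.presence[i] = True


def write_miss(entry, i, log):
    """Write to main memory by processor i (dirty-bit OFF case only)."""
    if entry.dirty:
        # The preview does not show this case, so it is not guessed here.
        raise NotImplementedError("dirty-bit ON write case is not shown")
    for j, present in enumerate(entry.presence):
        if present and j != i:
            log.append(f"invalidate P{j}")
            # Assumption: an invalidated cache no longer counts as a holder.
            entry.presence[j] = False
    entry.dirty = True
    log.append(f"supply data to P{i}")
    entry.presence[i] = True
```

For example, with `k=4`, two reads by processors 0 and 1 followed by a write by processor 2 leave only `p[2]` set with the dirty bit ON; a later read by processor 3 recalls the line from processor 2 and clears the dirty bit.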

