Berkeley COMPSCI 152 - Lecture Notes

CS 152 Computer Architecture and Engineering
Lecture 21: Directory-Based Cache Protocols

Krste Asanovic
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~krste
http://inst.cs.berkeley.edu/~cs152

4/29/2008, CS152 Spring '08


Recap: Snoopy Cache Protocols

Use snoopy mechanism to keep all processors' view of memory coherent.

[Figure: M1, M2, and M3, each with a snoopy cache, plus DMA and disks, connected by the memory bus to physical memory]


Recap: MESI: An Enhanced MSI Protocol

Increased performance for private data.

M: Modified Exclusive
E: Exclusive, unmodified
S: Shared
I: Invalid

Each cache line has a tag: address tag plus state bits.

[Figure: MESI state transition diagram for the cache state in processor P1. Edge labels: write miss; other processor intent to write; read miss, shared; read miss, not shared; P1 intent to write; P1 write; P1 read; P1 write or read; read by any processor; other processor reads; P1 writes back]


Performance of Symmetric Shared-Memory Multiprocessors

Cache performance is a combination of:
1. Uniprocessor cache miss traffic
2. Traffic caused by communication
   – Results in invalidations and subsequent cache misses
• Adds 4th C: coherence miss
   – Joins Compulsory, Capacity, Conflict
   – (Sometimes called a Communication miss)


Coherency Misses

1. True sharing misses arise from the communication of data through the cache coherence mechanism
   • Invalidates due to 1st write to shared block
   • Reads by another CPU of modified block in different cache
   • Miss would still occur if block size were 1 word
2. False sharing misses arise when a block is invalidated because some word in the block, other than the one being read, is written into
   • Invalidation does not cause a new value to be communicated, but only causes an extra cache miss
   • Block is shared, but no word in block is actually shared => miss would not occur if block size were 1 word


Example: True v. False Sharing v. Hit?

• Assume x1 and x2 in same cache block. P1 and P2 both read x1 and x2 before.

Time   P1          P2          True, False, Hit? Why?
1      Write x1                True miss; invalidate x1 in P2
2                  Read x2     False miss; x1 irrelevant to P2
3      Write x1                False miss; x1 irrelevant to P2
4                  Write x2    False miss; x1 irrelevant to P2
5      Read x2                 True miss; invalidate x2 in P1


MP Performance: 4 Processor
Commercial Workload: OLTP, Decision Support (Database), Search Engine

[Figure: bar chart. x-axis: cache size (1 MB, 2 MB, 4 MB, 8 MB); y-axis: memory cycles per instruction; series: Instruction, Capacity/Conflict, Cold, False Sharing, True Sharing]

• True sharing and false sharing unchanged going from 1 MB to 8 MB (L3 cache)
• Uniprocessor cache misses improve with cache size increase (Instruction, Capacity/Conflict, Compulsory)


MP Performance: 2 MB Cache
Commercial Workload: OLTP, Decision Support (Database), Search Engine

[Figure: bar chart. x-axis: processor count (1, 2, 4, 6, 8); y-axis: memory cycles per instruction; series: Instruction, Conflict/Capacity, Cold, False Sharing, True Sharing]

• True sharing, false sharing increase going from 1 to 8 CPUs


A Cache Coherent System Must:

• Provide set of states, state transition diagram, and actions
• Manage coherence protocol
   – (0) Determine when to invoke coherence protocol
   – (a) Find info about state of block in other caches to determine action
      » whether need to communicate with other cached copies
   – (b) Locate the other copies
   – (c) Communicate with those copies (invalidate/update)
• (0) is done the same way on all systems
   – state of the line is maintained in the cache
   – protocol is invoked if an "access fault" occurs on the line
• Different approaches distinguished by (a) to (c)


Bus-based Coherence

• All of (a), (b), (c) done through broadcast on bus
   – faulting processor sends out a "search"
   – others respond to the search probe and take necessary action
• Could do it in scalable network too
   – broadcast to all processors, and let them respond
• Conceptually simple, but broadcast doesn't scale with number of processors, P
   – on bus, bus bandwidth doesn't scale
   – on scalable network, every fault leads to at least P network transactions
• Scalable coherence:
   – can have same cache states and state transition diagram
   – different mechanisms to manage protocol


Scalable Approach: Directories

• Every memory block has associated directory information
   – keeps track of copies of cached blocks and their states
   – on a miss, find directory entry, look it up, and communicate only with the nodes that have copies if necessary
   – in scalable networks, communication with directory and copies is through network transactions
• Many alternatives for organizing directory information


Basic Operation of Directory

[Figure: processors (P), each with a cache, connected through an interconnection network to memory and a directory holding presence bits and a dirty bit]

• k processors.
• With each cache-block in memory: k presence-bits, 1 dirty-bit
• With each cache-block in cache: 1 valid bit, and 1 dirty (owner) bit

• Read from main memory by processor i:
   • If dirty-bit OFF then { read from main memory; turn p[i] ON; }
   • If dirty-bit ON then { recall line from dirty proc (cache state to shared); update memory; turn dirty-bit OFF; turn p[i] ON; supply recalled data to i; }
• Write to main memory by processor i:
   • If dirty-bit OFF then { send invalidations to all caches that have the block; turn dirty-bit ON; supply data to i; turn p[i] ON; ... }


CS152 Administrivia

• No lecture, Thursday May 1
   – (Faculty retreat)
• Last lecture, Tuesday May 6
• Final quiz, Thursday May 8
• Informal course feedback
   – Want to hear your opinion of new format
   – What worked, and what didn't work, especially in labs


Directory Cache Protocol (Handout 6)

• Assumptions: Reliable network, FIFO message delivery between any given source-destination pair

[Figure: CPUs with caches, and directory controllers with DRAM banks, all attached to the interconnection network]


Cache States

For each cache line, there are 4 possible states:
   – C-invalid (= Nothing): The accessed data is not resident in the cache.
   – C-shared (= Sh): The accessed data is resident in the cache, and possibly also cached at other sites. The data in memory is valid.
   – C-modified (= Ex): The accessed data is exclusively resident in this cache, and has been modified. Memory does not have the most up-to-date
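
The read and write rules from "Basic Operation of Directory" can be sketched in Python. This is a minimal illustration under stated assumptions, not the protocol from Handout 6: the names DirectoryEntry, read, and write are hypothetical, each operation returns a list of action strings instead of sending network messages, and the dirty owner is assumed to be the single processor whose presence bit is set. The notes elide part of the write rule ("...") and do not show the dirty-bit ON write case, so the sketch clears the presence bits of invalidated caches as an assumption and leaves the dirty-bit ON write case unimplemented.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Directory state for one memory block: k presence bits, 1 dirty bit."""
    k: int
    presence: list = field(default_factory=list)
    dirty: bool = False

    def __post_init__(self):
        if not self.presence:
            self.presence = [False] * self.k


def read(entry, i):
    """Read by processor i; returns the actions taken, in order."""
    actions = []
    if not entry.dirty:
        actions.append("read from main memory")
        entry.presence[i] = True
    else:
        # Assumption: when dirty, exactly one presence bit is set (the owner).
        owner = entry.presence.index(True)
        actions.append(f"recall line from P{owner} (cache state to shared)")
        actions.append("update memory")
        entry.dirty = False
        entry.presence[i] = True
        actions.append(f"supply recalled data to P{i}")
    return actions


def write(entry, i):
    """Write by processor i; only the dirty-bit OFF case appears in the notes."""
    if entry.dirty:
        raise NotImplementedError("dirty-bit ON write case is not in the notes")
    actions = []
    for j in range(entry.k):
        if entry.presence[j] and j != i:
            actions.append(f"send invalidation to P{j}")
            # Assumption: an invalidated cache no longer counts as a sharer.
            entry.presence[j] = False
    entry.dirty = True
    actions.append(f"supply data to P{i}")
    entry.presence[i] = True
    return actions
```

For example, with k = 4, reads by processors 0 and 1 set p[0] and p[1]; a write by processor 2 then sends invalidations to both and leaves processor 2 as the only sharer with the dirty bit ON; a later read by processor 3 recalls the line from processor 2 and turns the dirty bit OFF.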

