G22.2243-001 High Performance Computer Architecture
Lecture 11
Multiprocessing (Cont’d)
November 7, 2007

Outline
• Announcements
  – HW Assignment 3 due back today
  – Lab Assignment 3 due in a week: Nov 14
• Multiprocessors
  – Coherence protocols
    • Snooping-based protocols (review)
    • Directory-based protocols
[Hennessy/Patterson CA:AQA (4th Edition): Chapter 4]

Snooping - Cache State Machine: Combined
State machine for CPU requests for each cache block and for bus requests for each cache block.
[State diagram]
• States: Invalid; Shared (read only); Exclusive (read/write)
• CPU-side labels:
  – CPU read; CPU write
  – CPU read hit; CPU write hit
  – CPU read miss: place read miss on bus
  – CPU read miss: write back block, place read miss on bus
  – CPU write: place write miss on bus
  – CPU write miss: write back cache block, place write miss on bus
• Bus-side labels:
  – Write miss for this block
  – Read miss for this block
  – Write back block; (abort memory access)

Larger MPs
• Separate memory per processor
• Local or remote access via memory controller
• One cache coherency solution: non-cached pages
• Alternative: directory per cache that tracks state of every block in every cache
  – Which caches have copies of block, dirty vs. clean, ...
• Info per memory block vs. per cache block?
  – simpler protocol (centralized/one location)
  – directory is ƒ(memory size × number of processors) vs. ƒ(cache size)
• Prevent directory as bottleneck? Distribute directory entries with memory, each keeping track of which processors have copies of their memory blocks and in what state

Distributed Directory MPs
[Figure: eight nodes, each with processor + cache, memory, I/O, and a directory, attached to an interconnection network; a second panel shows four nodes, each with processor & caches, memory, and I/O, attached to an interconnection network]

Directory Protocol
• Similar to snooping protocol: three states
  – Shared: ≥ 1 processors have data, memory up-to-date
  – Uncached: no processor has it; not valid in any cache
  – Exclusive: 1 processor (owner) has data; memory out-of-date
• In addition to cache state, must track which processors have data when in the shared state (usually bit vector, 1 if processor has copy)
• Keep it simple(r):
  – Writes to non-exclusive data → write miss
  – Processor blocks until access completes
  – Assume messages received and acted upon in order sent

Directory Protocol (Cont’d)
• No bus and don’t want to broadcast:
  – interconnect no longer single arbitration point
  – all messages have explicit responses
• Typically 3 processors involved
  – Local node where a request originates
  – Home node where the memory location of an address resides
  – Remote node has a copy of a cache block, whether exclusive or shared
• Example messages on next slide: P = processor number, A = address

Directory Protocol Messages
Message type       Source           Destination      Msg content
Read miss          Local cache      Home directory   P, A
  – Processor P has a read miss at address A; request data and make P a read sharer
Write miss         Local cache      Home directory   P, A
  – Processor P has a write miss at address A; request data and make P exclusive owner
Invalidate         Home directory   Remote caches    A
  – Invalidate a shared copy of data at address A
Fetch              Home directory   Remote cache     A
  – Fetch block at address A & send it to its home directory; change state to shared at remote
Fetch/Invalidate   Home directory   Remote cache     A
  – Fetch block at address A & send it to its home directory; invalidate the block in the cache
Data value reply   Home directory   Local cache      Data
  – Return a data value from the home memory
Data write-back    Remote cache     Home directory   A, Data
  – Write-back a data value for address A

State Transition Diagram for an Individual Cache Block in a Directory Based System
• States identical to snooping case
• Transactions very similar
• Transitions caused by read misses, write misses, invalidates, and data fetch requests
• Generates read miss & write miss messages to home directory
• Write misses that were broadcast on the bus for snooping
  – explicit invalidate & data fetch requests

CPU - Cache State Machine
• State machine for each cache block
• Invalid state if in memory
[State diagram]
• States: Invalid (Uncached); Shared (read only); Exclusive (read/write)
• CPU-side labels:
  – CPU read: send Read Miss message to home directory
  – CPU read hit
  – CPU read miss: send Read Miss
  – CPU write: send Write Miss message to home directory (label appears twice)
  – CPU read hit; CPU write hit
  – CPU read miss: send Data Write Back message and Read Miss to home directory
  – CPU write miss: send Data Write Back message and Write Miss to home directory
• Directory-side labels:
  – Invalidate
  – Fetch: send Data Write Back message to home directory
  – Fetch/Invalidate: send Data Write Back message to home directory

State Transition Diagram for the Directory
• Same states & structure as the transition diagram for an individual cache
• Two actions: update of directory state & send messages to satisfy requests
• Tracks all copies of memory block
• Also indicates an action that updates the sharing set, Sharers, as well as sending a message

Example Directory Protocol
• Message sent to directory causes two actions:
  – Update the directory
  – More messages to satisfy request
• Block is in Uncached state: the copy in memory is the current value; only possible requests for that block are:
  – Read miss: requesting processor is sent data from memory & requestor is made (the first) sharing node; state of block is made Shared
  – Write miss: requesting processor is sent the value. The block is made Exclusive to indicate that the only valid copy is cached. Sharers indicates the identity of the owner.
• Block is Shared => the memory value is up-to-date:
  – Read miss: requesting processor is sent back the data from memory & requesting processor is added to the sharing set.
  – Write miss: requesting processor is sent the value. All processors in the set Sharers are sent invalidate messages & Sharers is set to identity of requesting processor. The state of the block is made Exclusive.

Example Directory Protocol (Cont’d)
• Block is Exclusive: current value of the block is held in the cache of the processor identified by the set Sharers (the owner).
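
The Uncached and Shared cases of the Example Directory Protocol slide can be sketched in a few lines of Python. This is only an illustrative model, not code from the lecture: the names `State`, `DirectoryEntry`, and `handle_request` are hypothetical, a Python set stands in for the sharer bit vector, and messages are represented as plain `(message, destination)` tuples. Two assumptions go beyond the slide text: a requester that is already in Sharers is not sent an invalidate for its own write miss, and the data reply is listed before the invalidates. Requests to a block in the Exclusive state are deliberately left unmodeled, because this preview ends before those transitions are described.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    UNCACHED = "Uncached"
    SHARED = "Shared"
    EXCLUSIVE = "Exclusive"


@dataclass
class DirectoryEntry:
    """Directory state for one memory block (hypothetical layout)."""
    state: State = State.UNCACHED
    sharers: set = field(default_factory=set)  # stands in for the bit vector


def handle_request(entry, kind, p):
    """Apply a 'read_miss' or 'write_miss' from processor p to one entry.

    Returns the (message, destination) pairs the home directory sends.
    Only the Uncached and Shared cases from the slide are modeled.
    """
    if kind not in ("read_miss", "write_miss"):
        raise ValueError(f"unknown request kind: {kind!r}")
    if entry.state is State.EXCLUSIVE:
        # Not described in the available slide text, so not modeled here.
        raise NotImplementedError("Exclusive-state requests are not modeled")

    # In both modeled states memory holds the current value,
    # so the home node can answer with the data directly.
    messages = [("data_value_reply", p)]

    if kind == "read_miss":
        # Requester becomes a (possibly the first) sharer; block is Shared.
        entry.sharers.add(p)
        entry.state = State.SHARED
    else:
        # Write miss: invalidate the other sharers (assumption: the
        # requester itself is skipped), then make p the exclusive owner.
        for q in sorted(entry.sharers - {p}):
            messages.append(("invalidate", q))
        entry.sharers = {p}
        entry.state = State.EXCLUSIVE

    return messages
```

For example, starting from a fresh `DirectoryEntry()`, read misses from processors 0 and 1 leave the block Shared with both as sharers; a subsequent write miss from processor 2 produces a data reply to 2 plus invalidates to 0 and 1, and leaves the block Exclusive with Sharers equal to {2}.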


NYU CSCI-GA 2243 - Multiprocessing
