U of U CS 7810 - Directory-Based Coherence

This preview shows page 1-2-21-22 out of 22 pages.

1. Lecture 4: Directory-Based Coherence

• Details of memory-based (SGI Origin) and cache-based (Sequent NUMA-Q) directory protocols

2. Handling Reads

• When the home receives a read request, it looks up memory (speculative read) and directory in parallel
• Actions taken for each directory state:
  ➢ shared or unowned: memory copy is clean, data is returned to requestor, state is changed to excl if there are no other sharers
  ➢ busy: a NACK is sent to the requestor
  ➢ exclusive: home is not the owner, request is fwded to owner, owner sends data to requestor and home

3. Inner Details of Handling the Read

• The block is in exclusive state – memory may or may not have a clean copy – it is speculatively read anyway
• The directory state is set to busy-exclusive and the presence vector is updated
• In addition to fwding the request to the owner, the memory copy is speculatively forwarded to the requestor
  ➢ Case 1: excl-dirty: owner sends block to requestor and home, the speculatively sent data is over-written
  ➢ Case 2: excl-clean: owner sends an ack (without data) to requestor and home, requestor waits for this ack before it moves on with speculatively sent data

4. Inner Details II

• Why did we send the block speculatively to the requestor if it does not save traffic or latency?
  ➢ the R10K cache controller is programmed to not respond with data if it has a block in excl-clean state
  ➢ when an excl-clean block is replaced from the cache, the directory need not be updated – hence, directory cannot rely on the owner to provide data and speculatively provides data on its own

5. Handling Write Requests

• The home node must invalidate all sharers and all invalidations must be acked (to the requestor), the requestor is informed of the number of invalidates to expect
• Actions taken for each state:
  ➢ shared: invalidates are sent, state is changed to excl, data and num-sharers is sent to requestor, the requestor cannot continue until it receives all acks
    (Note: the directory does not maintain busy state, subsequent requests will be fwded to new owner and they must be buffered until the previous write has completed)

6. Handling Writes II

• Actions taken for each state:
  ➢ unowned: if the request was an upgrade and not a read-exclusive, is there a problem?
  ➢ exclusive: is there a problem if the request was an upgrade? In case of a read-exclusive: directory is set to busy, speculative reply is sent to requestor, invalidate is sent to owner, owner sends data to requestor (if dirty), and a “transfer of ownership” message (no data) to home to change out of busy
  ➢ busy: the request is NACKed and the requestor must try again

7. Handling Write-Back

• When a dirty block is replaced, a writeback is generated and the home sends back an ack
• Can the directory state be shared when a writeback is received by the directory?
• Actions taken for each directory state:
  ➢ exclusive: change directory state to unowned and send an ack
  ➢ busy: a request and the writeback have crossed paths: the writeback changes directory state to shared or excl (depending on the busy state), memory is updated, and home sends data to requestor, the intervention request is dropped

8. Writeback Cases

[Figure: nodes P1 and P2 and directory D3; directory state E: P1; messages: Wback, Ack]

This is the “normal” case
D3 sends back an Ack

9. Writeback Cases

[Figure: nodes P1 and P2 and directory D3; directory state E: P1 → busy; messages: Rd or Wr, Fwd, Wback]

If someone else has the block in exclusive, D3 moves to busy
If Wback is received, D3 serves the requester
If we didn’t use busy state when transitioning from E:P1 to E:P2, D3 may not have known who to service
(since ownership may have been passed on to P3 and P4…)
(although, this problem can be solved by NACKing the Wback and having P1 buffer its “strange” intervention requests)

10. Writeback Cases

[Figure: nodes P1 and P2 and directory D3; directory state E: P1 → busy; messages: Fwd, Transfer ownership, Wback, Data]

If Wback is from new requester, D3 sends back a NACK
Floating unresolved messages are a problem
Alternatively, can accept the Wback and put D3 in some new busy state
Conclusion: could have got rid of busy state between E:P1 → E:P2, but with Wback ACK/NACK and other buffering
could have kept the busy state between E:P1 → E:P2, could have got rid of ACK/NACK, but need one new busy state

11. Serialization

• Note that the directory serializes writes to a location, but does not know when a write/read has completed at any processor
• For example, a read reply may be floating on the network and may reach the requestor much later – in the meantime, the directory has already issued a number of invalidates, the invalidate is overwritten when the read reply finally shows up – hence, each node must buffer its requests until outstanding requests have completed

12. Sequent NUMA-Q

• Employs a flat cache-based directory protocol between nodes – IEEE standard SCI (Scalable Coherent Interface) protocol
• Each node is a 4-way SMP with a bus-based snooping protocol
• The communication assist includes a large “remote access cache” – the directory protocol tries to keep the remote caches coherent, while the snooping protocol ensures that each processor cache is kept coherent with the remote access cache and local-mem

[Figure: one node with four processors (P), each with a cache (C), plus local memory (Mem), the communication assist (CA) with its remote access cache (RAC), and the network]

13. Directory Structure

• The physical address identifies the home node – the home node directory stores a pointer to the head of a linked list – each cache stores pointers to the next and previous sharer
• A main memory block can be in three directory states:
  ➢ Home: (similar to unowned) the block does not exist in any remote access cache (may be in the home node’s processor caches, though)
  ➢ Fresh: (similar to shared) read-only copies exist in remote access caches and memory copy is up-to-date
  ➢ Gone: (similar to exclusive) writeable copy exists in some remote cache

14. Cache Structure

• 29 stable states and many more pending/busy states!
• The stable states have two descriptors:
  ➢ position in linked list: ONLY, HEAD, TAIL, MID
  ➢ state within cache: dirty, clean, fresh, valid, etc.
• SCI defines and implements primitive operations to facilitate linked list manipulations:
  ➢ List construction: add a new node to the list head
  ➢ Rollout: remove a node from a list
  ➢ Purging: invoked by the head to invalidate all other nodes

15. Handling Read Requests

• On a read miss, the remote cache sets up a
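
The three SCI list primitives named under Cache Structure (list construction, rollout, purging) can be illustrated with a small single-process sketch. It only models the pointer manipulation. The real protocol is distributed and message-based with many pending states, none of which is shown here. The class name SharingList, its method names, and the integer node IDs are assumptions made for illustration and are not taken from the SCI standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SharingList:
    """Sharing list for one memory block (hypothetical model, not the SCI API)."""
    head: Optional[int] = None  # pointer stored at the home node directory
    nxt: Dict[int, Optional[int]] = field(default_factory=dict)   # per-cache next pointer
    prev: Dict[int, Optional[int]] = field(default_factory=dict)  # per-cache previous pointer

    def construct(self, node: int) -> None:
        """List construction: the new sharer is added at the head."""
        if node in self.nxt:
            raise ValueError("node is already on the list")
        old_head = self.head
        self.nxt[node] = old_head
        self.prev[node] = None
        if old_head is not None:
            self.prev[old_head] = node
        self.head = node

    def rollout(self, node: int) -> None:
        """Rollout: unlink one node and patch its neighbours' pointers."""
        before = self.prev.pop(node)
        after = self.nxt.pop(node)
        if before is None:
            self.head = after
        else:
            self.nxt[before] = after
        if after is not None:
            self.prev[after] = before

    def purge(self, node: int) -> List[int]:
        """Purging: the head invalidates all other nodes; returns their IDs."""
        if node != self.head:
            raise ValueError("only the head may purge")
        invalidated = []
        cur = self.nxt[node]
        while cur is not None:
            invalidated.append(cur)
            following = self.nxt.pop(cur)
            self.prev.pop(cur)
            cur = following
        self.nxt[node] = None
        return invalidated

    def position(self, node: int) -> str:
        """Position descriptor of a node: ONLY, HEAD, TAIL, or MID."""
        has_prev = self.prev[node] is not None
        has_next = self.nxt[node] is not None
        if not has_prev and not has_next:
            return "ONLY"
        if not has_prev:
            return "HEAD"
        if not has_next:
            return "TAIL"
        return "MID"

    def members(self) -> List[int]:
        """Walk the list from the head, following next pointers."""
        out, cur = [], self.head
        while cur is not None:
            out.append(cur)
            cur = self.nxt[cur]
        return out
```

Because new sharers always join at the head, the most recent requester is the node that later performs a purge when it wants to write.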

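
Returning to the Origin home node, the per-state actions listed under Handling Reads and Handling Write-Back can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the message names (DATA, SPEC_DATA, FWD_READ, NACK, WB_ACK), the DirEntry fields, and the single BUSY state with a recorded pending request are all hypothetical simplifications. It does not model write requests, the excl-clean and excl-dirty replies from the owner, or the NACK for a writeback from a new requester.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional, Set, Tuple


class DirState(Enum):
    UNOWNED = auto()
    SHARED = auto()
    EXCLUSIVE = auto()
    BUSY = auto()


Message = Tuple[str, Optional[int]]  # (kind, destination node); hypothetical encoding


@dataclass
class DirEntry:
    state: DirState = DirState.UNOWNED
    sharers: Set[int] = field(default_factory=set)  # stands in for the presence vector
    owner: Optional[int] = None
    pending: Optional[Tuple[int, str]] = None  # (requester, "read" or "write") while BUSY


def handle_read(entry: DirEntry, requester: int) -> List[Message]:
    """Home-side handling of a read request, one case per directory state."""
    if entry.state is DirState.BUSY:
        return [("NACK", requester)]
    if entry.state is DirState.UNOWNED:
        # No other sharers: the requester receives the block in exclusive state.
        entry.state, entry.owner = DirState.EXCLUSIVE, requester
        return [("DATA", requester)]
    if entry.state is DirState.SHARED:
        entry.sharers.add(requester)
        return [("DATA", requester)]
    # EXCLUSIVE: forward to the owner and speculatively send the memory copy.
    entry.state = DirState.BUSY
    entry.pending = (requester, "read")
    return [("FWD_READ", entry.owner), ("SPEC_DATA", requester)]


def handle_writeback(entry: DirEntry, writer: int) -> List[Message]:
    """Home-side handling of a writeback of a dirty block."""
    if entry.state is DirState.EXCLUSIVE:
        # The "normal" case: the block becomes unowned and the writer is acked.
        entry.state, entry.owner = DirState.UNOWNED, None
        return [("WB_ACK", writer)]
    if entry.state is DirState.BUSY:
        # The writeback crossed a forwarded request: memory is now up to date,
        # so the home serves the waiting requester itself.
        requester, kind = entry.pending
        entry.pending = None
        if kind == "read":
            entry.state, entry.owner, entry.sharers = DirState.SHARED, None, {requester}
        else:
            entry.state, entry.owner, entry.sharers = DirState.EXCLUSIVE, requester, set()
        return [("DATA", requester)]
    raise ValueError("writeback unexpected in state " + entry.state.name)
```

For example, a read from node 1 on an unowned block grants exclusive ownership; a later read from node 2 moves the entry to BUSY, and a writeback from node 1 arriving in that window leaves the block shared by node 2.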

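
The requester side of Handling Write Requests, where the home reports how many invalidates to expect and the requester cannot continue until every ack has arrived, can be sketched as a counter. The class PendingWrite and its method names are hypothetical. The sketch also assumes that an invalidation ack may overtake the home's reply on the network, which the slides do not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingWrite:
    """Requester-side bookkeeping for one outstanding write (hypothetical model)."""
    expected_acks: Optional[int] = None  # unknown until the home's reply arrives
    acks_seen: int = 0
    have_data: bool = False

    def on_home_reply(self, num_sharers: int) -> None:
        """The home's reply carries the data and the number of invalidates sent."""
        self.expected_acks = num_sharers
        self.have_data = True

    def on_inval_ack(self) -> None:
        """A former sharer acknowledges its invalidation directly to the requester."""
        self.acks_seen += 1

    def complete(self) -> bool:
        """The write may proceed only once the data and all acks are in."""
        return self.have_data and self.acks_seen == self.expected_acks


# Usage: two sharers must be invalidated; one ack arrives before the home's reply.
w = PendingWrite()
w.on_inval_ack()
w.on_home_reply(num_sharers=2)
early = w.complete()   # still waiting for the second ack
w.on_inval_ack()
done = w.complete()    # data present and both acks received
```

Counting acks separately from the expected total lets the requester tolerate either arrival order without the directory having to track completion.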