Berkeley COMPSCI 152 - Memory Consistency and Cache Coherence


Problem P6.1: Sequential Consistency
Problem P6.2: Synchronization Primitives
Problem P6.3: Directory-based Cache Coherence Invalidate Protocols
Problem P6.5: Snoopy Cache Coherent Shared Memory

CS152 Computer Architecture and Engineering
Memory Consistency and Cache Coherence
Assigned April 23
Problem Set #6
Due May 5
http://inst.eecs.berkeley.edu/~cs152/sp09

The problem sets are intended to help you learn the material, and we encourage you to collaborate with other students and to ask questions in discussion sections and office hours to understand the problems. However, each student must turn in their own solutions to the problems.

The problem sets also provide essential background material for the quizzes. The problem sets will be graded primarily on an effort basis, but if you do not work through the problem sets you are unlikely to succeed at the quizzes! We will distribute solutions to the problem sets on the day the problem sets are due to give you feedback. Homework assignments are due at the beginning of class on the due date. Homework will not be accepted once solutions are handed out.

Problem P6.1: Sequential Consistency

For this problem we will be using the following sequences of instructions. These are small programs, each executed on a different processor, each with its own cache and register set. In the following, R is a register and X is a memory location. Each instruction has been named (e.g., B3) to make it easy to write answers.

Assume data in location X is initially 0.

Processor A          Processor B          Processor C
A1: ST X, 1          B1: R := LD X        C1: ST X, 6
A2: R := LD X        B2: R := ADD R, 1    C2: R := LD X
A3: R := ADD R, R    B3: ST X, R          C3: R := ADD R, R
A4: ST X, R          B4: R := LD X        C4: ST X, R
                     B5: R := ADD R, R
                     B6: ST X, R

For each of the questions below, please circle the answer and provide a short explanation assuming the program is executing under the SC model. No points will be given for just circling an answer!

Problem P6.1.A
Can X hold the value 4 after all three threads have completed?
Please explain briefly.

Yes / No

Problem P6.1.B
Can X hold the value 5 after all three threads have completed?

Yes / No

Problem P6.1.C
Can X hold the value 6 after all three threads have completed?

Yes / No

Problem P6.1.D
For this particular program, can a processor that reorders instructions but follows local dependencies produce an answer that cannot be produced under the SC model?

Yes / No

Problem P6.2: Synchronization Primitives

One of the common instruction sequences used for synchronizing several processors is the LOAD RESERVE/STORE CONDITIONAL pair (from now on referred to as the LdR/StC pair). The LdR instruction reads a value from the specified address and sets a local reservation for the address. The StC attempts to write to the specified address provided the local reservation for the address is still held. If the reservation has been cleared, the StC fails and informs the CPU.

Problem P6.2.A
Describe under what events the local reservation for an address is cleared.

Problem P6.2.B
Is it possible to implement the LdR/StC pair in such a way that the memory bus is not affected, i.e., unaware of the addition of these new instructions? Explain.

Problem P6.2.C
Give two reasons why the LdR/StC pair of instructions is preferable over atomic read-test-modify instructions such as the TEST&SET instruction.

Problem P6.2.D
The LdR/StC pair of instructions was conceived in the context of snoopy busses. Do these instructions make sense in our directory-based system in Handout #6? Do they still offer an advantage over atomic read-test-modify instructions in a directory-based system? Please explain.

Problem P6.3: Directory-based Cache Coherence Invalidate Protocols

In this problem we consider a cache-coherence protocol presented in Handout #6.

Problem P6.3.A Protocol Understanding
Consider the situation in which memory sends a FlushReq message to a processor. This can only happen when the memory directory shows that the exclusive copy resides at that site.
The memory processor intends to obtain the most up-to-date data and exclusive ownership, and then supply it to another site that has issued an ExReq. Table H12-1 row 21 specifies the PP behavior when the current cache state is C-pending (not C-exclusive) and a FlushReq is received. Give a simple scenario that causes this situation.

Problem P6.3.B Non-FIFO Network
FIFO message passing is a necessary assumption for the correctness of the protocol. Assume now that the network is non-FIFO. Give a simple scenario that shows how the protocol fails.

Problem P6.3.C Replace
In the current scheme, when a cache wants to voluntarily invalidate a shared cache line, the PP informs the memory of this operation. Describe a simple scenario where there would be an error if the line was “silently dropped.” Can you provide a simple fix for this problem in the protocol? Give such a fix if there is one, or explain why it wouldn’t be a simple fix.

Problem P6.4: Directory-based Cache Coherence Update Protocols

In Handout #6, we examined a cache-coherent distributed shared memory system. Ben wants to convert the directory-based invalidate cache coherence protocol from the handout into an update protocol. He proposes the following scheme.

Caches are write-through, not write allocate. When a processor wants to write to a memory location, it sends a WriteReq to the memory, along with the data word that it wants written. The memory processor updates the memory, and sends an UpdateReq with the new data to each of the sites caching the block, unless that site is the processor performing the store, in which case it sends a WriteRep containing the new data.

If the processor performing the store is caching the block being written, it must wait for the reply from the home site to arrive before storing the new value into its cache.
If the processor performing the store is not caching the block being written, it can proceed after issuing the WriteReq.

Ben wants his protocol to perform well, and so he also proposes to implement silent drops. When a cache line needs to be evicted, it is silently evicted and the memory processor is not notified of this event.

Note that WriteReq and UpdateReq contain data at the word-granularity, and not at the block-granularity. Also note that in the proposed scheme, memory will always have the most up-to-date data and the state C-exclusive is no longer used.

As in the lecture, the
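
Working sketch for Problem P6.1. Under SC, every execution is some interleaving of the three instruction streams that preserves each processor's program order, so the set of possible final values of X can be explored mechanically. The Python sketch below is one way to do that. The opcode names (`st_const`, `ld`, `add_const`, `dbl`, `st`) and the function names are illustrative assumptions, not notation from the problem set, and the search only covers SC (it does not model the reordering asked about in P6.1.D).

```python
# Hypothetical encoding of the P6.1 programs; opcode names are illustrative.
PROGRAMS = {
    "A": [("st_const", 1), ("ld", None), ("dbl", None), ("st", None)],
    "B": [("ld", None), ("add_const", 1), ("st", None),
          ("ld", None), ("dbl", None), ("st", None)],
    "C": [("st_const", 6), ("ld", None), ("dbl", None), ("st", None)],
}


def step(op, arg, reg, x):
    """Apply one instruction; return the new (register, memory X) pair."""
    if op == "st_const":   # ST X, <const>
        return reg, arg
    if op == "st":         # ST X, R
        return reg, reg
    if op == "ld":         # R := LD X
        return x, x
    if op == "add_const":  # R := ADD R, <const>
        return reg + arg, x
    if op == "dbl":        # R := ADD R, R
        return reg + reg, x
    raise ValueError(f"unknown op {op!r}")


def sc_final_values(programs, x0=0):
    """Return every final value of X reachable by some SC interleaving."""
    names = sorted(programs)
    # State: (program counters, per-processor registers, value of X).
    start = ((0,) * len(names), (0,) * len(names), x0)
    seen = {start}
    stack = [start]
    finals = set()
    while stack:
        pcs, regs, x = stack.pop()
        if all(pc == len(programs[n]) for pc, n in zip(pcs, names)):
            finals.add(x)
            continue
        # Any processor that has not finished may issue its next instruction.
        for i, n in enumerate(names):
            if pcs[i] == len(programs[n]):
                continue
            op, arg = programs[n][pcs[i]]
            reg, new_x = step(op, arg, regs[i], x)
            nxt = (pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:],
                   regs[:i] + (reg,) + regs[i + 1:],
                   new_x)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return finals


if __name__ == "__main__":
    print(sorted(sc_final_values(PROGRAMS)))
```

Checking whether a candidate value is in the returned set is a way to sanity-check answers to P6.1.A through P6.1.C; the short written explanation still has to come from a concrete interleaving.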

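Working sketch for Problem P6.2. The LdR/StC behavior described there can be modeled with a small single-threaded simulation. The class and method names below are hypothetical, and the sketch assumes just one clearing event: a successful store by another processor to the reserved address. Real designs may clear reservations on other events as well, which is what P6.2.A asks about.

```python
class ReservationMemory:
    """Toy model of LdR/StC: one reservation per CPU (names are illustrative)."""

    def __init__(self):
        self.mem = {}
        self.reserved = {}  # cpu id -> address currently reserved by that cpu

    def load_reserve(self, cpu, addr):
        # LdR: read the value and set a local reservation for the address.
        self.reserved[cpu] = addr
        return self.mem.get(addr, 0)

    def store(self, cpu, addr, value):
        self.mem[addr] = value
        # Assumption: a store clears every other CPU's reservation on addr.
        for other, a in list(self.reserved.items()):
            if other != cpu and a == addr:
                del self.reserved[other]

    def store_conditional(self, cpu, addr, value):
        # StC: write only if this CPU still holds a reservation on addr.
        if self.reserved.get(cpu) != addr:
            return False
        del self.reserved[cpu]
        self.store(cpu, addr, value)
        return True


# Two CPUs race to increment X; the loser's StC fails and it retries.
m = ReservationMemory()
v0 = m.load_reserve(0, "X")
v1 = m.load_reserve(1, "X")
ok0 = m.store_conditional(0, "X", v0 + 1)        # CPU 0 wins
ok1 = m.store_conditional(1, "X", v1 + 1)        # CPU 1 lost its reservation
v1 = m.load_reserve(1, "X")                      # CPU 1 retries from LdR
ok1_retry = m.store_conditional(1, "X", v1 + 1)
```

The retry keeps the increment atomic without locking the bus for the whole read-modify-write: a failed StC simply sends the loser back to the LdR.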

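Working sketch for Problem P6.4. The home-site half of Ben's proposed update scheme can be written as a single message handler. This is a rough model under stated assumptions: the function name, the dictionary-based directory, and the tuple message format are all hypothetical, and sharers are tracked per address rather than per block for brevity. It models neither silent drops nor network ordering.

```python
def handle_write_req(memory, sharers, writer, addr, word):
    """Home-site handling of a WriteReq (illustrative names and data layout).

    memory  : dict mapping address -> word
    sharers : dict mapping address -> set of sites the directory believes
              are caching it
    Returns the list of (message kind, destination, address, word) to send.
    """
    memory[addr] = word  # write-through: memory is updated first
    out = []
    for site in sorted(sharers.get(addr, ())):
        # The writer, if it is caching the block, gets a WriteRep;
        # every other caching site gets an UpdateReq with the new word.
        kind = "WriteRep" if site == writer else "UpdateReq"
        out.append((kind, site, addr, word))
    return out


memory = {"X": 0}
sharers = {"X": {"P1", "P2"}}

# P1 caches X, so it must wait for the WriteRep before updating its cache.
msgs_cached = handle_write_req(memory, sharers, "P1", "X", 7)

# P3 does not cache X, so it receives no reply and proceeds immediately.
msgs_uncached = handle_write_req(memory, sharers, "P3", "X", 9)
```

Because evictions are silent in the proposed scheme, the `sharers` set can list sites that no longer hold the block, and a real protocol would have to tolerate UpdateReq messages arriving at such sites.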