DOC PREVIEW
Berkeley COMPSCI 152 - Problem Set 6

This preview shows page 1-2-3-4 out of 13 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS152 Computer Architecture and Engineering Memory Consistency and Cache Coherence April 23, 2008 Assigned April 25 Problem Set #6 Due May 6 http://inst.eecs.berkeley.edu/~cs152/sp08 The problem sets are intended to help you learn the material, and we encourage you to collaborate with other students and to ask questions in discussion sections and office hours to understand the problems. However, each student must turn in their own solutions to the problems. The problem sets also provide essential background material for the quizzes. The problem sets will be graded primarily on an effort basis, but if you do not work through the problem sets you are unlikely to succeed at the quizzes! We will distribute solutions to the problem sets on the day the problem sets are due to give you feedback. Homework assignments are due at the beginning of class on the due date. Homework will not be accepted once solutions are handed out.Problem P6.1: Sequential Consistency For this problem we will be using the following sequences of instructions. These are small programs, each executed on a different processor, each with its own cache and register set. In the following R is a register and X is a memory location. Each instruction has been named (e.g., B3) to make it easy to write answers. Assume data in location X is initially 0. Processor A Processor B Processor C A1: ST X, 1 B1: R := LD X C1: ST X, 6 A2: R := LD X B2: R := ADD R, 1 C2: R := LD X A3: R := ADD R, R B3: ST X, R C3: R := ADD R, R A4: ST X, R B4: R:= LD X C4: ST X, R B5: R := ADD R, R B6: ST X, R For each of the questions below, please circle the answer and provide a short explanation assuming the program is executing under the SC model. No points will be given for just circling an answer! Problem P6.1.A Can X hold value of 4 after all three threads have completed? Please explain briefly. Yes / No Problem P6.1.B Can X hold value of 5 after all three threads have completed? Yes / NoProblem P6.1.C Can X hold value of 6 after all three threads have completed? Yes / No Problem P6.1.D For this particular program, can a processor that reorders instructions but follows local dependencies produce an answer that cannot be produced under the SC model? Yes / NoProblem P6.2: Synchronization Primitives One of the common instruction sequences used for synchronizing several processors are the LOAD RESERVE/STORE CONDITIONAL pair (from now on referred to as LdR/StC pair). The LdR instruction reads a value from the specified address and sets a local reservation for the address. The StC attempts to write to the specified address provided the local reservation for the address is still held. If the reservation has been cleared the StC fails and informs the CPU. Problem P6.2.A Describe under what events the local reservation for an address is cleared. Problem P6.2.B Is it possible to implement LdR/StC pair in such a way that the memory bus is not affected, i.e., unaware of the addition of these new instructions? Explain Problem P6.2.C Give two reasons why the LdR/StC pair of instructions is preferable over atomic read-test-modify instructions such as the TEST&SET instruction. Problem P6.2.D LdR/StC pair of instructions were conceived in the context of snoopy busses. Do these instructions make sense in our directory-based system in Handout #12? Do they still offer an advantage over atomic read-test-modify instructions in a directory-based system? Please explain.Problem P6.3: Directory-based Cache Coherence Invalidate Protocols In this problem we consider a cache-coherence protocol presented in Handout #6. Problem P6.3.A Protocol Understanding Consider the situation in which memory sends a FlushReq message to a processor. This can only happen when the memory directory shows that the exclusive copy resides at that site. The memory processor intends to obtain the most up-to-date data and exclusive ownership, and then supply it to another site that has issued a ExReq. Table H12-1 row 21 specifies the PP behavior when the current cache state is C-pending (not C-exclusive) and a FlushReq is received. Give a simple scenario that causes this situation. Problem P6.3.B Non-FIFO Network FIFO message passing is a necessary assumption for the correctness of the protocol. Assume now that the network is non-FIFO. Give a simple scenario that shows how the protocol fails. Problem P6.3.C Replace In the current scheme, when a cache wants to voluntarily invalidate a shared cache line, the PP informs the memory of this operation. Describe a simple scenario where there would be an error, if the line was “silently dropped.” Can you provide a simple fix for this problem in the protocol? Give such a fix if there is one, or explain why it wouldn’t be a simple fix.Problem P6.4: Directory-base Cache Coherence Update Protocols In Handout #6, we examined a cache-coherent distributed shared memory system. Ben wants to convert the directory-based invalidate cache coherence protocol from the handout into an update protocol. He proposes the following scheme. Caches are write-through, not write allocate. When a processor wants to write to a memory location, it sends a WriteReq to the memory, along with the data word that it wants written. The memory processor updates the memory, and sends an UpdateReq with the new data to each of the sites caching the block, unless that site is the processor performing the store, in which case it sends a WriteRep containing the new data. If the processor performing the store is caching the block being written, it must wait for the reply from the home site to arrive before storing the new value into its cache. If the processor performing the store is not caching the block being written, it can proceed after issuing the WriteReq. Ben wants his protocol to perform well, and so he also proposes to implement silent drops. When a cache line needs to be evicted, it is silently evicted and the memory processor is not notified of this event. Note that WriteReq and UpdateReq contain data at the word-granularity, and not at the block-granularity. Also note that in the proposed scheme, memory will always have the most up-to-date data and


View Full Document

Berkeley COMPSCI 152 - Problem Set 6

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Quiz

Quiz

12 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Download Problem Set 6
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Problem Set 6 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Problem Set 6 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?