DOC PREVIEW
Berkeley COMPSCI 152 - Quiz

This preview shows page 1-2-3-4 out of 12 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 12 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Computer Architecture and Engineering CS152 Quiz #6 May 8th, 2008 Professor Krste Asanovic Name:___________________ This is a closed book, closed notes exam. 80 Minutes 10 Pages Notes: • Not all questions are of equal difficulty, so look over the entire exam and budget your time carefully. • Please carefully state any assumptions you make. • Please write your name on every page in the quiz. • You must not discuss a quiz's contents with students who have not yet taken the quiz. If you have inadvertently been exposed to the quiz prior to taking it, you must tell the instructor or TA. • You will get no credit for selecting multiple choice answers without giving explanations if the instructions ask you to explain your choice. Writing name on each sheet ________ 1 Point Question 1 _______ 32 Points Question 2 _______ 29 Points Question 3 _______ 18 Points TOTAL ________ 80 PointsProblem Q6.1: Implementing Directories 32 POINTS Ben Bitdiddle is implementing a directory-based cache coherence invalidate protocol for a 64-processor system. He first builds a smaller prototype with only 4 processors to test out the cache coherence protocol described in Handout #6. To implement the list of sharers, S, kept by home, he maintains a bit vector per cache block to keep track of all the sharers. The bit vector has one bit corresponding to each processor in the system. The bit is set to one if the processor is caching a shared copy of the block, and zero if the processor does not have a copy of the block. For example, if Processors 0 and 3 are caching a shared copy of some data, the corresponding bit vector would be 1001. Problem Q6.1.A 4 POINTS The bit vector worked well for the 4-processor prototype, but when building the actual 64-processor system, Ben discovered that he did not have enough hardware resources. Assume each cache block is 32 bytes. What is the overhead of maintaining the sharing bit vector for a 4-processor system, as a fraction of data storage bits? What is the overhead for a 64-processor system, as a fraction of data storage bits? Overhead for a 4-processor system: ________________________ Overhead for a 64-processor system: _______________________Problem Q6.1.B 8 POINTS Since Ben does not have the resources to keep track of all potential sharers in the 64-processor system, he decides to limit S to keep track of only 1 processor using its 6-bit ID as shown in Figure Q6.1-A (single-sharer scheme). When there is a load (ShReq) request to a shared cache block, Ben invalidates the existing sharer to make room for the new sharer (home sends an InvReq to the existing sharer, the existing sharer sends an InvRep to home, home replaces the exiting sharer's ID with the new sharer's ID and sends a ShRep to the new sharer). 6 Sharer ID Figure Q6.1-A Consider a 64-processor system. To determine the efficiency of a full bit-vector scheme and single-sharer scheme, fill in the number of invalidate-requests that are generated by the protocols for each step in the following two sequences of events. Assume cache block B is uncached initially (R(dir) & dir= ε)) for both sequences. Sequence 1 Full bit-vector scheme # of invalidate-requests single-sharer scheme # of invalidate-requests Processor #0 reads B 0 0 Processor #1 reads B Processor #0 reads B Sequence 2 Full bit-vector scheme # of invalidate-requests single-sharer scheme # of invalidate-requests Processor #0 reads B 0 0 Processor #1 reads B Processor #2 writes BPage 4 of 12 Problem Q6.1.C 8 POINTS Ben thinks that he can improve his original scheme by adding an extra “global bit” to S as shown in Figure Q6.1-B (global-bit scheme). The global bit is set when there is more than 1 processor sharing the data, and zero otherwise. 1 6 0 Sharer ID global Figure Q6.1-B When the global bit is set, home stops keeping track of a specific sharer and assumes that all processors are potential sharers. 1 6 1 XXXXXX global Figure Q6.1-C Consider a 64-processor system. To determine the efficiency of the global-bit scheme, fill in the number of invalidate-requests that are generated for each step in the following two sequences of events. Assume cache block B is uncached initially (i.e., R(dir) & (dir = ε)) for both sequences. Sequence 1 global-bit scheme # of invalidate-requests Processor #0 reads B 0 Processor #1 reads B Processor #0 reads B Sequence 2 global-bit scheme # of invalidate-requests Processor #0 reads B 0 Processor #1 reads B Processor #2 writes BPage 5 of 12 Problem Q6.1.D 12 POINTS Ben decides to modify the protocol from Handout #6 for his global-bit scheme (Problem 4.4.C) for a 64-processor system. Your job is to complete the following table (Table Q6.1-1) for him. Use the same assumptions for the interconnection network, cache states, home directory states, and protocol messages as Handout #6. However, R(dir) (if dir≠ε) now means that there is only one processor sharing the cache data (global bit is unset), and R(all) means the global bit is set. Use k to represent the site that issued the received message. For Tr(dir) and Tw(id) states, use j to represent the site that issued the original protocol request (ShReq/ExReq). No. Current State Message Received Next State Action 1 R(dir) & (dir = ε) ShReq R({k}) ShRep->k 2 R(dir) & (dir = ε) ExReq W(k) ExRep->k 3 R(dir) & (dir ≠ ε) ShReq 4 R(all) ShReq 5 R(dir) & (dir ≠ ε) ExReq 6 R(all) ExReq 7 W(id) ShReq Tw(id) WbReq->id 8 Tr(dir) & (id ∈ dir) InvRep Tr(dir - {id}) nothing 9 Tr(dir) & (dir = {k}) InvRep W(j) ExRep->j 10 Tw(id) FlushRep Table Q6.1-1: Partial List of Home Directory State TransitionsPage 6 of 12 Problem Q6.2: Snoopy Cache Coherent Shared Memory 29 POINTS This problem improves the snoopy cache coherence protocol presented in Handout #7. As a review of that protocol: When multiple shared copies of a modified data block exist, one of the caches owns the current copy of the data block instead of the memory (the owner has the data block in the OS state). When another cache tries to retrieve the data block from memory, the


View Full Document

Berkeley COMPSCI 152 - Quiz

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Download Quiz
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Quiz and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Quiz 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?