DOC PREVIEW
Berkeley COMPSCI 61C - CS 61C Final Exam Review

This preview shows page 1-2-3-4 out of 13 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 61C Spring 2011Final Exam Review featuring Yunsup Lee and Andrew Waterman with a cameo appearance by the talented Vasily VolkovC and MIPS Assembly Programming Following is the definition of the Ackermann function (thanks, Wikipedia), which is defined over the domain of non-negative integers: a) Complete the following C function, ack, so that it computes the Ackermann function correctly. unsigned int ack(unsigned int m, unsigned int n) { unsigned int answer; return answer; }b) Write MIPS assembly that implements ack correctly. m and n are passed in a0 and a1, respectively. We’ve provided the prologue and epilogue for you. s0 and s1 are saved and restored, so you can safely use them if you need to preserve values across function calls. Assume that there are no delay slots. ack: # prologueaddiu $sp, $sp, -12sw $ra, 0($sp)sw $s0, 4($sp)sw $s1, 8($sp) epilogue:lw $ra, 0($sp)lw $s0, 4($sp)lw $s1, 8($sp)addiu $sp, $sp, 12jr $raAMAT The average memory access time (AMAT) is given by: AMAT = Hit TimeL1 + Miss RateL1 x Miss PenaltyL1 (a) Write down the AMAT equation for a two level cache (Use HT for hit time, MR for Local Miss Rate, and MP for Miss Penalty): (b) Given the following specifications: For every 1000 CPU-to-memory references:40 will miss in L1$20 will miss in L2$ L1$ hits in 1 clock cycleL2$ hits in 10 clock cyclesMain memory access is 100 clock cycles There are 1.3 memory references per instructionIdeal CPI (i.e. the CPI assuming a 100% L1$ hit rate) is 1.0 Answer the following questions:(i) What is the local miss rate in the L2$? (ii) What is the global miss rate in the L2$? (iii) What is the AMAT with both levels of cache? (iv) What is the AMAT for a one-level cache without L2$? (v) What is the average memory stalls per reference in the system of question (iii)? (vi) What is the average memory stalls per reference in the system of question (iv)?(vii) What is the average memory stalls per instruction in the system of question (iii)? (viii) What is the average memory stalls per instruction in the system of question (iv)? (ix) What is the performance of the system of question (iv) verses that of question (iii)?MIPS 5 stage pipelineConsider a variation on the canonical MIPS 5 stage pipeline, which doesn’t have any bypass paths, doesn’t have branch delay slots, and resolves branches in the execute stage. For the following code sequence, answer questions (a) - (c). addu $t0, $a0, $a1sllv $t1, $t0, $a2 (a) How many cycles does the processor stall? (b) Assume we have added a forwarding path from the beginning of the writeback stage to the beginning of the execute stage. How many cycles does the processor stall? (c) Assume we have implemented all forwarding paths. How many cycles does the processor stall? For the following code sequence, answer questions (d)-(h). addu $t3, $t4, $t5beq $t0, $t1, labelsrlv $s0, $s1, $s2addu $t4, $t4, $t4 (d) When $t0 != $t1, how many instruction fetches are wasted in the original processor? (e) When $t0 == $t1, how many instruction fetches are wasted in the original processor?(f) Assume the branches are resolved in the decode stage. When $t0 == $t1, how many instruction fetches are wasted? (g) Assume the processor implements branch delay slots, and branches are resolved in the decode stage. Reorganize the code sequence to minimize number of wasted instruction fetches when $t0 == $t1. (h) Given the code sequence in (g), how many instruction fetches are wasted when $t0 == $t1? (i) Given all the optimizations above (full forwarding paths, resolve branches in the decode stage, and branch delay slots), is there a situation where the processor needs to stall? If so, provide an example sequence of instructions that will interlock.FSM Given the following state machine, answer questions (a)-(d).(a) Fill in the following truth table: S1 S0 W NS1 NS0 Z0 0 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 1 1 1 0 X X X1 1 1 X X X (b) Write boolean expressions that implement the NS1, NS0, and Z function. NS1 = NS0 = Z = (c) Implement the finite-state machine by finishing the diagram.(d) Draw the waveform:Virtual Memory Consider a 32-bit MIPS machine like the one we studied in class with 4 GB of physical main memory. The machine’s virtual memory system uses 4 KB pages. PTEs (page table entries) have dirty, valid, readable, writable, & executable bits along with the PPN (physical page number). The machine uses a flat page table (with all entries present) to perform translation in the event of a TLB (translation lookaside buffer) miss. (a) How big is a PTE in bits? Assume that the size is rounded up to the next byte and the extra bits are reserved for use by the OS. Name the bits in the PTE. (b) How large is a page table in this system? (c) Assume that a process must have, at minimum, a single data page and a single code page. Remember to account for the space required by the page tables. What is the maximum number of processes we can run at a time without paging to disk (ignoring the pages used by the operating system itself)?Set-Associative caches (a) Given a 2-way set-associative cache with LRU replacement policy, initially empty, and the following memory access pattern (all are byte addresses): 8 0 4 32 36 8 0 4 16 0 What is the hit rate, miss rate, and what blocks are in the cache after these accesses if the cache has 4 32-bit blocks? hit rate: _________ miss rate: _________ blocks at end (write the appropriate full byte addresses (NOT the tag) in the appropriate blocks, or “EMPTY” if the block is empty): way 0 way 1set 0 set 1 (b) Consider a write-allocate, write-through, 4-way set-associative cache with 16 byte blocks and 64*210 bytes of data bits. Assume a byte-addressed machine with 32-bit addresses.a. Partition the following address and label each field with its name and size in bits. 31 0 (i) Given the address DEADBEEFhex, what is the value of the index, offset, and tag? (Write your answers in hexadecimal.) index = 0x________________________ offset = 0x________________________ tag = 0x________________________(ii) How many cache management bits are there for each block? List them. (iii) How many bits comprise the cache?Digital Logic Consider the following digital circuit, which implements a combinational logic function: a) Fill in the following truth table for the Y


View Full Document

Berkeley COMPSCI 61C - CS 61C Final Exam Review

Documents in this Course
SIMD II

SIMD II

8 pages

Midterm

Midterm

7 pages

Lecture 7

Lecture 7

31 pages

Caches

Caches

7 pages

Lecture 9

Lecture 9

24 pages

Lecture 1

Lecture 1

28 pages

Lecture 2

Lecture 2

25 pages

VM II

VM II

4 pages

Midterm

Midterm

10 pages

Load more
Download CS 61C Final Exam Review
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view CS 61C Final Exam Review and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view CS 61C Final Exam Review 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?