DOC PREVIEW
Berkeley COMPSCI 152 - Caches and the Memory Hierarchy

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Problem 2.1: Cache Access-Time & PerformanceProblem 2.2: Pipelined Cache AccessProblem 2.3: Loop OrderingLoop ALoop BThe number of cache misses for Loop A:_____________________________The number of cache misses for Loop B:_____________________________Data-cache size required for Loop A: ____________________________ cache line(s)Data-cache size required for Loop B: ____________________________ cache line(s)The number of cache misses for Loop A:_____________________________The number of cache misses for Loop B:_____________________________CS152 Computer Architecture andEngineeringCaches and the Memory HierarchyFebruary 16,2008Assigned February 16Problem Set #2Due February 26http://inst.eecs.berkeley.edu/~cs152/sp08The problem sets are intended to help you learn the material, and we encourage you tocollaborate with other students and to ask questions in discussion sections and officehours to understand the problems. However, each student must turn in their own solutionsto the problems.The problem sets also provide essential background material for the quizzes. The problemsets will be graded primarily on an effort basis, but if you do not work through theproblem sets you are unlikely to succeed at the quizzes! We will distribute solutions to theproblem sets on the day the problem sets are due to give you feedback. Homeworkassignments are due at the beginning of class on the due date. Homework will not beaccepted once solutions are handed out.This material will be on Quiz 2, not Quiz 1.Problem 2.1: Cache Access-Time & PerformanceThis problem requires the knowledge of Handout #2 (Cache Implementations) andLecture 7. Please, read these materials before answering the following questions.Ben is trying to determine the best cache configuration for a new processor. He knowshow to build two kinds of caches: direct-mapped caches and 4-way set-associativecaches. The goal is to find the better cache configuration with the given building blocks.He wants to know how these two different configurations affect the clock speed and thecache miss-rate, and choose the one that provides better performance in terms of averagelatency for a load. Problem 2.1.A Access Time: Direct-MappedNow we want to compute the access time of a direct-mapped cache. We use theimplementation shown in Figure H2-A in Handout #2. Assume a 128-KB cache with 8-word (32-byte) cache lines. The address is 32 bits, and the two least significant bits of theaddress are ignored since a cache access is word-aligned. The data output is also 32 bits,and the MUX selects one word out of the eight words in a cache line. Using the delayequations given in Table 2.1-1, fill in the column for the direct-mapped (DM) cache inthe table. In the equation for the data output driver, ‘associativity’ refers to theassociativity of the cache (1 for direct-mapped caches, A for A-way set-associativecaches). Component Delay equation (ps) DM (ps) SA (ps)Decoder 200(# of index bits) + 1000 TagDataMemory array 200log2 (# of rows) + 200log2 (# of bits in a row) + 1000TagDataComparator 200(# of tag bits) + 1000N-to-1 MUX 500log2 N + 1000Buffer driver 2000Data output driver 500(associativity) + 1000Valid output driver 1000Table 2.1-1: Delay of each Cache ComponentWhat is the critical path of this direct-mapped cache for a cache read? What is theaccess time of the cache (the delay of the critical path)? To compute the access time,assume that a 2-input gate (AND, OR) delay is 500 ps. If the CPU clock is 150 MHz,how many CPU cycles does a cache access take? Problem 2.1.B Access Time: Set-AssociativeWe also want to investigate the access time of a set-associative cache using the 4-way set-associative cache in Figure H2-B in Handout #2. Assume the total cache size is still 128-KB (each way is 32-KB), a 4-input gate delay is 1000 ps, and all other parameters (suchas the input address, cache line, etc.) are the same as part 2.1.A. Compute the delay ofeach component, and fill in the column for a 4-way set-associative cache in TableM2.1-1. What is the critical path of the 4-way set-associative cache? What is the access timeof the cache (the delay of the critical path)? What is the main reason that the 4-wayset-associative cache is slower than the direct-mapped cache? If the CPU clock is150 MHz, how many CPU cycles does a cache access take?IndexTagInput Address• • • •• • • •• • • •• • • •• • • •• • • •• • • •• • • •• • • •• • • •• • • •• • • •4×2b-2data words•••S•••T•••S•••T•••S•••T•••S•••TMUXTagDecoderDataDecoderValid BitValidOutput Driver=Buffer DriverComparatorMUX MUX MUX= = =Problem 2.1.C Miss-rate analysisNow Ben is studying the effect of set-associativity on the cache performance. Since henow knows the access time of each configuration, he wants to know the miss-rate of eachone. For the miss-rate analysis, Ben is considering two small caches: a direct-mappedcache with 8 lines with 16 bytes/line, and a 4-way set-associative cache of the same size.For the set-associative cache, Ben tries out two replacement policies – least recently used(LRU) and round robin (FIFO).Ben tests the cache by accessing the following sequence of hexadecimal byte addresses,starting with empty caches. For simplicity, assume that the addresses are only 12 bits.Complete the following tables for the direct-mapped cache and both types of 4-way set-associative caches showing the progression of cache contents as accesses occur (in thetables, ‘inv’ = invalid, and the column of a particular cache line contains the {tag,index}contents of that line). You only need to fill in elements in the table when a value changes. D-mapAddressline in cache hit?L0 L1 L2 L3 L4 L5 L6 L7110inv 11 inv inv inv inv inv inv no13613 no20220 no1A31023612041141A4177301206135D-mapTotal MissesTotal Accesses4-wayAddressLRUline in cache hit?Set 0 Set 1way0 way1 Way2 way3 way0 way1 way2 way3110inv Inv Inv inv 11 inv inv inv no13611 13 no20220 no1A31023612041141A41773012061354-way LRUTotal MissesTotal Accesses4-wayAddressFIFOline in cache hit?Set 0 Set 1way0 way1 way2 way3 way0 way1 way2 way3110inv Inv Inv inv 11 inv inv inv no13613 no20220 no1A31023612041141A41773012061354-way FIFOTotal MissesTotal AccessesProblem 2.1.D Average


View Full Document

Berkeley COMPSCI 152 - Caches and the Memory Hierarchy

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Quiz

Quiz

12 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Download Caches and the Memory Hierarchy
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Caches and the Memory Hierarchy and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Caches and the Memory Hierarchy 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?