Berkeley COMPSCI 152 - Locality and Memory Technology


CS152 Computer Architecture and Engineering
Lecture 20: Locality and Memory Technology
November 9th, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
Lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
11/9/01, UCB Fall 2001, CS152 / Kubiatowicz

Review: Tomasulo With Reorder Buffer
  [Figure: Tomasulo datapath with a reorder buffer. Labeled parts: FP Op Queue, Reorder Buffer (entries ROB1, oldest, through ROB7, newest, each with a Done flag), Registers, Reservation Stations, FP adders, FP multipliers, and paths to and from memory.]

Review: Branch Target Buffer (BTB)
  Address of branch indexes the table to get the prediction AND the branch address (if taken)
  Must check for a branch match now, since can't use the wrong branch address
  Grab predicted PC from the table, since it may take several cycles to compute
  [Figure: PC of instruction in FETCH is compared against a table of Branch PC / Predicted PC entries; the output is "Predict taken or untaken".]

Review: Branch History Table
  [Figure: Branch PC indexes a table of Predictor 0 through Predictor 7; each predictor is a state machine with T (taken) and NT (not taken) transitions.]
  BHT is a table of "Predictors"
    Usually 2-bit, saturating counters
    Indexed by PC address of branch, without tags
  In Fetch stage of branch:
    BTB identifies branch
    Predictor from BHT used to make prediction
  When branch completes:
    Update corresponding Predictor

The Big Picture: Where are We Now?
  The Five Classic Components of a Computer:
    Processor (Control, Datapath), Memory, Input, Output
  Today's Topics:
    Recap last lecture
    Locality and Memory Hierarchy
    Administrivia
    SRAM Memory Technology
    DRAM Memory Technology
    Memory Organization

Technology Trends (from 1st lecture)
            Capacity         Speed (latency)
  Logic     2x in 3 years    2x in 3 years
  DRAM      4x in 3 years    2x in 10 years
  Disk      4x in 3 years    2x in 10 years

  DRAM
  Year    Size      Cycle Time
  1980    64 Kb     250 ns
  1983    256 Kb    220 ns
  1986    1 Mb      190 ns
  1989    4 Mb      165 ns
  1992    16 Mb     145 ns
  1995    64 Mb     120 ns
  Improvement, 1980 to 1995: size 1000:1, cycle time 2:1

Who Cares About the Memory Hierarchy?
  Processor-DRAM Memory Gap (latency)
  [Figure: Performance (log scale, 1 to 1000) versus Time (1980 to 2000).
    CPU curve: 60%/yr (2X/1.5 yr), "Moore's Law"
    DRAM curve: 9%/yr (2X/10 yrs), "Less' Law?"
    Processor-Memory Performance Gap: grows 50% / year]

The Goal: illusion of large, fast, cheap memory
  Fact: Large memories are slow; fast memories are small
  How do we create a memory that is large, cheap and fast (most of the time)?
    Hierarchy
    Parallelism

Memory Hierarchy of a Modern Computer System
  By taking advantage of the principle of locality:
    Present the user with as much memory as is available in the cheapest technology.
    Provide access at the speed offered by the fastest technology.

  Level                                      Speed (ns)                   Size (bytes)
  Registers (in Processor)                   1s                           100s
  On-Chip Cache, Second Level Cache (SRAM)   10s                          Ks
  Main Memory (DRAM)                         100s                         Ms
  Secondary Storage (Disk)                   10,000,000s (10s ms)         Gs
  Tertiary Storage (Tape)                    10,000,000,000s (10s sec)    Ts

Today's Situation: Microprocessor
  Rely on caches to bridge gap
  Microprocessor-DRAM performance gap: time of a full cache miss, in instructions executed
    1st Alpha (7000):   340 ns / 5.0 ns =  68 clks x 2, or 136 instructions
    2nd Alpha (8400):   266 ns / 3.3 ns =  80 clks x 4, or 320 instructions
    3rd Alpha (t.b.d.): 180 ns / 1.7 ns = 108 clks x 6, or 648 instructions
  1/2X latency x 3X clock rate x 3X instr/clock => about 5X

Memory Hierarchy: Why Does it Work? Locality!
  [Figure: Probability of reference plotted over the Address Space, 0 to 2^n - 1.]
  Temporal Locality (Locality in Time):
    Keep most recently accessed data items closer to the processor
  Spatial Locality (Locality in Space):
    Move blocks consisting of contiguous words to the upper levels
  [Figure: data moves To Processor / From Processor through Upper Level Memory (Blk X) and Lower Level Memory (Blk Y).]

Example: 1 KB Direct Mapped Cache with 32 B Blocks
  For a 2^N byte cache:
    The uppermost (32 - N) bits are always the Cache Tag
    The lowest M bits are the Byte Select (Block Size = 2^M)
  Block address fields:
    Bits 31..10    Cache Tag      Example: 0x50 (stored as part of the cache state)
    Bits 9..5      Cache Index    Ex: 0x01
    Bits 4..0      Byte Select    Ex: 0x00
  [Figure: cache array with Valid Bit, Cache Tag, and Cache Data columns. Entry 0 holds Byte 0 to Byte 31, entry 1 (tag 0x50) holds Byte 32 to Byte 63, and so on up to entry 31, which holds Byte 992 to Byte 1023.]

Example: Set Associative Cache
  N-way set associative: N entries for each Cache Index
    N direct mapped caches operate in parallel
  Example: Two-way set associative cache
    Cache Index selects a "set" from the cache
    The two tags in the set are compared to the input in parallel
    Data is selected based on the tag result
  [Figure: two banks, each with Valid, Cache Tag, and Cache Data (Cache Block 0); the Adr Tag is compared against both stored tags; the compare results drive Sel0/Sel1 of a mux that outputs the Cache Block, and are ORed to produce Hit.]

Memory Hierarchy: Terminology
  Hit: data appears in some block in the upper level (example: Block X)
    Hit Rate: the fraction of memory accesses found in the upper level
    Hit Time: time to access the upper level, which consists of
      RAM access time + time to determine hit/miss
  Miss: data needs to be retrieved from a block in the lower level (Block Y)
    Miss Rate = 1 - (Hit Rate)
    Miss Penalty: time to replace a block in the upper level +
      time to deliver the block to the processor
  Hit Time << Miss Penalty
  [Figure: data moves To Processor / From Processor through Upper Level Memory (Blk X) and Lower Level Memory (Blk Y).]

Recap: Cache Performance
  CPU time = (CPU execution clock cycles + Memory stall clock cycles) x clock cycle time
  Memory stall clock cycles = (Reads x Read miss rate x Read miss penalty) + (Writes x Write miss rate x Write miss penalty)
  Memory stall clock cycles = Memory accesses x Miss rate x Miss penalty
  Different measure: AMAT
    Average Memory Access Time (AMAT) = Hit Time + (Miss Rate x Miss Penalty)
  Note: memory hit time is included …
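The cache performance formulas in the recap can be tried out numerically. The following is a minimal Python sketch, not part of the lecture: the function names are made up for illustration, and the workload numbers in the demo (access count, 5% miss rate, 100-cycle penalty, 5 ns clock) are hypothetical values chosen only to exercise the formulas.

```python
def memory_stall_cycles(accesses, miss_rate, miss_penalty):
    """Memory stall clock cycles = Memory accesses x Miss rate x Miss penalty."""
    return accesses * miss_rate * miss_penalty


def memory_stall_cycles_rw(reads, read_miss_rate, read_miss_penalty,
                           writes, write_miss_rate, write_miss_penalty):
    """Same quantity, with reads and writes accounted for separately."""
    return (reads * read_miss_rate * read_miss_penalty
            + writes * write_miss_rate * write_miss_penalty)


def cpu_time(exec_cycles, stall_cycles, clock_cycle_time):
    """CPU time = (execution cycles + memory stall cycles) x clock cycle time."""
    return (exec_cycles + stall_cycles) * clock_cycle_time


def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = Hit Time + Miss Rate x Miss Penalty."""
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    # Hypothetical workload, not taken from the slides.
    stalls = memory_stall_cycles(1_000_000, miss_rate=0.05, miss_penalty=100)
    print("stall cycles:", stalls)
    print("CPU time (s):", cpu_time(2_000_000, stalls, clock_cycle_time=5e-9))
    print("AMAT (cycles):", amat(hit_time=1, miss_rate=0.05, miss_penalty=100))
```

Note that AMAT charges the hit time to every access, which is why the hit time appears outside the miss-rate term.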

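The address breakdown in the 1 KB direct-mapped cache example (tag, cache index, byte select) can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the lecture's implementation: `split_address` and `DirectMappedCache` are hypothetical names, addresses are assumed to be 32-bit byte addresses, and the model tracks only valid bits and tags, not the data bytes.

```python
def split_address(addr, cache_bytes=1024, block_bytes=32):
    """Split a byte address into (tag, index, byte_select).

    For a 2**N byte cache with 2**M byte blocks, the lowest M bits are the
    byte select, the next N - M bits are the cache index, and the remaining
    upper bits are the tag.
    """
    n = cache_bytes.bit_length() - 1  # N
    m = block_bytes.bit_length() - 1  # M
    if (1 << n) != cache_bytes or (1 << m) != block_bytes:
        raise ValueError("cache and block sizes must be powers of two")
    byte_select = addr & (block_bytes - 1)
    index = (addr >> m) & ((1 << (n - m)) - 1)
    tag = addr >> n
    return tag, index, byte_select


class DirectMappedCache:
    """Tag/valid-bit model of a direct-mapped cache (no data storage)."""

    def __init__(self, cache_bytes=1024, block_bytes=32):
        self.cache_bytes = cache_bytes
        self.block_bytes = block_bytes
        entries = cache_bytes // block_bytes
        self.valid = [False] * entries
        self.tags = [0] * entries

    def access(self, addr):
        """Return True on a hit; on a miss, fill the entry and return False."""
        tag, index, _ = split_address(addr, self.cache_bytes, self.block_bytes)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index] = True
            self.tags[index] = tag
        return hit


if __name__ == "__main__":
    # Address built from the slide's example fields: tag 0x50, index 0x01, byte 0x00.
    addr = (0x50 << 10) | (0x01 << 5) | 0x00
    print([hex(field) for field in split_address(addr)])

    # Sequential byte accesses show spatial locality: one miss per 32-byte block.
    cache = DirectMappedCache()
    hits = sum(cache.access(a) for a in range(64))
    print("hits out of 64 sequential accesses:", hits)
```

Two addresses that differ only in their tag bits (for example 0 and 1024 with these sizes) map to the same index and evict each other, which is the conflict behavior that set associativity is meant to reduce.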

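The "Today's Situation" figures express a full cache miss as a number of lost instruction slots: miss latency divided by clock cycle time gives clocks, and clocks times instructions issued per clock gives instructions. A small Python sketch of that arithmetic follows; the function names are hypothetical, and the issue widths (2, 4, 6) are read off the slide's "x 2", "x 4", "x 6" factors. The slide's clock counts are rounded (180 ns / 1.7 ns is closer to 106 than to the 108 shown), so the sketch takes the slide's clock counts as given when reproducing the instruction totals.

```python
def miss_clocks(miss_ns, cycle_ns):
    """Clock cycles spent waiting on one full cache miss."""
    return miss_ns / cycle_ns


def miss_instructions(clocks, instr_per_clock):
    """Instruction issue slots lost during one full cache miss."""
    return clocks * instr_per_clock


if __name__ == "__main__":
    # (name, miss latency ns, cycle time ns, slide's clock count, instr/clock)
    machines = [
        ("1st Alpha (7000)", 340, 5.0, 68, 2),
        ("2nd Alpha (8400)", 266, 3.3, 80, 4),
        ("3rd Alpha (t.b.d.)", 180, 1.7, 108, 6),
    ]
    for name, miss_ns, cycle_ns, slide_clocks, ipc in machines:
        computed = miss_clocks(miss_ns, cycle_ns)
        lost = miss_instructions(slide_clocks, ipc)
        print(f"{name}: computed {computed:.1f} clks, "
              f"slide {slide_clocks} clks, {lost} instructions")
```

The trend is the point of the slide: even though miss latency in nanoseconds falls, the cost of a miss measured in instructions grows from one generation to the next.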