Berkeley COMPSCI 152 - Lecture 21 Memory Systems Caches


4/21/03 ©UCB Spring 2003 CS152 / Kubiatowicz Lec21.1
CS152 Computer Architecture and Engineering
Lecture 21: Memory Systems (recap), Caches
April 21, 2003
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://inst.eecs.berkeley.edu/~cs152/

Lec21.2 Recap: The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
[Figure: Processor (Control, Datapath), Memory, Input, Output]
° Today’s Topics:
  • Recap last lecture
  • Simple caching techniques
  • Many ways to improve cache performance
  • Virtual memory?

Lec21.3 Recap: Who Cares About the Memory Hierarchy?
Processor-DRAM Memory Gap (latency)
[Figure: Performance (log scale, 1 to 1000) vs. Time (1980 to 2000), with CPU and DRAM curves labeled “Moore’s Law” and “Less’ Law?”]
  • µProc: 60%/yr. (2X/1.5 yr)
  • DRAM: 9%/yr. (2X/10 yrs)
  • Processor-Memory Performance Gap: grows 50% / year

Lec21.4 Recap: Memory Hierarchy: Why Does it Work? Locality!
° Temporal Locality (Locality in Time):
  => Keep most recently accessed data items closer to the processor
° Spatial Locality (Locality in Space):
  => Move blocks consisting of contiguous words to the upper levels
[Figure: Upper Level Memory (Blk X) between the processor (“To Processor”, “From Processor”) and Lower Level Memory (Blk Y)]
[Figure: Probability of reference vs. Address Space (0 to 2^n - 1)]

Lec21.5 Recap: Static RAM Cell
6-Transistor SRAM Cell
[Figure: cell with word (row select) line and the bit / bit-bar line pair; pull-up devices marked “replaced with pullup to save area”]
° Write:
  1. Drive bit lines (bit=1, bit-bar=0)
  2. Select row
° Read:
  1. Precharge bit and bit-bar to Vdd or Vdd/2 => make sure equal!
  2. Select row
  3. Cell pulls one line low
  4. Sense amp on column detects difference between bit and bit-bar

Lec21.6 Recap: 1-Transistor Memory Cell (DRAM)
° Write:
  • 1. Drive bit line
  • 2. Select row
° Read:
  • 1. Precharge bit line to Vdd/2
  • 2. Select row
  • 3. Cell and bit line share charges
    - Very small voltage changes on the bit line
  • 4. Sense (fancy sense amp)
    - Can detect changes of ~1 million electrons
  • 5. Write: restore the value
° Refresh
  • 1. Just do a dummy read to every cell.
[Figure: row select line, bit line, trench capacitor]

Lec21.7 Recap: Classical DRAM Organization (square)
[Figure: RAM Cell Array driven by a row decoder (row address) through word (row) select lines; bit (data) lines feed the Sense-Amps, Column Selector & I/O (Column Address, data). Each intersection represents a 1-T DRAM Cell]
° Row and Column Address select 1 bit at a time
° Act of reading refreshes one complete row
  • Sense amps detect slight variations from VDD/2 and amplify them

Lec21.8 Recap: Traditional “asynchronous” DRAM Read Timing
[Figure: 256K x 8 DRAM with address A (9 bits), data D (8 bits), and control inputs RAS_L, CAS_L, WE_L, OE_L]
[Timing diagram: RAS_L, CAS_L, A (Row Address, Col Address, Junk), WE_L, OE_L, D (High Z, Junk, Data Out); marked intervals: DRAM Read Cycle Time, Read Access Time, Output Enable Delay]
° Every DRAM access begins at:
  • The assertion of the RAS_L
  • 2 ways to read: early or late v. CAS
    - Early Read Cycle: OE_L asserted before CAS_L
    - Late Read Cycle: OE_L asserted after CAS_L

Lec21.9 Recap: “Synchronous timing”: SDRAM timing for Lab6
° Micron 128M-bit DRAM (using 2 Meg x 16 bit x 4 bank ver)
  • Row (12 bits), bank (2 bits), column (9 bits)
[Timing diagram: RAS (New Bank), CAS, CAS Latency, Burst READ, End RAS]

Lec21.10 The Art of Memory System Design
[Figure: Workload or Benchmark programs drive Processor -> $ -> MEM]
Memory reference stream: <op,addr>, <op,addr>, <op,addr>, <op,addr>, . . .
op: i-fetch, read, write
Optimize the memory system organization to minimize the average memory access time for typical workloads

Lec21.11 Impact of Memory Hierarchy on Algorithms
° Today CPU time is a function of (ops, cache misses)
° What does this mean to Compilers, Data structures, Algorithms?
  • Quicksort: fastest comparison based sorting algorithm when keys fit in memory
  • Radix sort: also called “linear time” sort. For keys of fixed length and fixed radix, a constant number of passes over the data is sufficient, independent of the number of keys
° “The Influence of Caches on the Performance of Sorting” by A. LaMarca and R.E. Ladner. Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, January, 1997, 370-379.
  • For Alphastation 250, 32 byte blocks, direct mapped L2 2MB cache, 8 byte keys, from 4000 to 4000000

Lec21.12 Quicksort vs. Radix as vary number keys: Instructions
[Figure: Instructions/key (0 to 800) vs. Job size in keys (1000 to 1E+07); series: Quick (Instr/key), Radix (Instr/key)]

Lec21.13 Quicksort vs. Radix as vary number keys: Instrs & Time
[Figure: Instructions and Time per key (0 to 800) vs. Job size in keys (1000 to 1E+07); series: Quick (Instr/key), Radix (Instr/key), Quick (Clocks/key), Radix (Clocks/key)]

Lec21.14 Quicksort vs. Radix as vary number keys: Cache misses
[Figure: Cache misses per key (0 to 5) vs. Job size in keys (1000 to 10000000); series: Quick (miss/key), Radix (miss/key)]
What is proper approach to fast algorithms?

Lec21.15 Example: 1 KB Direct Mapped Cache with 32 B Blocks
° For a 2 ** N byte cache:
  • The uppermost (32 - N) bits are always the Cache Tag
  • The lowest M bits are the Byte Select (Block Size = 2 ** M)
  • On a cache miss, pull in complete “Cache Block” (or “Cache Line”)
[Figure: 32-bit block address split into Cache Tag (bits 31 to 10, Example: 0x50), Cache Index (bits 9 to 5, Ex: 0x01), and Byte Select (bits 4 to 0, Ex: 0x00). The cache array has entries 0 to 31, each holding a Valid Bit, a Cache Tag (stored as part of the cache “state”), and 32 bytes of Cache Data: Byte 0 to Byte 31, Byte 32 to Byte 63, ..., Byte 992 to Byte 1023]

Lec21.16 Set Associative Cache
° N-way set associative: N entries for each Cache Index
  • N direct mapped caches operate in parallel
° Example: Two-way set associative cache
  • Cache Index selects a “set” from the cache
  • The two tags in the set are compared to the input in parallel
  • Data is selected based on the tag result
[Figure: two banks, each with Valid, Cache Tag, and Cache Data (Cache Block 0, ...), both addressed by the Cache Index; two Compare units check the stored tags against Adr Tag, their outputs are ORed into Hit and drive the Mux selects (Sel1, Sel0) that pick the Cache Block]

Lec21.17 Disadvantage of Set Associative Cache
° N-way Set
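
The address breakdown from the 1 KB direct-mapped example and the tag comparison from the two-way set-associative slide can be tried out with a short sketch. This is only an illustration under stated assumptions, not code from the lecture: the names `split_address` and `SetAssociativeCache` are hypothetical, cache and block sizes are assumed to be powers of two, only tags are tracked (no data), and FIFO replacement is chosen arbitrarily because the slides shown here do not name a replacement policy.

```python
def split_address(addr, cache_bytes=1024, block_bytes=32, ways=1):
    """Split a 32-bit address into (tag, index, byte_select).

    Sizes are assumed to be powers of two, as in the slide example.
    """
    sets = cache_bytes // (block_bytes * ways)
    offset_bits = block_bytes.bit_length() - 1   # M, where block size = 2 ** M
    index_bits = sets.bit_length() - 1
    byte_select = addr & (block_bytes - 1)        # lowest M bits
    index = (addr >> offset_bits) & (sets - 1)    # selects the entry or set
    tag = addr >> (offset_bits + index_bits)      # remaining upper bits
    return tag, index, byte_select


class SetAssociativeCache:
    """Tag-only model of an N-way set associative cache (ways=1 is direct mapped)."""

    def __init__(self, cache_bytes=1024, block_bytes=32, ways=2):
        self.geometry = (cache_bytes, block_bytes, ways)
        self.ways = ways
        sets = cache_bytes // (block_bytes * ways)
        # Each set lists the tags currently resident; presence stands in
        # for the valid bit.
        self.sets = [[] for _ in range(sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        tag, index, _ = split_address(addr, *self.geometry)
        entries = self.sets[index]
        if tag in entries:          # hardware compares all tags in the set in parallel
            return True
        if len(entries) == self.ways:
            entries.pop(0)          # FIFO eviction: an assumption of this sketch
        entries.append(tag)
        return False


if __name__ == "__main__":
    # Slide example: tag 0x50, index 0x01, byte select 0x00.
    addr = (0x50 << 10) | (0x01 << 5) | 0x00
    print([hex(part) for part in split_address(addr)])

    # Two addresses that share index 1 but differ in tag.
    trace = [0x14020, 0x18020, 0x14020, 0x18020]
    direct = SetAssociativeCache(ways=1)
    two_way = SetAssociativeCache(ways=2)
    print([direct.access(a) for a in trace])
    print([two_way.access(a) for a in trace])
```

With the same 1 KB capacity, the two conflicting addresses keep evicting each other in the direct-mapped configuration, while the two-way configuration holds both tags in one set and hits on the repeated accesses.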

