CS152 Computer Architecture and Engineering
Lecture 18: Memory and Caches
April 7, 1999
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
4/7/99, UCB Spring 1999, CS152 / Kubiatowicz

Recap: Who Cares About the Memory Hierarchy?
  Processor-DRAM Memory Gap (latency)
  [Figure: performance (log scale, 1 to 1000) versus time, 1980 to 2000.
   CPU curve: Proc 60%/yr (2X/1.5 yr), "Moore's Law".
   DRAM curve: DRAM 9%/yr (2X/10 yrs).
   Processor-Memory Performance Gap: grows 50% / year.]

Recap: Static RAM Cell
  6-Transistor SRAM Cell
  [Figure: cross-coupled cell between the bit and bit-bar lines, gated by word (row select); pull-up transistors annotated "replaced with pullup to save area".]
  Write:
    1. Drive bit lines (bit = 1, bit-bar = 0)
    2. Select row
  Read:
    1. Precharge bit and bit-bar to Vdd
    2. Select row
    3. Cell pulls one line low
    4. Sense amp on column detects difference between bit and bit-bar

Recap: 1-Transistor Memory Cell (DRAM)
  [Figure: one transistor and one capacitor, gated by row select, connected to the bit line.]
  Write:
    1. Drive bit line
    2. Select row
  Read:
    1. Precharge bit line to Vdd
    2. Select row
    3. Cell and bit line share charges
       Very small voltage changes on the bit line
    4. Sense (fancy sense amp)
       Can detect changes of ~1 million electrons
    5. Write: restore the value
  Refresh:
    1. Just do a dummy read to every cell

Recap: Memory Hierarchy of a Modern Computer System
  By taking advantage of the principle of locality:
    Present the user with as much memory as is available in the cheapest technology.
    Provide access at the speed offered by the fastest technology.

  Level                                          Speed (ns)                    Size (bytes)
  Processor (Control, Datapath, Registers)       1s                            100s
  On-Chip Cache, Second Level Cache (SRAM)       10s                           Ks
  Main Memory (DRAM)                             100s                          Ms
  Secondary Storage (Disk)                       10,000,000s (10s ms)          Gs
  Tertiary Storage (Disk)                        10,000,000,000s (10s sec)     Ts

Recap: Memory Systems
  Two Different Types of Locality:
    Temporal Locality (Locality in Time): If an item is referenced, it will tend to be
    referenced again soon.
    Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon.
  By taking advantage of the principle of locality:
    Present the user with as much memory as is available in the cheapest technology.
    Provide access at the speed offered by the fastest technology.
  DRAM is slow but cheap and dense:
    Good choice for presenting the user with a BIG memory system.
  SRAM is fast but expensive and not very dense:
    Good choice for providing the user FAST access time.

The Big Picture: Where are We Now?
  The Five Classic Components of a Computer:
    Processor (Control, Datapath), Memory, Input, Output
  Today's Topics:
    Recap last lecture
    Continue discussion of DRAM
    Cache Review
    Advanced Cache
    Virtual Memory
    Protection
    TLB

Classical DRAM Organization (square)
  [Figure: RAM Cell Array with bit (data) lines running vertically and word (row) select lines running horizontally. The row address feeds a row decoder; the column address feeds the Column Selector & I/O Circuits, which drive data.]
  Each intersection represents a 1-T DRAM Cell.
  Row and Column Address together: Select 1 bit at a time.

DRAM logical organization (4 Mbit)
  [Figure: 11 address lines (A0...A10) feed the Column Decoder and the Memory Array (2,048 x 2,048); Sense Amps & I/O connect to D and Q; detail of a Storage Cell on a Word Line.]
  Square root of bits per RAS/CAS.

DRAM physical organization (4 Mbit)
  [Figure: four blocks (Block 0 ... Block 3), each with a Block Row Dec. 9:512 and I/O circuits; Row Address and Column Address inputs; two groups of 8 I/Os connect to D and Q.]

Logic Diagram of a Typical DRAM
  [Figure: 256K x 8 DRAM with inputs RAS_L, CAS_L, WE_L, OE_L, a 9-bit address bus A, and an 8-bit data bus D.]
  Control Signals (RAS_L, CAS_L, WE_L, OE_L) are all active low.
  Din and Dout are combined (D):
    WE_L is asserted (Low), OE_L is deasserted (High): D serves as the data input pin.
    WE_L is deasserted (High), OE_L is asserted (Low): D is the data output pin.
  Row and column addresses share the same pins (A):
    RAS_L goes low: Pins A are latched in as
    row address.
    CAS_L goes low: Pins A are latched in as column address.
    RAS/CAS are edge-sensitive.

DRAM Read Timing
  Every DRAM access begins at the assertion of RAS_L.
  2 ways to read: early or late v. CAS.
  [Figure: 256K x 8 DRAM (9-bit A, 8-bit D; RAS_L, CAS_L, WE_L, OE_L) with a timing diagram covering two read cycles. A carries Row Address, Col Address, Junk in each cycle. D goes High Z, Junk, Data Out. Marked intervals: DRAM Read Cycle Time, Read Access Time, Output Enable Delay.]
  Early Read Cycle: OE_L asserted before CAS_L.
  Late Read Cycle: OE_L asserted after CAS_L.

DRAM Write Timing
  Every DRAM access begins at the assertion of RAS_L.
  2 ways to write: early or late v. CAS.
  [Figure: 256K x 8 DRAM (9-bit A, 8-bit D; RAS_L, CAS_L, WE_L, OE_L) with a timing diagram covering two write cycles. A carries Row Address, Col Address, Junk in each cycle. D carries Junk, Data In, Junk. Marked intervals: DRAM WR Cycle Time, WR Access Time.]
  Early Wr Cycle: WE_L asserted before CAS_L.
  Late Wr Cycle: WE_L asserted after CAS_L.

Main Memory Performance
  Simple: CPU, Cache, Bus, Memory are the same width (32 bits).
  Wide: CPU/Mux is 1 word; Mux/Cache, Bus, Memory are N words (Alpha: 64 bits & 256 bits).
  Interleaved: CPU, Cache, Bus are 1 word; Memory is N Modules (4 Modules); example is word interleaved.

Main Memory Performance
  [Figure: time line marking Access Time inside a longer Cycle Time.]
  DRAM (Read/Write) Cycle Time >> DRAM (Read/Write) Access Time
    2:1; why?
  DRAM (Read/Write) Cycle Time:
    How frequently can you initiate an access?
    Analogy: A little kid can only ask his father for money on Saturday.
  DRAM (Read/Write) Access Time:
    How quickly will you get what you want once you initiate an access?
    Analogy: As soon as he asks, his father will give him the money.
  DRAM Bandwidth Limitation analogy:
    What happens if he runs out of money on Wednesday?

Increasing Bandwidth: Interleaving
  Access Pattern without Interleaving:
  [Figure: CPU and Memory time line. Start Access for D1; D1 available; Start Access for …
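
The difference between cycle time and access time, and the effect of the word-interleaved four-module organization mentioned above, can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the lecture's own model: the function names, the one-unit issue interval, and the timing values (access time 2, cycle time 4, matching the 2:1 ratio on the slide) are all hypothetical and chosen for illustration.

```python
def bank_of(word_addr, n_banks):
    """Word interleaving: consecutive word addresses map to consecutive banks."""
    return word_addr % n_banks


def time_to_read(n_words, n_banks, t_access, t_cycle, t_issue=1):
    """Time until the last of n_words sequential words is available.

    Assumptions (hypothetical, not from the lecture):
      - a new access can be issued at most once every t_issue time units
      - a bank cannot start a new access until t_cycle after its previous start
      - data is available t_access after an access starts
    """
    bank_free = [0] * n_banks   # earliest time each bank can start again
    start = -t_issue            # lets the first access start at time 0
    finish = 0
    for word in range(n_words):
        bank = bank_of(word, n_banks)
        # Wait for both the issue slot and the target bank to be ready.
        start = max(start + t_issue, bank_free[bank])
        bank_free[bank] = start + t_cycle
        finish = start + t_access
    return finish


if __name__ == "__main__":
    # Hypothetical timing: access time 2 units, cycle time 4 units.
    print("1 bank, 4 words: ", time_to_read(4, 1, t_access=2, t_cycle=4))
    print("4 banks, 4 words:", time_to_read(4, 4, t_access=2, t_cycle=4))
```

With a single bank, every access waits out the full cycle time of the previous one, so cycle time sets the bandwidth. With four banks, consecutive words land in different banks, so the wait overlaps and the issue rate becomes the limit instead.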