Berkeley COMPSCI 252 - Lec 23 – Storage Technology


EECS 252 Graduate Computer Architecture
Lec 23 – Storage Technology

David Culler
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~culler
http://www-inst.eecs.berkeley.edu/~cs252

Classical DRAM Organization (square)
[Figure: RAM cell array. The row address feeds a row decoder that drives the word (row) select lines; the bit (data) lines feed a column selector & I/O circuits block, driven by the column address, that produces the data.]
• Row and column address together:
  – Select 1 bit at a time
• Each intersection represents a 1-T DRAM cell

Review: 1-T Memory Cell (DRAM)
[Figure: 1-T cell with row select line and bit line.]
• Write:
  – 1. Drive bit line
  – 2. Select row
• Read:
  – 1. Precharge bit line to Vdd/2
  – 2. Select row
  – 3. Cell and bit line share charges
    » Very small voltage changes on the bit line
  – 4. Sense (fancy sense amp)
    » Can detect changes of ~1 million electrons
  – 5. Write: restore the value
• Refresh
  – 1. Just do a dummy read to every cell.

DRAM Capacitors: more capacitance in a small area
• Trench capacitors:
  – Logic ABOVE capacitor
  – Gain in surface area of capacitor
  – Better scaling properties
  – Better planarization
• Stacked capacitors:
  – Logic BELOW capacitor
  – Gain in surface area of capacitor
  – 2-dim cross-section quite small

DRAM Read Timing
[Figure: 256K x 8 DRAM with control inputs RAS_L, CAS_L, WE_L, OE_L, a 9-bit address bus A, and an 8-bit data bus D. Timing diagram: A carries the row address, then the column address (junk otherwise); D is high-Z until data out; marked intervals are read access time, output enable delay, and DRAM read cycle time.]
• Every DRAM access begins at:
  – The assertion of RAS_L
  – 2 ways to read: early or late v. CAS
• Early read cycle: OE_L asserted before CAS_L
• Late read cycle: OE_L asserted after CAS_L

4 Key DRAM Timing Parameters
• tRAC: minimum time from RAS line falling to the valid data output.
  – Quoted as the speed of a DRAM when you buy it
  – A typical 4 Mbit DRAM has tRAC = 60 ns
  – Speed of DRAM since on purchase sheet?
• tRC: minimum time from the start of one row access to the start of the next.
  – tRC = 110 ns for a 4 Mbit DRAM with a tRAC of 60 ns
• tCAC: minimum time from CAS line falling to valid data output.
  – 15 ns for a 4 Mbit DRAM with a tRAC of 60 ns
• tPC: minimum time from the start of one column access to the start of the next.
  – 35 ns for a 4 Mbit DRAM with a tRAC of 60 ns

Main Memory Performance
[Figure: time line showing access time as a fraction of cycle time.]
• DRAM (Read/Write) Cycle Time >> DRAM (Read/Write) Access Time
  – About 2:1; why?
• DRAM (Read/Write) Cycle Time:
  – How frequently can you initiate an access?
  – Analogy: A little kid can only ask his father for money on Saturday
• DRAM (Read/Write) Access Time:
  – How quickly will you get what you want once you initiate an access?
  – Analogy: As soon as he asks, his father will give him the money
• DRAM Bandwidth Limitation analogy:
  – What happens if he runs out of money on Wednesday?

Increasing Bandwidth - Interleaving
[Figure: Access pattern without interleaving: CPU and a single memory; start access for D1, D1 available, start access for D2. Access pattern with 4-way interleaving: CPU and memory banks 0 to 3; access bank 0, access bank 1, access bank 2, access bank 3, then bank 0 can be accessed again.]

Main Memory Performance
• Simple:
  – CPU, Cache, Bus, Memory same width (32 bits)
• Interleaved:
  – CPU, Cache, Bus 1 word; Memory N modules (4 modules); example is word interleaved
• Wide:
  – CPU/Mux 1 word; Mux/Cache, Bus, Memory N words (Alpha: 64 bits & 256 bits)

Main Memory Performance
• Timing model
  – 1 to send address,
  – 4 for access time, 10 cycle time, 1 to send data
  – Cache block is 4 words
• Simple M.P. = 4 x (1 + 10 + 1) = 48
• Wide M.P. = 1 + 10 + 1 = 12
• Interleaved M.P. = 1 + 10 + 1 + 3 = 15

  Bank      Word addresses
  Bank 0    0   4   8   12
  Bank 1    1   5   9   13
  Bank 2    2   6   10  14
  Bank 3    3   7   11  15

Avoiding Bank Conflicts
• Lots of banks

    int x[256][512];
    for (j = 0; j < 512; j = j+1)
        for (i = 0; i < 256; i = i+1)
            x[i][j] = 2 * x[i][j];

• Even with 128 banks, since 512 is a multiple of 128, successive word accesses conflict on the same bank
• SW: loop interchange, or declaring the array size not a power of 2 ("array padding")
• HW: prime number of banks
  – bank number = address mod number of banks
  – address within bank = address / number of banks
  – modulo & divide per memory access with prime no. banks?

Finding Bank Number and Address within a Bank
Problem: We want to determine the number of banks, Nb, to use and the number of words to store in each bank, Wb, such that:
• given a word address x, it is easy to find the bank where x will be found, B(x), and the address of x within the bank, A(x);
• for any address x, B(x) and A(x) are unique;
• the number of bank conflicts is minimized.

Solution: We will use the following relations to determine the bank number for x, B(x), and the address of x within the bank, A(x):

    B(x) = x MOD Nb
    A(x) = x MOD Wb

and we will choose Nb and Wb to be co-prime, i.e., there is no prime number that is a factor of both Nb and Wb (this condition is satisfied if we choose Nb to be a prime number that is equal to an integer power of two minus 1).
We can then use the Chinese Remainder Theorem to show that the pair (B(x), A(x)) is always unique.

Fast Bank Number
• Chinese Remainder Theorem
  As long as two sets of integers a_i and b_i follow these rules

    $b_i = x \bmod a_i, \quad 0 \le b_i < a_i, \quad 0 \le x < a_0 \times a_1 \times a_2 \times \cdots$

  and a_i and a_j are co-prime if i ≠ j, then the integer x has only one solution (unambiguous mapping):
  – bank number = b_0, number of banks = a_0
  – address within bank = b_1, number of words in bank = a_1
  – N word addresses 0 to N-1, prime no. of banks, words per bank a power of 2
• Example: 3 banks, Nb = 3, and 8 words per bank, Wb = 8.

                          Seq. Interleaved     Modulo Interleaved
  Bank Number:              0    1    2          0    1    2
  Address within Bank:
             0              0    1    2          0   16    8
             1              3    4    5          9    1   17
             2              6    7    8         18   10    2
             3              9   10   11          3   19   11
             4             12   13   14         12    4   20
             5             15   16   17         21   13    5
             6             18   19   20          6   22   14
             7             21   22   23         15    7   23

Fast Memory Systems: DRAM specific
• Multiple CAS accesses: several names (page mode)
  – Extended Data Out (EDO): 30% faster in page mode
• New DRAMs to address gap; what will they cost, will they survive?
  – RAMBUS: startup company; reinvent DRAM interface
    » Each chip a module vs. slice of memory
    » Short bus between CPU and chips
    » Does own refresh
    » Variable amount of data returned
    » 1 byte / 2 ns (500 MB/s per chip)
  – Synchronous DRAM: 2 banks on chip, a clock signal to DRAM, transfer synchronous to system clock (66 - 150 MHz)
  – Intel claims RAMBUS Direct (16 b wide) is future PC memory
• Niche memory or main memory?
  – e.g., Video RAM for frame buffers, DRAM + fast serial output

Fast Page Mode Operation
• Regular DRAM
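
The miss-penalty arithmetic in the timing model (1 cycle to send the address, 10 cycles of memory time, 1 cycle to send a word, 4-word cache block) can be checked with a minimal sketch. The function and parameter names are hypothetical, not from the lecture, and the interleaved case assumes at least as many banks as words in the block, so that all bank accesses overlap.

```c
#include <assert.h>

/* Hypothetical helpers for the slide's timing model; all times are in cycles.
 * t_addr: send address, t_mem: memory time (10 on the slide),
 * t_xfer: send one word, words: words per cache block. */

/* Simple: one-word-wide memory, each word pays the full round trip. */
unsigned miss_penalty_simple(unsigned t_addr, unsigned t_mem,
                             unsigned t_xfer, unsigned words)
{
    return words * (t_addr + t_mem + t_xfer);
}

/* Wide: the whole block is accessed and transferred in one round trip. */
unsigned miss_penalty_wide(unsigned t_addr, unsigned t_mem, unsigned t_xfer)
{
    return t_addr + t_mem + t_xfer;
}

/* Interleaved: banks are accessed in parallel (assumes banks >= words),
 * then the words return one per transfer slot over a one-word bus. */
unsigned miss_penalty_interleaved(unsigned t_addr, unsigned t_mem,
                                  unsigned t_xfer, unsigned words)
{
    assert(words >= 1);
    return t_addr + t_mem + words * t_xfer;
}
```

With the slide's numbers, `miss_penalty_simple(1, 10, 1, 4)` gives 48, `miss_penalty_wide(1, 10, 1)` gives 12, and `miss_penalty_interleaved(1, 10, 1, 4)` gives 15, matching the three M.P. figures.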

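
The bank-conflict claim for the `x[256][512]` loop can be illustrated by counting how many distinct banks a strided walk touches. This is a sketch under stated assumptions: one array element per memory word, row-major layout (as in C), and bank number = word address mod number of banks. `distinct_banks` and `MAX_BANKS` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BANKS 1024  /* hypothetical upper bound for this sketch */

/* Count the distinct banks touched by `count` word accesses that start at
 * word address `start` and advance by `stride` words each time, using
 * bank number = address mod nbanks. */
unsigned distinct_banks(size_t start, size_t stride, size_t count,
                        unsigned nbanks)
{
    unsigned char seen[MAX_BANKS] = {0};
    unsigned distinct = 0;

    assert(nbanks >= 1 && nbanks <= MAX_BANKS);
    for (size_t k = 0; k < count; k++) {
        unsigned bank = (unsigned)((start + k * stride) % nbanks);
        if (!seen[bank]) {
            seen[bank] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Walking down one column of the array is a stride of 512 words, so `distinct_banks(0, 512, 256, 128)` returns 1: every access lands in the same bank. Padding each row to 513 words (`distinct_banks(0, 513, 256, 128)`) or interchanging the loops so the stride is 1 spreads the accesses over all 128 banks, and a prime bank count such as 127 with the original stride of 512 reaches all 127 banks.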

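
The modulo-interleaved mapping B(x) = x MOD Nb, A(x) = x MOD Wb can be sketched for the slide's example of Nb = 3 banks and Wb = 8 words per bank. The type and function names below are hypothetical. The uniqueness check simply enumerates all 24 addresses, which is a brute-force confirmation of what the Chinese Remainder Theorem guarantees for co-prime Nb and Wb, not a proof.

```c
#include <assert.h>
#include <string.h>

enum { NB = 3, WB = 8 };  /* banks and words per bank from the slide's example */

typedef struct {
    unsigned bank;    /* B(x) = x mod NB */
    unsigned offset;  /* A(x) = x mod WB */
} bank_loc;

/* Map a word address to (bank, address within bank) with two mod operations. */
bank_loc modulo_interleave(unsigned x)
{
    bank_loc loc = { x % NB, x % WB };
    return loc;
}

/* Return 1 if every address in 0 .. NB*WB-1 lands in its own slot, else 0. */
int mapping_is_unique(void)
{
    unsigned char used[NB][WB];

    memset(used, 0, sizeof used);
    for (unsigned x = 0; x < (unsigned)(NB * WB); x++) {
        bank_loc loc = modulo_interleave(x);
        if (used[loc.bank][loc.offset])
            return 0;  /* two addresses collided */
        used[loc.bank][loc.offset] = 1;
    }
    return 1;
}
```

For instance, `modulo_interleave(16)` yields bank 1, offset 0, and `modulo_interleave(9)` yields bank 0, offset 1, which agrees with the modulo-interleaved columns of the table. Because Wb is a power of two, the offset is just the low-order address bits, which is the appeal of this scheme.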