COMP 206: Computer Architecture and Implementation
Montek Singh
Wed., Nov. 19, 2003
Topic: Main Memory (DRAM) Organization

Slide titles:
  Outline
  Basics of DRAM Technology
  DRAM Organization: Fig. 5.29
  Chip Organization
  Chip Organization Example: 64Mb DRAM
  DRAM Access
  DRAM Refresh
  Memory Performance Characteristics
  Improving Performance
  Two Recent Problems
  Increasing Granularity of Memory Systems
  Granularity Example
  Granularity Example (2)
  Improving Memory Chip Performance
  Basic Mode of Operation
  Nibble (or Burst) Mode
  Fast Page Mode
  EDO Mode
  Evolutionary DRAM Architectures
  Revolutionary DRAM Architectures
  Achieving Higher Memory Bandwidth
  Memory Interleaving
  Low-order Bit Interleaving
  Mixed Interleaving
  Other types of Memory

Slide 2: Outline
  - Introduction
  - DRAM Organization
  - Challenges
      - Bandwidth
      - Granularity
      - Performance
  - Reading: HP3 5.8 and 5.9

Slide 3: Basics of DRAM Technology
  DRAM (Dynamic RAM)
  - Used mostly in main memory
  - Capacitor + 1 transistor per bit
  - Needs refresh every 4-8 ms
      - 5% of total time
  - Read is destructive (need for write-back)
  - Access time < cycle time (because of writing back)
  - Density (25-50):1 to SRAM
  - Address lines multiplexed
      - pins are scarce!
  SRAM (Static RAM)
  - Used mostly in caches (I, D, TLB, BTB)
  - 1 flip-flop (4-6 transistors) per bit
  - Read is not destructive
  - Access time = cycle time
  - Speed (8-16):1 to DRAM
  - Address lines not multiplexed
      - high speed of decoding is important

Slide 4: DRAM Organization: Fig. 5.29
  [Figure 5.29 not reproduced]

Slide 5: Chip Organization
  - Chip capacity (= number of data bits) tends to quadruple
      - 1K, 4K, 16K, 64K, 256K, 1M, 4M, …
  - In early designs, each data bit belonged to a different address (x1 organization)
  - Starting with 1Mbit chips, wider chips (4, 8, 16, 32 bits wide) began to appear
      - Advantage: Higher bandwidth
      - Disadvantage: More pins, hence more expensive packaging

Slide 6: Chip Organization Example: 64Mb DRAM

  Organization   Address Bits   Address Pins   Data Pins   Total Pins
  64Mx1          26             13             1           14
  16Mx4          24             12             4           16
  8Mx8           23             12             8           20
  4Mx16          22             11             16          27
  2Mx32          21             11             32          43

  A $2^{n-i} \times 2^{i}$ chip has $\lceil (n-i)/2 \rceil$ address pins and $2^{i}$ data pins.

Slide 7: DRAM Access
  Several steps in DRAM access:
  - Half of the address bits select a row of the square array
  - Whole row of bits is brought out of the memory array into a buffer register (slow, 60-80% of access time)
  - Other half of address bits select one bit of buffer register (with the help of multiplexer), which is read or written
  - Whole row is written back to memory array
  Notes:
  - This organization is demanded by needs of refresh
  - Has advantages: e.g., nibble, page, and static column mode operation

Slide 8: DRAM Refresh
  - Refreshes are performed one row at a time.
  - Consider a 1Mx1 DRAM chip with 190 ns cycle time
  - Time for refreshing all rows, one row at a time:
      $190 \times 10^{-9}\,\text{s} \times 10^{3} = 0.19$ ms < 4-8 ms
  - Refresh complicates operation of memory
      - Refresh control competes with CPU for access to DRAM
      - Each row refreshed once every 4-8 ms irrespective of the use of that row
  - Want to keep refresh fast (< 5-10% of total time)

Slide 9: Memory Performance Characteristics
  - Latency (access time)
      - The time interval between the instant at which the data is called for (READ) or requested to be stored (WRITE), and the instant at which it is delivered or completely stored
  - Cycle time
      - The time between the instant the memory is accessed, and the instant at which it may be validly accessed again
  - Bandwidth (throughput)
      - The rate at which data can be transferred to or from memory
      - Reciprocal of cycle time
      - “Burst mode” bandwidth is of greatest interest
  - Cycle time > access time for conventional DRAM
  - Cycle time < access time in “burst mode” when a sequence of consecutive locations is read or written

Slide 10: Improving Performance
  - Latency can be reduced by
      - Reducing access time of chips
      - Using
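The pin counts on the 64Mb chip organization slide and the refresh estimate on the DRAM refresh slide can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original slides: the function names `pin_counts` and `refresh_time_ms` are hypothetical, the rounding-up of odd address-bit counts is inferred from the table (23 bits use 12 pins), and the row count of 10^3 is the slide's own approximation for a 1Mx1 chip.

```python
def pin_counts(n, i):
    """Pins for a 2^(n-i) x 2^i chip holding 2^n bits in total.

    Hypothetical helper. Row and column addresses are multiplexed over
    the same pins, so only half the address bits (rounded up) need pins.
    Returns (address_pins, data_pins, total_pins); "total" here counts
    only address and data pins, as the slide's table does.
    """
    address_bits = n - i
    address_pins = (address_bits + 1) // 2  # ceil(address_bits / 2)
    data_pins = 2 ** i
    return address_pins, data_pins, address_pins + data_pins


def refresh_time_ms(rows, cycle_time_ns):
    """Time to refresh every row once, one row per memory cycle, in ms."""
    return rows * cycle_time_ns * 1e-6  # ns -> ms


if __name__ == "__main__":
    # 64Mb chip: n = 26, widths x1, x4, x8, x16, x32 (i = 0, 2, 3, 4, 5)
    for i in (0, 2, 3, 4, 5):
        print(f"width x{2 ** i}: {pin_counts(26, i)}")

    # 1Mx1 chip, about 10^3 rows, 190 ns cycle time
    t = refresh_time_ms(1000, 190)
    for interval_ms in (4, 8):
        print(f"refresh takes {t:.2f} ms, "
              f"{100 * t / interval_ms:.1f}% of a {interval_ms} ms interval")
```

Dividing the refresh time by the 4-8 ms refresh interval gives the fraction of memory time spent on refresh, which is the quantity the slides want kept below 5-10%.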