COMP 206: Computer Architecture and Implementation
Lecture Notes: Main Memory (DRAM) Organization

Contents:
  Basics of DRAM Technology
  DRAM Organization: Fig. 5.29
  Chip Organization
  Chip Organization Example: 64Mb DRAM
  DRAM Access
  DRAM Refresh
  Memory Performance Characteristics
  Improving Performance
  Two Recent Problems
  Increasing Granularity of Memory Systems
  Granularity Example
  Granularity Example (2)
  Improving Memory Chip Performance
  Basic Mode of Operation
  Nibble (or Burst) Mode
  Fast Page Mode
  EDO Mode
  Evolutionary DRAM Architectures
  Revolutionary DRAM Architectures
  Achieving Higher Memory Bandwidth
  Memory Interleaving
  Low-order Bit Interleaving
  Mixed Interleaving
  Other Types of Memory

Slide 1: COMP 206: Computer Architecture and Implementation
  Montek Singh
  Wed., Nov. 19, 2003
  Topic: Main Memory (DRAM) Organization

Slide 2: Outline
  - Introduction
  - DRAM Organization
  - Challenges
    - Bandwidth
    - Granularity
    - Performance
  - Reading: HP3 5.8 and 5.9

Slide 3: Basics of DRAM Technology
  DRAM (Dynamic RAM)
  - Used mostly in main memory
  - 1 capacitor + 1 transistor per bit
  - Needs refresh every 4-8 ms (about 5% of total time)
  - Read is destructive (hence the need for write-back)
  - Access time < cycle time (because of the write-back)
  - Density advantage of (25-50):1 over SRAM
  - Address lines are multiplexed (pins are scarce!)

  SRAM (Static RAM)
  - Used mostly in caches (I, D, TLB, BTB)
  - 1 flip-flop (4-6 transistors) per bit
  - Read is not destructive
  - Access time = cycle time
  - Speed advantage of (8-16):1 over DRAM
  - Address lines are not multiplexed (high decoding speed is important)

Slide 4: DRAM Organization: Fig. 5.29
  (Figure 5.29 from HP3, showing the internal organization of a DRAM chip; the figure is not included in this text preview.)
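  The organization the figure depicts, and the access steps listed on the DRAM Access slide further below, can be illustrated with a small model. The following C sketch is not from the original notes: it assumes a 1Mx1 chip whose 2^20 one-bit cells form a 1024 x 1024 square array, with the 20-bit address split into a row half and a column half that share the multiplexed address pins. All type names, sizes, and helper functions are illustrative assumptions.

#include <stdint.h>
#include <string.h>

/* Illustrative model (an assumption, not from the notes): a 1M x 1 DRAM
 * whose 2^20 one-bit cells form a 1024 x 1024 square array.  The 20-bit
 * address arrives in two halves over the same multiplexed address pins:
 * first the row half, then the column half. */
#define ROWS 1024
#define COLS 1024

typedef struct {
    uint8_t cells[ROWS][COLS];  /* the capacitor array, 1 bit per cell     */
    uint8_t row_buffer[COLS];   /* sense-amp/buffer register for one row   */
    int     open_row;           /* row currently held in the buffer, or -1 */
} dram_1m_x1;

/* Step 1: half of the address bits select a row; the whole row is copied
 * into the buffer register (the slow step, 60-80% of the access time). */
static void activate_row(dram_1m_x1 *d, uint32_t addr20)
{
    uint32_t row = (addr20 >> 10) & 0x3FF;      /* upper 10 address bits */
    memcpy(d->row_buffer, d->cells[row], COLS);
    d->open_row = (int)row;
}

/* Step 2: the other half of the address bits pick one bit of the buffer
 * register (the column multiplexer), which is then read or written. */
static uint8_t select_column(const dram_1m_x1 *d, uint32_t addr20)
{
    uint32_t col = addr20 & 0x3FF;              /* lower 10 address bits */
    return d->row_buffer[col];
}

/* Step 3: reading the array was destructive, so the whole row is written
 * back before another row can be opened. */
static void restore_row(dram_1m_x1 *d)
{
    if (d->open_row >= 0)
        memcpy(d->cells[d->open_row], d->row_buffer, COLS);
    d->open_row = -1;
}

/* A complete one-bit read: activate the row, select the column, restore. */
uint8_t dram_read_bit(dram_1m_x1 *d, uint32_t addr20)
{
    activate_row(d, addr20);
    uint8_t bit = select_column(d, addr20);
    restore_row(d);
    return bit;
}

  Because the whole row sits in the buffer register after the first step, consecutive accesses to the same row can skip the slow row activation; this is what the nibble, page, and static-column modes mentioned later exploit.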
Slide 5: Chip Organization
  - Chip capacity (= number of data bits) tends to quadruple from one generation to the next: 1K, 4K, 16K, 64K, 256K, 1M, 4M, ...
  - In early designs, each data bit belonged to a different address (x1 organization)
  - Starting with 1Mbit chips, wider chips (4, 8, 16, 32 bits wide) began to appear
    - Advantage: higher bandwidth
    - Disadvantage: more pins, hence more expensive packaging

Slide 6: Chip Organization Example: 64Mb DRAM

  Organization   Address Bits   Address Pins   Data Pins   Total Pins
  64Mx1          26             13             1           14
  16Mx4          24             12             4           16
  8Mx8           23             12             8           20
  4Mx16          22             11             16          27
  2Mx32          21             11             32          43

  In general, a 2^n x 2^i chip has ceil(n/2) (multiplexed) address pins and 2^i data pins.
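  The pin counts in the table can be checked with a few lines of code. The sketch below is not from the original notes; it assumes a 2^n x w organization needs ceil(n/2) multiplexed address pins plus w data pins, and, like the table, it counts only address and data pins (power, ground, and control pins such as RAS, CAS, and WE are ignored).

#include <stdio.h>

/* Sketch: pin counts for a 64Mb DRAM under different organizations.
 * A (2^n) x w part needs n address bits; with row/column multiplexing
 * only ceil(n/2) address pins are exposed, plus w data pins. */
int main(void)
{
    struct { const char *name; int n, w; } org[] = {
        { "64Mx1", 26,  1 }, { "16Mx4", 24,  4 }, { "8Mx8",  23,  8 },
        { "4Mx16", 22, 16 }, { "2Mx32", 21, 32 },
    };

    printf("%-8s %12s %12s %10s %10s\n",
           "Org", "Addr bits", "Addr pins", "Data pins", "Total");
    for (int k = 0; k < 5; k++) {
        int addr_pins = (org[k].n + 1) / 2;      /* ceil(n/2) */
        int total     = addr_pins + org[k].w;
        printf("%-8s %12d %12d %10d %10d\n",
               org[k].name, org[k].n, addr_pins, org[k].w, total);
    }
    return 0;
}

  Running it reproduces the five rows of the table above.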
Slide 7: DRAM Access
  Several steps in a DRAM access:
  - Half of the address bits select a row of the square array
  - The whole row of bits is brought out of the memory array into a buffer register (slow; 60-80% of the access time)
  - The other half of the address bits select one bit of the buffer register (with the help of a multiplexer), which is read or written
  - The whole row is written back to the memory array
  Notes:
  - This organization is demanded by the needs of refresh
  - It also has advantages, e.g., nibble, page, and static-column mode operation

Slide 8: DRAM Refresh
  - Refreshes are performed one row at a time
  - Consider a 1Mx1 DRAM chip with a 190 ns cycle time
  - Time to refresh the whole chip, one row at a time (about 10^3 rows):
    190 x 10^-9 s x 10^3 = 0.19 ms, well under the 4-8 ms refresh interval
    (roughly 2-5% of the time, consistent with the 5% figure on Slide 3)
  - Refresh complicates the operation of the memory:
    - Refresh control competes with the CPU for access to the DRAM
    - Each row is refreshed once every 4-8 ms irrespective of the use of that row
  - Want to keep refresh fast (< 5-10% of total time)

Slide 9: Memory Performance Characteristics
  - Latency (access time): the time interval between the instant at which the data is called for (READ) or requested to be stored (WRITE), and the instant at which it is delivered or completely stored
  - Cycle time: the time between the instant the memory is accessed and the instant at which it may be validly accessed again
  - Bandwidth (throughput): the rate at which data can be transferred to or from memory
    - Reciprocal of cycle time
    - "Burst mode" bandwidth is of greatest interest
  - Cycle time > access time for conventional DRAM
  - Cycle time < access time in "burst mode", when a sequence of consecutive locations is read or written

Slide 10: Improving Performance
  - Latency can be reduced by:
    - Reducing the access time of chips
    - Using ...

