Cache memory
Prof. Sin-Min Lee
Department of Computer Science

Using T Flip Flop and JK Flip Flop
- Since log2(4) = 2, two flip-flops are needed to implement this FSA.
- Step 1 - Translate the diagram into a state table
- Step 2 - Create maps for T and JK
- Step 3 - Determine the T, J, and K equations
- Step 4 - Draw the resulting diagram

Implementing FSM with No Inputs Using D, T, and JK Flip Flops
- Convert the diagram into a chart
- For the D and T flip-flops
- For the JK flip-flop
- Final implementation

The Processor Picture
- The five classic components of a computer: control, datapath, memory, input, and output (control and datapath together form the processor).
- [Diagram: processor and memory joined by the processor/memory bus, with devices attached via the PCI bus and I/O busses.]

von Neumann Architecture (Princeton)
- [Diagram: memory accessed through an address pointer, an arithmetic logic unit (ALU), and a program counter (PC = PC + 1) moving data/instructions.]
- Featuring deterministic execution.

Cache Memory
- Physical memory is slow (more than 30 times slower than the processor).
- Cache memory uses SRAM chips:
  - Much faster
  - Much more expensive
  - Situated closest to the processor
- Caches can be arranged hierarchically:
  - The L1 cache is incorporated into the processor
  - The L2 cache sits outside it

Cache Memory
- [Photo: level 2 cache memory on the processor board, beside the CPU.]

Cache Memory - Three Levels Architecture
- [Diagram: a 2 GHz processor with an L1 cache (32 kilobytes), an L2 cache (128 kilobytes), and an L3 cache (16 megabytes) under cache control logic, backed by a multi-gigabyte main memory that is large and slow; the levels are marked with relative access costs of 2X, 8X, 16X, and 160X.]
- Featuring really non-deterministic execution.

Cache (1)
- The cache is the first level of the memory hierarchy encountered once the address leaves the CPU.
- Since the principle of locality applies, and taking advantage of locality to improve performance is so popular, the term "cache" is now applied whenever buffering is employed to reuse commonly occurring items.
- We will study caches by trying to answer the four questions for the first level of the memory hierarchy.

Cache (2)
- Every address reference goes first to the cache; if the desired address is not there, we have a cache miss. The contents are fetched from main memory into the indicated CPU register, and they are also saved into the cache.
- If the desired data is in the cache, we have a cache hit, and the data is delivered from the cache at very high speed (low access time).
- Most software exhibits temporal locality of access, meaning the same address is likely to be used again soon; if so, the address will be found in the cache.
- Transfers between main memory and cache occur at the granularity of cache lines or cache blocks of around 32 or 64 bytes (rather than individual bytes or processor words). Burst transfers of this kind receive hardware support and exploit spatial locality of access (future accesses are often to addresses near the previous one).
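To make the hit/miss flow just described concrete, here is a minimal C sketch (not from the slides) of a direct-mapped cache lookup; the 64-byte block size and 256 block frames are assumed values chosen for illustration. Accessing the same address twice demonstrates temporal locality: the first access misses and installs the block, the second hits.

    /* Minimal direct-mapped cache lookup; parameters are assumptions. */
    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_SIZE 64   /* bytes per cache block (assumed)     */
    #define NUM_FRAMES 256  /* block frames in the cache (assumed) */

    typedef struct {
        int      valid;  /* valid bit: does this frame hold real data? */
        uint32_t tag;    /* which memory block currently lives here    */
    } Frame;

    static Frame cache[NUM_FRAMES];

    /* Returns 1 on a cache hit, 0 on a miss (which installs the block). */
    static int cache_access(uint32_t addr)
    {
        uint32_t block = addr / BLOCK_SIZE;   /* block address             */
        uint32_t index = block % NUM_FRAMES;  /* the one frame it may use  */
        uint32_t tag   = block / NUM_FRAMES;  /* remaining high-order bits */

        if (cache[index].valid && cache[index].tag == tag)
            return 1;                         /* hit: served from the cache */

        cache[index].valid = 1;               /* miss: fetch from memory    */
        cache[index].tag   = tag;             /* and save it into the cache */
        return 0;
    }

    int main(void)
    {
        printf("first access:  %s\n", cache_access(0x1234) ? "hit" : "miss");
        printf("second access: %s\n", cache_access(0x1234) ? "hit" : "miss");
        return 0;
    }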
Where can a Block be Placed in the Cache? (1)
- In this example, the cache has eight block frames and the main memory has 32 blocks.

Where can a Block be Placed in the Cache? (2)
- Direct mapped cache: each block has only one place where it can appear in the cache:
  (Block address) MOD (Number of blocks in cache)
- Fully associative cache: a block can be placed anywhere in the cache.
- Set associative cache: a block can be placed in a restricted set of places in the cache, where a set is a group of blocks in the cache:
  (Block address) MOD (Number of sets in cache)
  If there are n blocks in a set, the placement is said to be n-way set associative.

How is a Block Found in the Cache?
- Caches have an address tag on each block frame that gives the block address. The tag is checked against the address coming from the CPU.
- All tags are searched in parallel, since speed is critical.
- A valid bit is appended to every tag to say whether or not the entry contains a valid address.
- Address fields (a sketch of splitting an address into these fields follows this section):
  - Block address:
    - Tag - compared against for a hit
    - Index - selects the set
  - Block offset - selects the desired data from the block
- In a set associative cache, a large index means many sets with few blocks per set; as the index shrinks, the associativity increases.
- In a fully associative cache, the index field does not exist.

Which Block should be Replaced on a Cache Miss?
- When a miss occurs, the cache controller must select a block to be replaced with the desired data.
- A benefit of direct mapping is that this hardware decision is much simplified: there is only one candidate frame.
- There are two primary strategies for fully and set associative caches:
  - Random: candidate blocks are randomly selected. Some systems generate pseudo-random block numbers to get reproducible behavior, which is useful for debugging.
  - LRU (Least Recently Used): to reduce the chance of discarding information that will soon be needed again, the block replaced is the least recently used one. Accesses to blocks must be recorded to implement LRU (see the sketch at the end of this section).

What Happens on a Write?
- There are two basic options when writing to the cache:
  - Write through: the information is written to both the block in the cache and the block in the lower-level memory.
  - Write back: the information is written only to the block in the cache. The modified cache block is written back to the lower-level memory only when it is replaced.
- To reduce the frequency of writing back blocks on replacement, an implementation feature called the dirty bit is commonly used. This bit indicates whether the block is dirty (modified while in the cache) or clean (not modified); a clean block need not be written back on replacement, since the lower-level memory already holds the same information.
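The tag/index/offset decomposition above can be shown with a short sketch. This C fragment is illustrative rather than taken from the slides; the 64-byte block size and 128-set geometry are assumptions, and the code applies the MOD mappings given earlier:

    /* Split an address into tag, set index, and block offset. */
    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_SIZE 64u   /* assumed: 64-byte blocks -> 6 offset bits */
    #define NUM_SETS   128u  /* assumed: 128 sets -> 7 index bits        */

    static void split_address(uint32_t addr)
    {
        uint32_t offset = addr % BLOCK_SIZE; /* selects data within the block */
        uint32_t block  = addr / BLOCK_SIZE; /* block address                 */
        uint32_t index  = block % NUM_SETS;  /* selects the set               */
        uint32_t tag    = block / NUM_SETS;  /* compared against for a hit    */

        printf("addr 0x%08x -> tag 0x%x, set %u, offset %u\n",
               addr, tag, index, offset);
    }

    int main(void)
    {
        split_address(0x00012340);  /* maps to set 13                      */
        split_address(0x00012380);  /* 64 bytes later: the next set, 14    */
        return 0;
    }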
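Lastly, the LRU bookkeeping and the dirty bit can be sketched together for a single set of a write-back cache. This is a minimal illustration under stated assumptions (a 4-way set and a per-block time stamp as the access record), not the lecture's design; real hardware usually uses cheaper LRU approximations:

    /* LRU replacement with a dirty bit, for one 4-way set. */
    #include <stdio.h>
    #include <stdint.h>

    #define WAYS 4

    typedef struct {
        int      valid, dirty;
        uint32_t tag;
        unsigned last_use;   /* recorded access time, used to pick the LRU block */
    } Block;

    static Block frames[WAYS];
    static unsigned now;     /* logical clock; ticks on every access */

    static void access_block(uint32_t tag, int is_write)
    {
        int victim = 0;

        ++now;
        for (int i = 0; i < WAYS; i++) {
            if (frames[i].valid && frames[i].tag == tag) {  /* hit */
                frames[i].last_use = now;
                if (is_write)
                    frames[i].dirty = 1;  /* write back: mark, do not touch memory */
                return;
            }
            /* prefer an empty frame; otherwise remember the least recently used */
            if (!frames[i].valid)
                victim = i;
            else if (frames[victim].valid &&
                     frames[i].last_use < frames[victim].last_use)
                victim = i;
        }

        /* miss: evict the victim, writing it back only if it is dirty */
        if (frames[victim].valid && frames[victim].dirty)
            printf("writing back dirty block, tag 0x%x\n", frames[victim].tag);

        frames[victim] = (Block){ .valid = 1, .dirty = is_write,
                                  .tag = tag, .last_use = now };
    }

    int main(void)
    {
        access_block(0x1, 1);  /* write: block 0x1 becomes dirty          */
        access_block(0x2, 0);
        access_block(0x3, 0);
        access_block(0x4, 0);  /* the set is now full                     */
        access_block(0x5, 0);  /* miss: evicts LRU block 0x1, writes back */
        return 0;
    }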