CS150 Cache Checkpoint
Dan Yeager, Daiwei Li, James Parker

Current Architecture: Block RAMs
● Essentially single write port, dual read ports
● Pros:
  ○ Simple
  ○ Fast (1 cycle)
  ○ Allows us to load programs into IMEM using SW
● Cons:
  ○ Max 5MB block RAM space

Expanding Memory
[Figure: images from DDCA, labeled (DDR2)]

DDR2 - Basic Idea
● Same interface:
  ○ Dual read ports for I & D
  ○ Writes now more complicated (to be discussed)
[Diagram: MIPS, I Cache, D Cache, DDR2]

Concurrency
● Write 0xBAADF00D to ICache 0x00000000
● Write 0xDEADBEEF to DCache 0x00000000
Caches not concurrent!
[Diagram: MIPS, I Cache, D Cache, DDR2]

Concurrency Solution
1. Writing a new program:
  ○ Write to I & D simultaneously
    ■ Hardware enforced
  ○ Concurrency enforced
2. D-cache loads and stores:
  ○ Don't overwrite the program space
    ■ SW enforced
  ○ OK if not concurrent
[Diagram: memory regions labeled D D D D D and I I I I]

BIOS
● Write to I & D Cache simultaneously
● Run BIOS out of block RAM
  ○ Like I/D mem from last checkpoint
[Diagram: MIPS, I Cache, D Cache, DDR2, BIOS]

Memory Map
Device   R/W   Top Nibble   Address Type
D$       R/W   xxx1         Mem
I$       R     xxx1         PC
I$       W     xx1x         Mem
BIOS     R     x1xx         PC/Mem
I/O      R/W   1xxx         Mem
Example Address: yyyy_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx (yyyy = top nibble)
[Diagram: BIOS (RAM), UART, I Cache, D Cache]

Memory Map Examples
Based on the map, what do these ops do?
● PC=0x4000010, MemAddr=0x30000000, SW
● PC=0x1000000, MemAddr=0x80000000, LW
● PC=0x1000000, MemAddr=0x30000000, SW

Full Picture
[Figure: full picture]

Stalls
Remember: speed, cost, size tradeoff
CPU needs to wait (stall) on a cache miss

Stall Implementation
● Implement & verify separately from your cache
  ○ Modularity is crucial for testability
● Test by stalling each instruction
  ○ always @(posedge clk) stall <= ~stall;
● Ensure that no state elements are written while stall is asserted

Final Notes
● Start early
● Please read the spec carefully
● Draw timing diagrams
● Instrument your code
● Survey of CP3 common mistakes
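
The memory map can be read as a one-hot decode of the top address nibble, where each set bit selects a device. The Python sketch below is a behavioral model of that table, not the checkpoint's hardware. The function names are hypothetical. The slide prints its PC values with seven hex digits, so the fetch examples assume the eight-digit forms 0x40000010 and 0x10000000 were intended. The table does not say what a load returns when several bits are set, so the model simply lists every selected device.

```python
def top_nibble(addr):
    """Return bits 31..28 of a 32-bit address."""
    return (addr >> 28) & 0xF


def fetch_targets(pc):
    """Devices selected for an instruction fetch (address type: PC)."""
    n = top_nibble(pc)
    targets = []
    if n & 0b0001:
        targets.append("I$")      # I$ read, xxx1
    if n & 0b0100:
        targets.append("BIOS")    # BIOS read, x1xx
    return targets


def mem_targets(addr, op):
    """Devices selected for a data access (address type: Mem).

    op is "LW" for a load or "SW" for a store.
    """
    n = top_nibble(addr)
    targets = []
    if n & 0b0001:
        targets.append("D$")      # D$ read/write, xxx1
    if (n & 0b0010) and op == "SW":
        targets.append("I$")      # I$ write only, xx1x
    if (n & 0b0100) and op == "LW":
        targets.append("BIOS")    # BIOS read only, x1xx
    if n & 0b1000:
        targets.append("I/O")     # I/O read/write, 1xxx
    return targets


# Data accesses taken from the Memory Map Examples slide.
store_both = mem_targets(0x30000000, "SW")   # nibble 0011
load_io = mem_targets(0x80000000, "LW")      # nibble 1000

# Fetches using the assumed eight-digit PC values.
fetch_bios = fetch_targets(0x40000010)       # nibble 0100
fetch_icache = fetch_targets(0x10000000)     # nibble 0001
```

Under these assumptions, a store to 0x30000000 selects both caches at once, which matches the "write to I & D simultaneously" rule for loading a program, and a load from 0x80000000 goes to I/O.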
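
The stall test from the Stall Implementation slide (toggle stall every cycle and check that no state changes while it is asserted) can be illustrated with a small software model. This Python sketch is hypothetical: the `StallableRegister` class and `run` helper are invented for illustration and stand in for a real state element such as the PC. It shows only the property being tested, not an actual Verilog testbench.

```python
class StallableRegister:
    """Toy PC-like register that must hold its value while stalled."""

    def __init__(self):
        self.value = 0

    def clock(self, stall):
        # Model of one rising clock edge: the write enable is gated by stall.
        if not stall:
            self.value += 4


def run(cycles, stall_pattern):
    """Clock the register and record its value after each unstalled cycle."""
    reg = StallableRegister()
    trace = []
    for cycle in range(cycles):
        stall = stall_pattern(cycle)
        before = reg.value
        reg.clock(stall)
        if stall:
            # No state element may be written while stall is asserted.
            assert reg.value == before
        else:
            trace.append(reg.value)
    return trace


# Reference run with no stalls, then a run that stalls on every other cycle.
unstalled = run(4, lambda cycle: False)
toggled = run(8, lambda cycle: cycle % 2 == 1)
```

If stalling is implemented correctly, the toggled run takes twice as many cycles but produces the same sequence of committed values as the unstalled run.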