CS152 Computer Architecture and Engineering Lecture 23 Virtual Memory (cont) Buses and I/O #1
Recall: Levels of the Memory Hierarchy
Recall: What is virtual memory?
Recall: Three Advantages of Virtual Memory
Recall: Reducing Misses via a “Victim Cache”
Recall: Large Address Spaces: Hierarchical PT
Recall: Inverted Page Tables
Virtual Address and a Cache: Step backward???
Making address translation practical: TLB
TLB organization: include protection
Example: R3000 pipeline includes TLB stages
What is the replacement policy for TLBs?
Page Replacement: Not Recently Used (1-bit LRU, Clock)
Slide 14
Reducing translation time further
Overlapped TLB & Cache Access
Problems With Overlapped TLB Access
Another option: Virtually Addressed Cache
Cache Optimization: Alpha 21064
Administrivia
Administrivia II
What is a bus?
Buses
Advantages of Buses
Disadvantage of Buses
The General Organization of a Bus
Master versus Slave
What is DMA (Direct Memory Access)?
Types of Buses
A Computer System with One Bus: Backplane Bus
A Two-Bus System
A Three-Bus System (+ backside cache)
Main components of Intel Chipset: Pentium II/III
What defines a bus?
Synchronous and Asynchronous Bus
Simple Synchronous Protocol
Typical Synchronous Protocol
Asynchronous Write Transaction
Asynchronous Read Transaction
Multiple Potential Bus Masters: the Need for Arbitration
Arbitration: Obtaining Access to the Bus
The Daisy Chain Bus Arbitrations Scheme
Centralized Parallel Arbitration
Increasing the Bus Bandwidth
Increasing Transaction Rate on Multimaster Bus
PCI Read/Write Transactions
PCI Read Transaction
PCI Write Transaction
PCI Optimizations
Summary #1 / 2: Virtual Memory
Summary #2 / 2

11/21/01 ©UCB Fall 2001 CS152 / Kubiatowicz Lec23.1

CS152
Computer Architecture and Engineering
Lecture 23
Virtual Memory (cont)
Buses and I/O #1
November 21st, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/

Recall: Levels of the Memory Hierarchy

Level            Capacity     Access Time    Cost
CPU Registers    100s Bytes   <10s ns
Cache            K Bytes      10-100 ns      $.01-.001/bit
Main Memory      M Bytes      100ns-1us      $.01-.001
Disk             G Bytes      ms             10^-3 - 10^-4 cents
Tape             infinite     sec-min        10^-6

Between levels       Staging Xfer Unit    Managed by        Size
Registers / Cache    Instr. Operands      prog./compiler    1-8 bytes
Cache / Memory       Blocks               cache cntl        8-128 bytes
Memory / Disk        Pages                OS                512-4K bytes
Disk / Tape          Files                user/operator     Mbytes

Upper Level: faster. Lower Level: larger.

Recall: What is virtual memory?
° Virtual memory => treat memory as a cache for the disk
° Terminology: blocks in this cache are called “Pages”
• Typical size of a page: 1K to 8K
° Page table maps virtual page numbers to physical frames
• “PTE” = Page Table Entry
[Figure: Virtual Address Space mapped onto Physical Address Space. The Virtual Address (V page no., offset) is combined with the Page Table Base Reg to index into the page table, which is located in physical memory. Each entry holds V, Access Rights, and PA. The result is the Physical Address (P page no., offset). The offset field is labeled 10 in both addresses.]

Recall: Three Advantages of Virtual Memory
° Translation:
• Program can be given consistent view of memory, even though physical memory is scrambled
• Makes multithreading reasonable (now used a lot!)
• Only the most important part of program (“Working Set”) must be in physical memory.
• Contiguous structures (like stacks) use only as much physical memory as necessary yet still grow later.
° Protection:
• Different threads (or processes) protected from each other.
• Different pages can be given special behavior (Read Only, Invisible to user programs, etc).
• Kernel data protected from User programs
• Very important for protection from malicious programs => Far more “viruses” under Microsoft Windows
° Sharing:
• Can map same physical page to multiple users (“Shared memory”)

Recall: Reducing Misses via a “Victim Cache”
° How to combine fast hit time of direct mapped yet still avoid conflict misses?
° Add buffer to place data discarded from cache
° Jouppi [1990]: 4-entry victim cache removed 20% to 95% of conflicts for a 4 KB direct mapped data cache
° Used in Alpha, HP machines
[Figure: direct mapped cache (TAGS, DATA) backed by four victim entries, each with "Tag and Comparator" and "One Cache line of Data", leading "To Next Lower Level In Hierarchy".]

Recall: Large Address Spaces: Hierarchical PT
Two-level Page Tables
32-bit address: P1 index (10 bits) | P2 index (10 bits) | page offset (12 bits)
[Figure: the P1 index selects a 4-byte entry in the first-level table, the P2 index selects a 4-byte entry in a second-level table of 1K PTEs, and the page offset selects a byte within a 4KB page.]
° 2 GB virtual address space
° 4 MB of PTE2 (paged, holes)
° 4 KB of PTE1
What about a 48-64 bit address space?

Recall: Inverted Page Tables
[Figure: the Virtual Page is hashed to index a table of (V.Page, P. Frame) entries, and the stored V.Page is compared (=) against the Virtual Page.]
IBM System 38 (AS400) implements 64-bit addresses.
48 bits translated
start of object contains a 12-bit tag
=> TLBs or virtually addressed caches are critical

Virtual Address and a Cache: Step backward???
° Virtual memory seems to be really slow:
• Must access memory on load/store -- even cache hits!
• Worse, if translation not completely in memory, may need to go to disk before hitting in cache!
° Solution: Caching! (surprise!)
• Keep track of most common translations and place them in a “Translation Lookaside Buffer” (TLB)
[Figure: CPU sends VA to Translation, which sends PA to the Cache. On a hit the cache returns data to the CPU, on a miss the request goes to Main Memory.]

Making address translation practical: TLB
° Virtual memory => memory acts like a cache for the disk
° Page table maps virtual page numbers to physical frames
° Translation Look-aside Buffer (TLB) is a cache of translations
[Figure: Virtual Address Space, Page Table, TLB, and Physical Memory Space. A virtual address (page, off) is translated through the TLB to a physical address (frame, off). Example page and frame numbers are shown.]

TLB organization: include protection
° TLB usually organized as fully-associative cache
• Lookup is by Virtual Address
• Returns Physical Address + other info
° Dirty => Page modified (Y/N)?
  Ref => Page touched (Y/N)?
  Valid => TLB entry valid (Y/N)?
  Access => Read? Write?
  ASID => Which User?

Virtual Address   Physical Address   Dirty   Ref   Valid   Access   ASID
0xFA00            0x0003             Y       N     Y       R/W      34
0x0040            0x0010             N       Y     Y       R        0
0x0041            0x0011             N       Y     Y       R        0

Example: R3000 pipeline includes TLB stages

Inst Fetch       Dcd/ Reg    ALU / E.A    Memory    Write Reg
TLB  I-Cache     RF          Operation              WB
                             E.A.         TLB
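
The translation path described on the TLB and two-level page table slides can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not how any real MMU is built: the names `split_va`, `TLBEntry`, and `MMU` are hypothetical, the page table is a nested dictionary for a single address space, the ASID only tags TLB entries, and FIFO eviction stands in for whatever replacement policy the hardware actually uses. The field widths (10-bit P1 index, 10-bit P2 index, 12-bit offset) and the entry fields (Dirty, Ref, Valid, Access, ASID) follow the slides.

```python
from dataclasses import dataclass

PAGE_BITS = 12    # 4 KB pages, as on the two-level page table slide
INDEX_BITS = 10   # 10-bit P1 index and 10-bit P2 index


def split_va(va):
    """Split a 32-bit virtual address into (P1 index, P2 index, page offset)."""
    offset = va & ((1 << PAGE_BITS) - 1)
    p2 = (va >> PAGE_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (va >> (PAGE_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, offset


@dataclass
class TLBEntry:
    vpn: int              # virtual page number (the lookup key)
    frame: int            # physical frame number
    asid: int             # which user owns this translation
    writable: bool        # access rights: False means read-only
    valid: bool = True
    ref: bool = False     # set when the page is touched
    dirty: bool = False   # set when the page is written


class MMU:
    """Hypothetical model: a small fully associative TLB in front of a two-level table."""

    def __init__(self, root, tlb_size=4):
        # root maps P1 index -> {P2 index -> (frame, writable)}; missing keys are holes.
        self.root = root
        self.tlb = []     # every entry is compared on each lookup (fully associative)
        self.tlb_size = tlb_size
        self.hits = 0
        self.misses = 0

    def walk(self, va):
        """Two-level page table walk; raises KeyError for an unmapped page."""
        p1, p2, _ = split_va(va)
        return self.root[p1][p2]

    def translate(self, va, asid, write=False):
        """Return the physical address for va, filling the TLB on a miss."""
        vpn = va >> PAGE_BITS
        offset = va & ((1 << PAGE_BITS) - 1)
        entry = next((e for e in self.tlb
                      if e.valid and e.vpn == vpn and e.asid == asid), None)
        if entry is None:
            self.misses += 1
            frame, writable = self.walk(va)
            entry = TLBEntry(vpn, frame, asid, writable)
            if len(self.tlb) >= self.tlb_size:
                self.tlb.pop(0)   # FIFO eviction, an assumption made for brevity
            self.tlb.append(entry)
        else:
            self.hits += 1
        if write and not entry.writable:
            raise PermissionError("write to read-only page")
        entry.ref = True
        entry.dirty = entry.dirty or write
        return (entry.frame << PAGE_BITS) | offset


# Example mapping: virtual page 0xFA000 -> frame 0x0003 (R/W),
# virtual page 0x00040 -> frame 0x0010 (read-only).
root = {0x3E8: {0x000: (0x0003, True)}, 0x000: {0x040: (0x0010, False)}}
mmu = MMU(root)
print(hex(mmu.translate(0xFA000ABC, asid=34)))   # first access: TLB miss, table walk
print(hex(mmu.translate(0xFA000DEF, asid=34)))   # same page again: TLB hit
```

The second access reuses the cached translation, so only the first pays for the two memory references of the table walk. That saving is the reason the TLB sits in front of the page table.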