11/21/01 ©UCB Fall 2001 CS152 / Kubiatowicz Lec23.1

CS152 Computer Architecture and Engineering
Lecture 23
Virtual Memory (cont)
Buses and I/O #1
November 21st, 2001
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/

Recall: Levels of the Memory Hierarchy

Level           Capacity     Access Time   Cost
CPU Registers   100s Bytes   <10s ns
Cache           K Bytes      10-100 ns     $.01-.001/bit
Main Memory     M Bytes      100ns-1us     $.01-.001
Disk            G Bytes      ms            10^-3 - 10^-4 cents
Tape            infinite     sec-min       10^-6

Staging Xfer Unit (Upper Level: faster; Lower Level: Larger)

Between             Unit              Managed by       Size
Registers / Cache   Instr. Operands   prog./compiler   1-8 bytes
Cache / Memory      Blocks            cache cntl       8-128 bytes
Memory / Disk       Pages             OS               512-4K bytes
Disk / Tape         Files             user/operator    Mbytes

Recall: What is virtual memory?
° Virtual memory => treat memory as a cache for the disk
° Terminology: blocks in this cache are called “Pages”
  • Typical size of a page: 1K-8K
° Page table maps virtual page numbers to physical frames
  • “PTE” = Page Table Entry
[Figure: Virtual Address Space mapped onto Physical Address Space. The Page Table Base Reg and the virtual page number (V page no.) index into the Page Table, which is located in physical memory; each entry holds V, Access Rights, and PA. The PA supplies the physical page number (P page no.); the offset passes through unchanged.]
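The page-table lookup on the "What is virtual memory?" slide can be sketched in a few lines. This is only an illustration under stated assumptions: a 10-bit offset (1K pages, the low end of the 1K-8K range above), a Python dict standing in for the in-memory table, and hypothetical names (`PTE`, `translate`, `PageFault`, `ProtectionFault`) that do not come from the lecture.

```python
from dataclasses import dataclass

OFFSET_BITS = 10               # assumption: 1K pages
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """The requested virtual page has no valid mapping."""


class ProtectionFault(Exception):
    """The access is not allowed by the entry's access rights."""


@dataclass
class PTE:
    valid: bool    # the V bit
    access: str    # hypothetical encoding of Access Rights: "R" or "RW"
    frame: int     # physical page (frame) number


def translate(page_table, vaddr, write=False):
    """Map a virtual address to a physical address through one page table.

    `page_table` is a dict from virtual page number to PTE; a real table
    would be an array in physical memory found via the base register.
    """
    vpn = vaddr >> OFFSET_BITS           # virtual page number indexes the table
    offset = vaddr & (PAGE_SIZE - 1)     # offset is copied through unchanged
    pte = page_table.get(vpn)
    if pte is None or not pte.valid:
        raise PageFault(f"virtual page {vpn:#x} is not in memory")
    if write and "W" not in pte.access:
        raise ProtectionFault(f"virtual page {vpn:#x} is read-only")
    return (pte.frame << OFFSET_BITS) | offset


# Usage: virtual page 2 is mapped to physical frame 5.
table = {2: PTE(valid=True, access="RW", frame=5)}
print(hex(translate(table, (2 << OFFSET_BITS) | 0x123)))  # prints 0x1523
```

Only the page number is translated; the low 10 bits of the address are the same before and after, which is why the page size fixes the split between the two fields.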

Recall: Three Advantages of Virtual Memory
° Translation:
  • Program can be given consistent view of memory, even though physical memory is scrambled
  • Makes multithreading reasonable (now used a lot!)
  • Only the most important part of program (“Working Set”) must be in physical memory.
  • Contiguous structures (like stacks) use only as much physical memory as necessary yet still grow later.
° Protection:
  • Different threads (or processes) protected from each other.
  • Different pages can be given special behavior
    - (Read Only, Invisible to user programs, etc).
  • Kernel data protected from User programs
  • Very important for protection from malicious programs
    => Far more “viruses” under Microsoft Windows
° Sharing:
  • Can map same physical page to multiple users (“Shared memory”)

Recall: Reducing Misses via a “Victim Cache”
° How to combine fast hit time of direct mapped yet still avoid conflict misses?
° Add buffer to place data discarded from cache
° Jouppi [1990]: 4-entry victim cache removed 20% to 95% of conflicts for a 4 KB direct mapped data cache
° Used in Alpha, HP machines
[Figure: cache TAGS and DATA arrays backed by four victim entries, each with a Tag and Comparator and One Cache line of Data, connected To Next Lower Level In Hierarchy.]

Recall: Large Address Spaces: Hierarchical PT
Two-level Page Tables
32-bit address: P1 index (10 bits) | P2 index (10 bits) | page offset (12 bits)
[Figure: P1 index selects a 4-byte entry in the first-level table; P2 index selects a 4-byte entry among 1K PTEs in a second-level table; pages are 4KB.]
° 2 GB virtual address space
° 4 MB of PTE2: paged, holes
° 4 KB of PTE1
What about a 48-64 bit address space?

Recall: Inverted Page Tables
[Figure: the Virtual Page is hashed into a table of (V.Page, P. Frame) entries; a comparator (=) checks the V.Page field.]
IBM System 38 (AS400) implements 64-bit addresses.
48 bits translated
start of object contains a 12-bit tag
=> TLBs or virtually addressed caches are critical

Virtual Address and a Cache: Step backward???
° Virtual memory seems to be really slow:
  • Must access memory on load/store -- even cache hits!
  • Worse, if translation not completely in memory, may need to go to disk before hitting in cache!
° Solution: Caching! (surprise!)
  • Keep track of most common translations and place them in a “Translation Lookaside Buffer” (TLB)
[Figure: CPU sends VA to Translation, which sends PA to the Cache; on a hit the cache returns data, on a miss the request goes to Main Memory.]

Making address translation practical: TLB
° Virtual memory => memory acts like a cache for the disk
° Page table maps virtual page numbers to physical frames
° Translation Look-aside Buffer (TLB) is a cache of translations
[Figure: a virtual address (page, off) from the Virtual Address Space is translated by the TLB, backed by the Page Table, into a physical address (frame, off) in the Physical Memory Space.]

TLB organization: include protection
° TLB usually organized as fully-associative cache
  • Lookup is by Virtual Address
  • Returns Physical Address + other info
° Dirty => Page modified (Y/N)?
  Ref => Page touched (Y/N)?
  Valid => TLB entry valid (Y/N)?
  Access => Read? Write?
  ASID => Which User?

Virtual Address   Physical Address   Dirty   Ref   Valid   Access   ASID
0xFA00            0x0003             Y       N     Y       R/W      34
0x0040            0x0010             N       Y     Y       R        0
0x0041            0x0011             N       Y     Y       R        0

Example: R3000 pipeline includes TLB stages
[Figure: MIPS R3000 Pipeline. Stages: Inst Fetch, Dcd/Reg, ALU / E.A, Memory, Write Reg. Labels within the stages: TLB, I-Cache, RF, Operation, WB, E.A. TLB, D-Cache.]
Virtual address format: ASID (6 bits) | V. Page Number (20 bits) | Offset (12 bits)
Virtual Address Space:
  0xx  User segment (caching based on PT/TLB entry)
  100  Kernel physical space, cached
  101  Kernel physical space, uncached
  11x  Kernel virtual space
Allows context switching among 64 user processes without TLB flush
TLB: 64 entry, on-chip, fully associative, software TLB fault handler

What is the replacement policy for TLBs?
° On a TLB miss, we check the page table for an entry.
  Two architectural possibilities:
  • Hardware “table-walk” (Sparc, among others)
    - Structure of page table must be known to hardware
  • Software “table-walk” (MIPS was one of the first)
    - Lots of flexibility
    - Can be expensive with modern operating systems.
° What if missing Entry is not in page table?
  • This is called a “Page Fault”: the requested virtual page is not in memory
  • Operating system must take over (CS162)
    - pick a page to discard (possibly writing it to disk)
    - start loading the page in from disk
    - schedule some other process to run
° Note: possible that parts of page table are not even in memory (i.e. paged out!)
  • The root of the page table is always “pegged” in memory

Page Replacement: Not Recently Used (1-bit LRU, Clock)
[Figure: the Set of all pages in Memory arranged in a circle swept by two pointers, feeding a Freelist of Free Pages.]
Tail pointer: Mark pages as “not used recently”
Head pointer: Place pages on free list if they are still marked as “not used”. Schedule dirty pages for writing to disk

Page Replacement: Not Recently Used (1-bit LRU, Clock)
Associated with
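The two-pointer scheme on the "Not Recently Used (1-bit LRU, Clock)" slide can be sketched as follows. This is a rough model under stated assumptions, not the lecture's implementation: pages live in a Python list treated as a circle, the freeing pointer reaches each page a fixed `gap` positions after the marking pointer, and the names `Page`, `clock_sweep`, `gap`, and `steps` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    used: bool = True     # reference bit, set again whenever the page is touched
    dirty: bool = False   # page was modified since it was loaded
    free: bool = False    # page has already been placed on the free list


def clock_sweep(pages, mark_pos, gap, steps):
    """Advance both pointers `steps` positions around the circular list.

    Returns (new marking position, page numbers placed on the free list,
    page numbers scheduled for writing to disk).
    """
    n = len(pages)
    if not 1 <= gap < n:
        raise ValueError("gap must be at least 1 and smaller than the page count")
    free_list, writeback = [], []
    for _ in range(steps):
        # Tail pointer in the slide's terms: mark the page "not used recently".
        pages[mark_pos].used = False
        # Head pointer: inspect the page that was marked `gap` steps earlier.
        candidate = pages[(mark_pos - gap) % n]
        if not candidate.used and not candidate.free:
            candidate.free = True
            free_list.append(candidate.number)
            if candidate.dirty:
                # Assumption: the write-back is only recorded here, not performed.
                writeback.append(candidate.number)
        mark_pos = (mark_pos + 1) % n
    return mark_pos, free_list, writeback


# Usage: page 1 is touched between sweeps and survives; page 0 is not.
pages = [Page(i) for i in range(6)]
pos, freed, to_write = clock_sweep(pages, 0, gap=2, steps=2)
pages[1].used = True    # simulate a reference to page 1
pages[0].dirty = True   # page 0 was modified before being marked
pos, freed, to_write = clock_sweep(pages, pos, gap=2, steps=2)
print(freed, to_write)  # prints [0] [0]
```

The distance between the two pointers sets how long a page has to prove it is still in use: a page touched during that window gets its bit set again and is skipped, which is what makes one bit per page a workable approximation of LRU.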