CS152 Computer Architecture and Engineering
Lecture 22: Virtual Memory (continued), Buses
April 21, 2004
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
Lecture slides: http://inst.eecs.berkeley.edu/~cs152/
4/21/04  UCB Spring 2004  CS152 / Kubiatowicz  Lec22

Recap: What is virtual memory?
[Figure: a virtual address (V page no. | offset (10)) indexes into the page table via the Page Table Base Reg; the table is located in physical memory and each entry holds V, Access Rights, and PA; the result is a physical address (P page no. | offset (10)). The virtual address space is mapped onto the physical address space.]
- Virtual memory: treat memory as a cache for the disk
- Terminology: blocks in this cache are called "pages"; typical size of a page is 1K to 8K
- Page table maps virtual page numbers to physical frames
- "PTE" = Page Table Entry

Recap: Implementing Large Page Tables
Two-level page tables
- 32-bit address: P1 index (10) | P2 index (10) | page offset (12)
- 1K PTEs of 4 bytes each per table (4 KB)
- 2 GB virtual address space
- 4 MB of PTE2 (paged, holes)
- 4 KB of PTE1
What about a 48-64 bit address space?

Recap: Making address translation practical: TLB
- Virtual memory: memory acts like a cache for the disk
- Page table maps virtual page numbers to physical frames
- Translation Look-aside Buffer (TLB) is a cache of translations
[Figure: a virtual address (page | off) is looked up in the TLB, backed by the page table, and yields a physical address (frame | off); the virtual address space is mapped onto the physical memory space.]

TLB organization: include protection

  Virtual Address  Physical Address  Dirty  Ref  Valid  Access  ASID
  0xFA00           0x0003            Y      N    Y      R/W     34
  0x0040           0x0010            N      Y    Y      R       0
  0x0041           0x0011            N      Y    Y      R       0

- TLB usually organized as a fully associative cache
- Lookup is by virtual address
- Returns physical address plus other info
- Dirty: page modified (Y/N)?
- Ref: page touched (Y/N)?
- Valid: TLB entry valid (Y/N)?
- Access: read? write?
- ASID: which user?

Example: R3000 pipeline includes TLB stages

  MIPS R3000 pipeline
  Inst Fetch    | Dcd/Reg | ALU / E.A.           | Memory  | Write Reg
  TLB, I-Cache  | RF      | Operation, E.A. TLB  | D-Cache | WB

- TLB: 64 entry, on-chip, fully associative, software TLB fault handler
- Virtual address space: ASID (6) | V. Page Number (20) | Offset (12)
- 0xx: user segment (caching based on PT/TLB entry)
- 100: kernel physical space, cached
- 101: kernel physical space, uncached
- 11x: kernel virtual space
- Allows context switching among 64 user processes without TLB flush

What is the replacement policy for TLBs?
- On a TLB miss, we check the page table for an entry. Two architectural possibilities:
  - Hardware "table walk" (Sparc, among others): structure of the page table must be known to hardware
  - Software "table walk" (MIPS was one of the first): lots of flexibility; can be expensive with modern operating systems
- What if the missing entry is not in the page table?
  - This is called a "page fault": the requested virtual page is not in memory
  - Operating system must take over (CS162):
    - pick a page to discard (possibly writing it to disk)
    - start loading the page in from disk
    - schedule some other process to run
- Note: it is possible that parts of the page table are not even in memory (i.e. paged out)
  - The root of the page table is always "pegged" in memory

Page Replacement: Not Recently Used (1-bit LRU, Clock)
[Figure: the set of all pages in memory, swept by two pointers. Tail pointer: mark pages as "not used recently". Head pointer: place pages on the free list if they are still marked as "not used"; schedule dirty pages for writing to disk. Freed pages go onto the freelist (free pages).]

Page Replacement: Not Recently Used (1-bit LRU, Clock)
- Associated with each page is a "used" flag such that
  used flag = 1 if the page has been referenced in the recent past
            = 0 otherwise
- If replacement is necessary, choose any page frame such that its reference bit is 0. This is a page that has not been referenced in the recent past.
[Figure: page table entries, each with a dirty bit and a used bit.]
- Page fault handler, last replaced pointer (lrp): if replacement is to take place, advance lrp to the next entry (mod table size) until one with a 0 bit is found; this is the target for replacement. As a side effect, all examined PTEs have their used bits set to zero.
- Or search for a page that is both not recently referenced AND not dirty.
- Architecture part: support dirty and used bits in the page table; may need to update the PTE on any instruction fetch, load, or store
- How does the TLB affect this design problem? Software TLB miss?

Reducing translation time further
- As described, TLB lookup is in serial with cache lookup:
[Figure: virtual address (V page no. | offset (10)) goes through TLB lookup (V, Access Rights, PA) to give the physical address (P page no. | offset (10)).]
- Machines with TLBs go one step further: they overlap TLB lookup with cache access
- Works because the lower bits of the result (offset) are available early

Overlapped TLB & Cache Access
- If we do this in parallel, we have to be careful, however:
[Figure: the 32-bit virtual address is split into a 20-bit page number and a 12-bit displacement. The page number goes to the TLB (associative lookup) while the displacement, split as index (10) | 2 | 00, indexes a 4K cache of 1K entries of 4 bytes. The frame number (FN) from the TLB is compared with the FN from the cache to produce Hit/Miss and Data.]
- What if cache size is increased to 8KB?

Problems With Overlapped TLB Access
- Overlapped access only works as long as the address bits used to index into the cache do not change as the result of VA translation
- This usually limits things to small caches, large page sizes, or high n-way set associative caches if you want a large cache
- Example: suppose everything is the same except that the cache is increased to 8 K bytes instead of 4 K:
[Figure: the cache index grows to 11 bits (11 | 2 | 00) while the address is still 20-bit virtual page number | 12-bit displacement. One cache index bit is changed by VA translation, but is needed for cache lookup.]
- Solutions:
  - go to 8K byte page sizes;
  - go to 2-way set associative cache; or
  - SW guarantee VA[13] = PA[13]
[Figure: 2-way set associative cache, two banks of 1K entries of 4 bytes each, indexed with 10 bits.]

Another option: Virtually Addressed Cache
[Figure: the CPU sends the VA to the cache; on a hit, data returns directly. On a miss the VA goes through translation to a PA, which accesses main memory.]
- Only require address translation on cache miss!
- Synonym problem: two different virtual addresses map to the same physical address, so two different cache entries hold data for the same physical address!
- Nightmare for update: must update all cache entries with the same physical address or memory becomes inconsistent

Cache Optimization: Alpha 21064
- TLBs fully associative
- TLB updates in SW ("Priv Arch Libr")
- Separate Instr & Data TLB & Caches
- Caches 8KB direct mapped, write thru
- Critical 8 bytes first
- Prefetch instr. stream buffer
- 4 entry write buffer between D$ & L2$
- 2 MB L2 cache, direct mapped (off-chip)
- 256 bit path to main memory, 4 x 64-bit modules
[Figure labels: Instr, Data, Write Buffer …]
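The two-level page table recap (10-bit P1 index, 10-bit P2 index, 12-bit page offset) can be sketched in a few lines of Python. This is only an illustrative model, not the lecture's code: the names `split_va`, `translate`, and `PageFault` are made up here, and dictionaries stand in for the 1K-entry tables so that "holes" are simply missing keys.

```python
PAGE_OFFSET_BITS = 12  # 4 KB pages
INDEX_BITS = 10        # 1K entries per table level


class PageFault(Exception):
    """Raised when no valid mapping exists (hypothetical name)."""


def split_va(va):
    """Split a 32-bit virtual address into (P1 index, P2 index, offset)."""
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    p2 = (va >> PAGE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (va >> (PAGE_OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, offset


def translate(root, va):
    """Walk a two-level table: root[p1] is a second-level table,
    and second_level[p2] is a physical frame number."""
    p1, p2, offset = split_va(va)
    second_level = root.get(p1)
    if second_level is None:      # hole: no second-level table here
        raise PageFault(hex(va))
    frame = second_level.get(p2)
    if frame is None:             # page not mapped
        raise PageFault(hex(va))
    return (frame << PAGE_OFFSET_BITS) | offset


# Map virtual page (P1=1, P2=3) to physical frame 0x2A.
root = {1: {3: 0x2A}}
pa = translate(root, 0x00403ABC)
```

With that mapping, `0x00403ABC` splits into P1 = 1, P2 = 3, offset = `0xABC`, so the physical address is `0x2AABC`. An unmapped address raises `PageFault`, which is where the operating system would take over.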
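The last-replaced-pointer scan from the Not Recently Used (Clock) slide can be written as a short loop. The sketch below assumes the used bits are kept in a plain Python list and that the caller stores the returned index as the new `lrp`; the function name `clock_select_victim` is hypothetical.

```python
def clock_select_victim(used, lrp):
    """Advance from lrp (mod table size) until an entry with used == 0
    is found and return its index. Every examined entry whose used bit
    was 1 has that bit cleared as a side effect."""
    n = len(used)
    if n == 0:
        raise ValueError("empty page table")
    # One full sweep clears every bit, so at most n + 1 steps are needed.
    for _ in range(n + 1):
        lrp = (lrp + 1) % n
        if used[lrp] == 0:
            return lrp
        used[lrp] = 0
    raise AssertionError("unreachable: a cleared entry must be found")


used_bits = [1, 1, 0, 1, 0]
victim = clock_select_victim(used_bits, lrp=4)
```

Starting from `lrp = 4`, the scan wraps to entries 0 and 1, clears their used bits, and stops at entry 2. Preferring a page that is also not dirty would need a second condition in the loop, which this sketch leaves out.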
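The constraint on overlapped TLB and cache access comes down to one comparison: the bits that index one way of the cache must all lie inside the page offset. The sketch below checks the slide's three solutions under that reading. The helper names are made up for illustration, and bit positions are counted from 0, so the extra index bit of the 8 KB direct-mapped example is reported as bit 12 (presumably the bit the slide writes as VA[13], if it counts from 1).

```python
def log2_exact(n):
    """Return log2(n) for a power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError("size must be a power of two")
    return n.bit_length() - 1


def translated_index_bits(cache_bytes, ways, page_bytes):
    """Bit positions (from 0) used to index the cache that lie above
    the page offset, i.e. bits that VA translation may change."""
    way_bits = log2_exact(cache_bytes // ways)  # index + block offset bits
    page_bits = log2_exact(page_bytes)
    return list(range(page_bits, way_bits))


def overlap_is_safe(cache_bytes, ways, page_bytes):
    """True when no cache index bit depends on the translation."""
    return not translated_index_bits(cache_bytes, ways, page_bytes)


KB = 1024
base_case = overlap_is_safe(4 * KB, 1, 4 * KB)       # 4K cache, 4K pages
bigger_cache = overlap_is_safe(8 * KB, 1, 4 * KB)    # 8K direct mapped
bigger_pages = overlap_is_safe(8 * KB, 1, 8 * KB)    # go to 8K pages
two_way = overlap_is_safe(8 * KB, 2, 4 * KB)         # go to 2-way
```

The 4 KB cache works, the 8 KB direct-mapped cache does not, and either larger pages or 2-way associativity restores the overlap. The third solution, a software guarantee that the offending bit is equal in VA and PA, leaves the hardware check failing by design and so is not modeled here.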