P6 / Linux Memory System
March 23, 2004
15-213, S’04: “The course that gives CMU its Zip!”
class19.ppt

Topics
  P6 address translation
  Linux memory management
  Linux page fault handling
  Memory mapping

Intel P6 (Bob Colwell’s Chip, CMU Alumnus)
  Internal designation for successor to Pentium
    Which had internal designation P5
  Fundamentally different from Pentium
    Out-of-order, superscalar operation
    Designed to handle server applications
    Requires high-performance memory system
  Resulting processors
    PentiumPro (1996)
    Pentium II (1997)
      Incorporated MMX instructions
        » special instructions for parallel processing
      L2 cache on same chip
    Pentium III (1999)
      Incorporated Streaming SIMD Extensions
        » more instructions for parallel processing

P6 Memory System
  Figure labels: processor package, bus interface unit, instruction fetch unit, L1 i-cache, L1 d-cache, inst TLB, data TLB, cache bus, L2 cache, external system bus (e.g. PCI), DRAM.
  32 bit address space
  4 KB page size
  L1, L2, and TLBs
    4-way set associative
  Inst TLB
    32 entries
    8 sets
  Data TLB
    64 entries
    16 sets
  L1 i-cache and d-cache
    16 KB
    32 B line size
    128 sets
  L2 cache
    unified
    128 KB to 2 MB

Review of Abbreviations
  Symbols:
    Components of the virtual address (VA)
      TLBI: TLB index
      TLBT: TLB tag
      VPO: virtual page offset
      VPN: virtual page number
    Components of the physical address (PA)
      PPO: physical page offset (same as VPO)
      PPN: physical page number
      CO: byte offset within cache line
      CI: cache index
      CT: cache tag

IA32 Segmented VM Overview
  (figure)

Overview of P6 Address Translation
  Figure: the CPU issues a 32-bit virtual address (VA) made up of VPN (20 bits) and VPO (12 bits).
    The VPN is split into TLBT (16 bits) and TLBI (4 bits) to access the TLB (16 sets, 4 entries/set).
    On a TLB miss, the VPN is split into VPN1 (10 bits) and VPN2 (10 bits), which select the PDE (located via PDBR) and the PTE in the page tables.
    On a TLB hit, or after the page table lookup, PPN (20 bits) and PPO (12 bits) form the physical address (PA).
    The PA is split into CT (20 bits), CI (7 bits), and CO (5 bits) to access L1 (128 sets, 4 lines/set).
    An L1 hit returns the 32-bit result; an L1 miss goes to L2 and DRAM.

P6 2-level Page Table Structure
  Page directory
    1024 4-byte page directory entries (PDEs) that point to page tables
    One page directory per process
    Page directory must be in memory when its process is running
    Always pointed to by PDBR
  Page tables
    1024 4-byte page table entries (PTEs) that point to pages
    Page tables can be paged in and out
  Figure: a page directory of 1024 PDEs pointing to up to 1024 page tables, each holding 1024 PTEs.

P6 Page Directory Entry (PDE)
  Bit layout (P=1):
    31-12  Page table physical base addr
    11-9   Avail
    8      G
    7      PS
    6      (not used)
    5      A
    4      CD
    3      WT
    2      U/S
    1      R/W
    0      P=1
  Page table physical base address: 20 most significant bits of physical page table address (forces page tables to be 4 KB aligned)
  Avail: these bits are available for system programmers
  G: global page (don’t evict from TLB on task switch)
  PS: page size 4K (0) or 4M (1)
  A: accessed (set by MMU on reads and writes, cleared by software)
  CD: cache disabled (1) or enabled (0)
  WT: write-through or write-back cache policy for this page table
  U/S: user or supervisor mode access
  R/W: read-only or read-write access
  P: page table is present in memory (1) or not (0)
  Bit layout (P=0):
    31-1   Available for OS (page table location in secondary storage)
    0      P=0

P6 Page Table Entry (PTE)
  Bit layout (P=1):
    31-12  Page physical base address
    11-9   Avail
    8      G
    7      0
    6      D
    5      A
    4      CD
    3      WT
    2      U/S
    1      R/W
    0      P=1
  Page base address: 20 most significant bits of physical page address (forces pages to be 4 KB aligned)
  Avail: available for system programmers
  G: global page (don’t evict from TLB on task switch)
  D: dirty (set by MMU on writes)
  A: accessed (set by MMU on reads and writes)
  CD: cache disabled or enabled
  WT: write-through or write-back cache policy for this page
  U/S: user/supervisor
  R/W: read/write
  P: page is present in physical memory (1) or not (0)
  Bit layout (P=0):
    31-1   Available for OS (page location in secondary storage)
    0      P=0

How P6 Page Tables Map Virtual Addresses to Physical Ones
  Virtual address: VPN1 (10 bits), VPN2 (10 bits), VPO (12 bits)
    VPN1: word offset into page directory
    VPN2: word offset into page table
    VPO: word offset into physical and virtual page
  PDBR: physical address of page directory
  PDE: physical address of page table base (if P=1)
  PTE: physical address of page base (if P=1)
  Physical address: PPN (20 bits), PPO (12 bits)

4Mbyte PDE’s
  (figure)

Support for 4Mbyte Pages
  (figure)

Representation of VM Address Space
  Simplified example
    16 page virtual address space
  Flags
    P: Is entry in physical memory?
    M: Has this part of VA space been mapped?
  Figure: a page directory and page tables PT 0, PT 2, and PT 3, whose entries carry (P, M) flag pairs and map Page 0 through Page 15. Legend: Mem Addr, Disk Addr, In Mem, On Disk, Unmapped.

Aside: Inverted Page Tables
  Problem: page-table storage overhead is a function of the virtual address space size.
  Figure labels: VPN, Hash, TAG, PPN, =?, Hit.

P6 TLB Translation
  Figure: same address translation diagram as in "Overview of P6 Address Translation".

P6 TLB
  TLB entry (not all documented, so this is speculative):
    V: indicates a valid (1) or invalid (0) TLB entry
    PD: is this entry a PDE (1) or a PTE (0)?
    tag: disambiguates entries cached in the same set
    PDE/PTE: page directory or page table entry
  Structure of the data TLB:
    16 sets, 4 entries/set
    Entry layout: PDE/PTE (32 bits), Tag (16 bits), PD (1 bit), V (1 bit)
    set 0:   entry entry entry entry
    set 1:   entry entry entry entry
    set 2:   entry entry entry entry
    ...
    set 15:  entry entry entry entry

Translating with the P6 TLB
  1. Partition VPN into TLBT and TLBI.
  2. Is the PTE for VPN cached in set TLBI?
  3. Yes: then build physical address.
  4. No: then read PTE (and PDE if not cached) from memory and build physical address.
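
The TLB translation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field widths (12-bit VPO, 4-bit TLBI, 16-bit TLBT) come from the slides, while the names `split_va` and `tlb_lookup` and the representation of the TLB as a list of 16 dictionaries (tag to PPN) are hypothetical. The 4-entries-per-set limit, the V and PD bits, and replacement are not modeled.

```python
VPO_BITS = 12                 # 4 KB pages: low 12 bits are the page offset
TLBI_BITS = 4                 # 16 sets in the data TLB
NUM_SETS = 1 << TLBI_BITS

def split_va(va):
    """Step 1: split a 32-bit virtual address into (TLBT, TLBI, VPO)."""
    vpo = va & ((1 << VPO_BITS) - 1)
    vpn = (va >> VPO_BITS) & 0xFFFFF      # 20-bit virtual page number
    tlbi = vpn & (NUM_SETS - 1)           # low 4 bits of the VPN select the set
    tlbt = vpn >> TLBI_BITS               # remaining 16 bits are the tag
    return tlbt, tlbi, vpo

def tlb_lookup(tlb, va):
    """Steps 2 and 3: return the physical address on a hit, None on a miss.

    `tlb` is a hypothetical model: a list of NUM_SETS dicts mapping tag -> PPN.
    """
    tlbt, tlbi, vpo = split_va(va)
    ppn = tlb[tlbi].get(tlbt)
    if ppn is None:
        return None                       # step 4 (page table walk) would go here
    return (ppn << VPO_BITS) | vpo        # PPO is the same as VPO

# Usage: cache a translation for VPN 0x12345 and look it up.
tlb = [dict() for _ in range(NUM_SETS)]
tlb[0x5][0x1234] = 0xABCDE
print(hex(tlb_lookup(tlb, 0x12345678)))   # 0xabcde678
```

A dictionary per set keeps the sketch short; it stands in for the tag comparison that the hardware performs across the four entries of the selected set.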
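
The two-level page table lookup and the cache address fields described in the slides can also be sketched in Python. Assumptions are labeled loudly here: the bit widths (10/10/12 for VPN1/VPN2/VPO, 20/7/5 for CT/CI/CO) and the P bit in bit 0 come from the slides, but the function names `walk` and `split_pa`, and the model of physical memory as a dictionary from byte address to 32-bit entry, are hypothetical. 4 MB pages (PS=1) and permission bits are ignored.

```python
P_BIT = 0x1                   # bit 0 of a PDE or PTE: present
BASE_MASK = 0xFFFFF000        # bits 31-12: physical base address

def walk(mem, pdbr, va):
    """Translate `va` with a two-level walk starting at the page directory.

    `mem` is a hypothetical model of physical memory: a dict mapping the byte
    address of each 4-byte entry to its value. Raises LookupError when P=0.
    """
    vpn1 = (va >> 22) & 0x3FF             # index into the page directory
    vpn2 = (va >> 12) & 0x3FF             # index into the page table
    vpo = va & 0xFFF                      # offset within the page
    pde = mem.get(pdbr + 4 * vpn1, 0)     # entries are 4 bytes each
    if not pde & P_BIT:
        raise LookupError("page table not present")
    pte = mem.get((pde & BASE_MASK) + 4 * vpn2, 0)
    if not pte & P_BIT:
        raise LookupError("page not present")
    return (pte & BASE_MASK) | vpo

def split_pa(pa):
    """Split a physical address into (CT, CI, CO) for the L1 cache."""
    co = pa & 0x1F                        # 5 bits: 32 B line
    ci = (pa >> 5) & 0x7F                 # 7 bits: 128 sets
    ct = pa >> 12                         # 20 bits: tag
    return ct, ci, co

# Usage: page directory at 0x1000, one page table at 0x2000, one page at 0x5000.
mem = {0x1000 + 4 * 1: 0x2000 | P_BIT,    # PDE for VPN1 = 1
       0x2000 + 4 * 3: 0x5000 | P_BIT}    # PTE for VPN2 = 3
pa = walk(mem, 0x1000, 0x00403ABC)
print(hex(pa), [hex(f) for f in split_pa(pa)])   # 0x5abc ['0x5', '0x55', '0x1c']
```

Raising an exception when P=0 is a stand-in for the page fault that the hardware would deliver to the operating system.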