Carnegie Mellon

Introduction to Computer Systems
15-213/18-243, spring 2009
19th Lecture, Mar. 26th
Instructors: Gregory Kesden and Markus Püschel

Last Time: Linux VM as Collection of "Areas"

[Figure: task_struct points (via mm) to mm_struct, whose fields pgd and mmap lead to a linked list of vm_area_struct entries (vm_end, vm_start, vm_prot, vm_flags, vm_next). Each entry describes one area of process virtual memory: shared libraries (0x40000000), data (0x0804a020), text (0x08048000).]

- pgd: page directory address
- vm_prot: read/write permissions for this area
- vm_flags: shared with other processes or private to this process

Last Time: Memory Mapping

- Creation of a new VM area
  - Create a new vm_area_struct and page tables for the area
- An area can be backed by (i.e., get its initial values from):
  - A file on disk (copy-on-write possible, e.g., fork())
  - Nothing (e.g., .bss): demand-zero
- Key point: no virtual pages are copied into physical memory until they are referenced
  - Known as "demand paging"

Last Time: P6 Address Translation

[Figure: P6 address translation. The CPU issues a 32-bit virtual address (VA) split into a 20-bit VPN and a 12-bit VPO. The VPN is split into a 16-bit TLBT and a 4-bit TLBI for the TLB (16 sets, 4 entries/set). On a TLB miss, VPN1 and VPN2 (10 bits each) index the page tables, starting from PDBR, to fetch the PDE and PTE. The physical address (PA) is a 20-bit PPN plus a 12-bit PPO, viewed by the L1 cache (128 sets, 4 lines/set) as a 20-bit CT, 7-bit CI, and 5-bit CO. An L1 hit returns the 32-bit result; an L1 miss goes to L2 and DRAM.]

Today

- Performance optimization for VM system
- Dynamic memory allocation

Large Pages

[Figure: address split into VPN/VPO and PPN/PPO for large pages versus standard pages; the field widths on the slide read 10, 12 and 20, 12.]

- 4 MB on 32-bit, 2 MB on 64-bit
- Simplify address translation
- Useful for programs with very large, contiguous working sets
  - Reduces compulsory TLB misses
- How to use (Linux)
  - hugetlbfs support (since at least 2.6.16)
  - Use libhugetlbfs: {m,c,re}alloc replacements

Buffering: Example MMM

- Blocked for cache: c = a * b + c, with block size B x B
- Assume blocking for the L2 cache
  - say, 512 KB = 2^19 B = 2^16 doubles = C
  - 3B^2 < C means B ≈ 150

Buffering: Example MMM (cont.)

- But: look at one iteration of c = a * b + c
  - assume: 4 KB = 512 doubles
  - block size B = 150
  - each row is used O(B) times, but every time there are O(B^2) ops between
- Consequence
  - Each row is on a different page
  - More rows than TLB entries: TLB thrashing
- Solution: buffering = copy the block to contiguous memory
  - O(B^2) cost for O(B^3) operations

Today

- Performance optimization for VM system
- Dynamic memory allocation

Process Memory Image

[Figure: process memory image, from high to low addresses: kernel virtual memory (memory protected from user code), stack (%esp), the "brk" pointer at the top of the run-time heap (via malloc), uninitialized data (.bss), initialized data (.data), program text (.text), address 0.]

- Allocators request additional heap memory from the kernel using the sbrk() function:
  error = sbrk(amt_more)

Why Dynamic Memory Allocation?

- Sizes of needed data structures may only be known at runtime

Dynamic Memory Allocation

[Figure: layers, top to bottom: Application, Dynamic Memory Allocator, Heap Memory.]

- Memory allocator?
  - VM hardware and kernel allocate pages
  - Application objects are typically smaller
  - Allocator manages objects within pages
- Explicit vs. implicit memory allocator
  - Explicit: application allocates and frees space
    - In C: malloc() and free()
  - Implicit: application allocates, but does not free space
    - In Java, ML, Lisp: garbage collection
- Allocation
  - A memory allocator doles out memory blocks to the application
  - A "block" is a contiguous range of bytes of any size, in this context
- Today: simple explicit memory allocation

Malloc Package

#include <stdlib.h>

- void *malloc(size_t size)
  - Successful:
    - Returns a pointer to a memory block of at least size bytes, (typically) aligned to an 8-byte boundary
    - If size == 0, returns NULL (the C standard leaves this case implementation-defined; some implementations return a unique non-NULL pointer instead)
  - Unsuccessful: returns NULL (0) and sets errno
- void free(void *p)
  - Returns the block pointed at by p to the pool of available memory
  - p must come from a previous call to malloc() or realloc()
- void *realloc(void *p, size_t size)
  - Changes the size of block p and returns a pointer to the new block
  - Contents of the new block are unchanged up to the min of old and new size
  - Old block has been free()'d (logically, if new != old)

Malloc Example

    #include <stdio.h>
    #include <stdlib.h>

    void foo(int n, int m) {
        int i, *p;

        /* allocate a block of n ints */
        p = (int *)malloc(n * sizeof(int));
        if (p == NULL) {
            perror("malloc");
            exit(EXIT_FAILURE);
        }
        for (i = 0; i < n; i++)
            p[i] = i;

        /* add room for m more ints to the end of the p block */
        if ((p = (int *)realloc(p, (n + m) * sizeof(int))) == NULL) {
            perror("realloc");
            exit(EXIT_FAILURE);
        }
        for (i = n; i < n + m; i++)
            p[i] = i;

        /* print new array */
        for (i = 0; i < n + m; i++)
            printf("%d\n", p[i]);

        free(p); /* return p to available memory pool */
    }

Assumptions Made in This Lecture

- Memory is word addressed (each word can hold a pointer)

[Figure legend: allocated block (4 words), free block (3 words), free word, allocated word.]

Allocation Example

    p1 = malloc(4)
    p2 = malloc(5)
    p3 = malloc(6)
    free(p2)
    p4 = malloc(2)

[Figure: state of the heap after each request.]

Constraints

- Applications
  - Can issue an arbitrary sequence of malloc() and free() requests
  - free() requests must be to a malloc()'d block
- Allocators
  - Can't control the number or size of allocated blocks
  - Must respond immediately to malloc() requests
    - i.e., can't reorder or buffer requests
  - Must allocate blocks from free memory
    - i.e., can only place allocated blocks in free memory
  - Must align blocks so they satisfy all alignment requirements
    - 8-byte alignment for GNU malloc (libc malloc) on Linux boxes
  - Can manipulate and modify only free memory
  - Can't move the allocated blocks once they are malloc()'d
    - i.e., compaction is not allowed

Performance Goal: Throughput

- Given some sequence of malloc() and free() requests:
  R_0, R_1, ..., R_k, ..., R_{n-1}
- Goals: maximize throughput and peak memory utilization
  - These goals are often conflicting
- Throughput:
  - Number of completed requests per unit time
  - Example:
    - 5,000 malloc() calls and 5,000 free() calls in 10 seconds
    - Throughput is 10,000 / 10 = 1,000 operations/second
  - How to do malloc() and free() in O(1)? What's the problem?

Performance Goal: Peak Memory Utilization

- Given some sequence of malloc() and free() requests:
  R_0, R_1, ..., R_k, ..., R_{n-1}
- Def: Aggregate payload P_k
  - malloc(p) results in a block with a payload of p bytes
  - After request R_k has completed, the aggregate payload P_k is the sum of currently allocated payloads
    - all malloc()'d stuff minus all free()'d stuff
- Def: Current heap size H_k
  - Assume H_k is monotonically nondecreasing
    - reminder: it grows when the allocator uses sbrk()
- Def: Peak memory utilization after k requests
  - U_k = (max_{i<=k} P_i) / H_k
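The peak memory utilization definition can be checked by hand on a small trace. The following is a minimal sketch, not part of the lecture: the names request_t, payload_delta, heap_size, and peak_utilization are hypothetical, the heap sizes in the sample trace are invented for illustration, and the maximum is assumed to run over all requests up to and including R_k.

```c
#include <assert.h>
#include <stddef.h>

/* One request in a hypothetical trace (names are illustrative only).
 * payload_delta > 0 models malloc(payload_delta);
 * payload_delta < 0 models free() of a block with that payload. */
typedef struct {
    long   payload_delta; /* change in aggregate payload P_k       */
    size_t heap_size;     /* assumed heap size H_k after request k */
} request_t;

/* Computes U = (max over i of P_i) / H_last for the n requests in trace.
 * Assumes heap_size is monotonically nondecreasing, as in the lecture.
 * Returns 0.0 for an empty trace or a zero-sized heap. */
double peak_utilization(const request_t *trace, size_t n)
{
    long   payload = 0; /* running aggregate payload P_i */
    long   peak    = 0; /* largest P_i seen so far       */
    size_t heap    = 0; /* most recent heap size H_i     */

    assert(trace != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        payload += trace[i].payload_delta;
        if (payload > peak)
            peak = payload;
        heap = trace[i].heap_size;
    }
    return heap ? (double)peak / (double)heap : 0.0;
}
```

For the allocation example (malloc(4), malloc(5), malloc(6), free(p2), malloc(2)), the aggregate payloads are 4, 9, 15, 10, and 12 words. If the heap is assumed to have grown to 16 words by the end, the peak payload is 15 and the utilization is 15/16 = 0.9375, even though only 12 words are allocated when the trace ends.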
Fragmentation

- Poor memory utilization is caused by fragmentation
  - internal fragmentation
  - external fragmentation

Internal Fragmentation

- For a given block, internal fragmentation occurs if the payload is smaller than the block size

[Figure: a block consisting of internal fragmentation, payload, internal fragmentation.]

- Caused by
  - overhead of maintaining heap data structures
  - padding for alignment purposes
  - explicit policy decisions (e.g., to return a big block to satisfy a small request)
- Depends only on the pattern of previous requests, thus