
CS162 Operating Systems and Systems Programming
Lecture 14: Caching and Demand Paging

October 14, 2009
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs162

Review: Memory Hierarchy of a Modern Computer System

- Take advantage of the principle of locality to:
  - Present as much memory as in the cheapest technology
  - Provide access at speed offered by the fastest technology

  Level                        Speed (ns)                    Size (bytes)
  Registers                    1s                            100s
  On-chip cache                10s                           Ks
  Second-level cache (SRAM)    100s                          Ms
  Main memory (DRAM)           100s                          Ms
  Secondary storage (disk)     10,000,000s (10s ms)          Gs
  Tertiary storage (tape)      10,000,000,000s (10s sec)     Ts

  (The processor contains the control, datapath, registers, and on-chip cache.)

10/14/09    Kubiatowicz CS162 UCB Fall 2009    Lec 14.2

Review: A Summary on Sources of Cache Misses

- Compulsory (cold start): first reference to a block
  - "Cold" fact of life: not a whole lot you can do about it
  - Note: when running billions of instructions, compulsory misses are insignificant
- Capacity
  - Cache cannot contain all blocks accessed by the program
  - Solution: increase cache size
- Conflict (collision)
  - Multiple memory locations mapped to the same cache location
  - Solutions: increase cache size, or increase associativity
- Two others

Review: Set Associative Cache

- N-way set associative: N entries per cache index
  - N direct-mapped caches operate in parallel
- Example: two-way set associative cache
  - Cache Index selects a "set" from the cache
  - Two tags in the set are compared to the input in parallel
  - Data is selected based on the tag result

[Figure: two-way set-associative cache. The address (bit positions 31, 8, 4, 0 marked) is divided into Cache Tag, Cache Index, and Byte Select. Each way holds Valid, Cache Tag, and Cache Data (Cache Block 0). Two comparators drive Sel1 and Sel0 of a mux and are ORed to produce Hit; the mux outputs the Cache Block.]

Review: Where does a Block Get Placed in a Cache?

- Example: block 12 placed in an 8-block cache (32-block address space, block numbers 0-31)
  - Direct mapped: block 12 can go only into block 4 (12 mod 8)
  - Set associative: block 12 can go anywhere in set 0 (12 mod 4)
  - Fully associative:
    block 12 can go anywhere

[Figure: the 8-block cache drawn three times with block numbers 0-7 (direct mapped, set associative, fully associative); the set-associative version is grouped into sets 0-3.]

Goals for Today

- Finish discussion of Caching/TLBs
- Concept of Paging to Disk
- Page Faults and TLB Faults
- Precise Interrupts
- Page Replacement Policies

Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne. Many slides generated from my lecture notes.

Which block should be replaced on a miss?

- Easy for direct mapped: only one possibility
- Set associative or fully associative:
  - Random
  - LRU (Least Recently Used)

  Miss rates:
              2-way              4-way              8-way
  Size        LRU      Random    LRU      Random    LRU      Random
  16 KB       5.2%     5.7%      4.7%     5.3%      4.4%     5.0%
  64 KB       1.9%     2.0%      1.5%     1.7%      1.4%     1.5%
  256 KB      1.15%    1.17%     1.13%    1.13%     1.12%    1.12%

What happens on a write?

- Write through: the information is written to both the block in the cache and to the block in the lower-level memory
- Write back: the information is written only to the block in the cache
  - Modified cache block is written to main memory only when it is replaced
  - Question: is block clean or dirty?
- Pros and cons of each?
  - WT:
    - PRO: read misses cannot result in writes
    - CON: processor held up on writes unless writes buffered
  - WB:
    - PRO: repeated writes not sent to DRAM; processor not held up on writes
    - CON: more complex; read miss may require writeback of dirty data

Caching Applied to Address Translation

[Figure: the CPU sends a virtual address to the TLB. Cached? Yes: the physical address goes to physical memory. No: the MMU translates the address and the result is saved in the TLB. Data reads or writes pass untranslated.]

- Question is one of page locality: does it exist?
  - Instruction accesses spend a lot of time on the same page (since accesses sequential)
  - Stack accesses have definite locality of reference
  - Data accesses have less page locality, but still some
- Can we have a TLB hierarchy?
  - Sure: multiple levels at different sizes/speeds
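The translation path described on the slide above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real MMU is built: the `TLB` class, the 4 KB `PAGE_SIZE`, the fully associative LRU organization, and the dictionary standing in for the page table are all hypothetical choices made for the example. It shows why sequential accesses have high page locality: only the first touch of each page misses in the TLB.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes (not stated on the slides)


class TLB:
    """Tiny fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table   # dict: virtual page number -> physical page number
        self.entries = OrderedDict()   # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for vaddr, filling the TLB on a miss."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)         # mark as most recently used
        else:
            self.misses += 1
            ppn = self.page_table[vpn]            # stand-in for the MMU's translation step
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
            self.entries[vpn] = ppn               # save the result in the TLB
        return self.entries[vpn] * PAGE_SIZE + offset


def demo():
    """Sequential 4-byte accesses across two pages, like straight-line code."""
    tlb = TLB(capacity=2, page_table={0: 7, 1: 3})
    for vaddr in range(0, 2 * PAGE_SIZE, 4):
        tlb.translate(vaddr)
    return tlb.hits, tlb.misses
```

Calling `demo()` performs 2048 translations, of which only the first access to each of the two pages goes to the page table; every other access is served from the cached entry.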
What Actually Happens on a TLB Miss?

- Hardware-traversed page tables:
  - On TLB miss, hardware in MMU looks at current page table to fill TLB (may walk multiple levels)
    - If PTE valid, hardware fills TLB and processor never knows
    - If PTE marked as invalid, causes Page Fault, after which kernel decides what to do afterwards
- Software-traversed page tables (like MIPS):
  - On TLB miss, processor receives TLB fault
  - Kernel traverses page table to find PTE
    - If PTE valid, fills TLB and returns from fault
    - If PTE marked as invalid, internally calls Page Fault handler
- Most chip sets provide hardware traversal
  - Modern operating systems tend to have more TLB faults since they use translation for many things
  - Examples:
    - shared segments
    - user-level portions of an operating system

What happens on a Context Switch?

- Need to do something, since TLBs map virtual addresses to physical addresses
  - Address space just changed, so TLB entries no longer valid
- Options?
  - Invalidate TLB: simple but might be expensive
    - What if switching frequently between processes?
  - Include ProcessID in TLB
    - This is an architectural solution: needs hardware
- What if translation tables change?
  - For example, to move page from memory to disk or vice versa
  - Must invalidate TLB entry
    - Otherwise, might think that page is still in memory

Administrative

- Midterm I next week:
  - Monday, 10/19, 6:00-9:00pm, 145 Dwinelle
  - Should be 2 hour exam with extra time
  - Closed book, one page of hand-written notes (both sides)
- No class on day of Midterm
  - Extra Office Hours: Mon 2:00-5:00, perhaps
- Midterm Topics
  - Topics: everything up to Wednesday 10/14
  - History, Concurrency, Multithreading, Synchronization, Protection/Address Spaces, TLBs
- Make sure to fill out Group Evaluations
- Project 2
  - Initial Design Document due tomorrow (Tuesday 10/13)
  - Look at the lecture schedule to keep up with due dates

What TLB organization makes sense?

  CPU -> TLB ->
  Cache -> Memory

- Needs to be really fast
  - Critical path of memory access
    - In simplest view: before the cache
    - Thus, this adds to access time (reducing cache speed)
  - Seems to argue for direct mapped or low associativity
- However, needs to have very few conflicts
  - With a TLB, the miss time is extremely high
  - This argues that the cost of a conflict (miss time) is much higher than the slightly increased cost of access (hit time)
- Thrashing: continuous conflicts between accesses
  - What if we use the low-order bits of the page as index into the TLB?
    - First page of
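To make the thrashing concern concrete, here is a small Python sketch that counts TLB misses when the set index is taken from the low-order bits of the virtual page number. Everything in it is an assumption for illustration: the `count_misses` helper, the 16-entry capacity, the LRU policy within a set, and the trace of three page numbers that happen to share their low 4 bits. It is not a model of any particular processor.

```python
from collections import OrderedDict


def count_misses(vpns, num_sets, ways):
    """Count TLB misses for a sequence of virtual page numbers.

    The set index is the low-order bits of the page number (vpn % num_sets);
    each set holds up to `ways` entries and uses LRU replacement.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for vpn in vpns:
        entries = sets[vpn % num_sets]
        if vpn in entries:
            entries.move_to_end(vpn)          # hit: mark as most recently used
        else:
            misses += 1
            if len(entries) >= ways:
                entries.popitem(last=False)   # conflict: evict the LRU entry in this set
            entries[vpn] = True
    return misses


# Hypothetical trace: three pages whose numbers share their low 4 bits,
# touched round-robin 100 times each.
trace = [0x100, 0x200, 0x300] * 100

direct_mapped = count_misses(trace, num_sets=16, ways=1)      # 16 entries, 1 way
fully_associative = count_misses(trace, num_sets=1, ways=16)  # 16 entries, 16 ways
```

With this trace the direct-mapped organization misses on all 300 accesses, because the three pages keep evicting one another from the same entry, while the fully associative organization of the same capacity takes only the 3 initial misses. Since each TLB miss is expensive, this is the trade-off the slide describes: a slightly slower hit path can be worth paying to avoid repeated conflicts.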


