CS 162 Operating Systems and Systems Programming
Professor: Anthony D. Joseph
Spring 2004
Lecture 14: Caching and TLBs

14.0 Main Points:
• Cache concept, in general and as applied to translation
• Ways of organizing caches – associativity
• Problems with caching

14.1 Cache concept

Cache: a copy that can be accessed more quickly than the original. The idea: make the frequent case efficient, so that the infrequent path doesn't matter as much.

Caching is a fundamental concept used in many places in computer systems. It underlies many of the techniques used today to make computers go fast: we can cache translations, memory locations, pages, file blocks, file names, network routes, authorizations for security systems, etc.

14.2 Caching applied to address translation

Address translation is on the critical path for every instruction --- we can't afford to always do a memory lookup (or even worse, an I/O!) to find a page table entry. We often reference the same page repeatedly, so why go through the entire translation each time?

[Figure: the CPU sends a virtual address to the translation box (MMU). If the TLB remembers the translation, the physical address goes straight to physical memory; otherwise the full translation is performed and then remembered. The data read or written is untranslated.]

TLB (Translation Buffer, or Translation Lookaside Buffer): a hardware table of frequently used translations, used to avoid the page table lookup in the common case. It typically sits on chip, with an access time of 5-10 ns, instead of several hundred ns for main memory.

TLB for the example from the previous lecture (simple paging):

  Virtual page #   Physical page #   Control bits
  2                1                 Valid, RW
  -                -                 Invalid
  0                4                 Valid, RW

14.2.1 How do we tell if the needed translation is in the TLB?

1. Search the table in sequential order.

2. Direct mapped: restrict each virtual page to use one specific slot in the TLB.

[Figure: direct-mapped lookup. A hash of the virtual page number selects a slot; the stored tag is compared against the virtual page number. On a match, use the TLB entry; otherwise, perform the full translation and replace that TLB entry.]

As a (bad, as we will see shortly) example, we could use the lower bits of the virtual page number to index the TLB, and compare against the upper bits of the virtual page number to check for a match.

Note: do we need to store the entire virtual page number in the TLB? No; as an enhancement, we can get away with storing only the upper bits.

Consider a 256-entry TLB and the following reference stream of virtual page numbers in hex (note these are just page numbers, not entire virtual addresses):

0x621 0x2145 0x621 0x2145 ... 0x321 0x2145 0x321 0x621

What happens to the TLB? (A simulation sketch follows this list.) What if two pages conflict for the same TLB slot? Example: the program counter and the stack.

Thrashing: cache contents are tossed out even though they are still needed.

One approach: pick the hash function to minimize conflicts. What if we use the low-order bits as the index into the TLB? What if we use the high-order bits? Thus, use a selection of high-order and low-order bits as the index.

3. Set associative: arrange the TLB (or cache) as N separate banks and do a simultaneous lookup in each bank. This arrangement is called an "N-way set associative cache."

[Figure: two-way set associative lookup. A hash of the virtual page number selects one set; the tags in both banks are compared in parallel. If either matches, use that TLB entry; otherwise, translate and replace one of the two entries.]

More set associativity means less chance of thrashing: translations can be stored in, and replaced from, either bank.

4. Fully associative: a translation can be stored anywhere in the TLB, so check all entries of the TLB in parallel.

[Figure: fully associative TLB. The virtual page number is compared against every entry at once; if any entry matches, use it; otherwise, translate and replace one of the entries.]

With one element per bank and one comparator per bank, the parallelism is obtained by adding comparator hardware.

The same set of options applies whether you are building a TLB or any other kind of cache. Typically, TLBs are small and fully associative; hardware caches are larger, and direct mapped or set associative to a small degree.
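To make the direct-mapped example concrete, here is a minimal C sketch (not from the lecture; the structure and names are illustrative). It simulates a 256-entry direct-mapped TLB on the reference stream above, omitting the elided "..." references: the low 8 bits of the virtual page number pick the slot, and only the upper bits are stored as the tag, per the note above.

/* Hypothetical simulation of a 256-entry direct-mapped TLB. */
#include <stdio.h>
#include <stdbool.h>

#define TLB_ENTRIES 256

struct tlb_entry {
    bool     valid;
    unsigned tag;                        /* upper bits of the virtual page # */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Report hit or miss for one reference; on a miss, install the entry. */
static bool tlb_lookup(unsigned vpn)
{
    unsigned slot = vpn % TLB_ENTRIES;   /* low 8 bits select the slot */
    unsigned tag  = vpn / TLB_ENTRIES;   /* upper bits are compared    */

    if (tlb[slot].valid && tlb[slot].tag == tag)
        return true;                     /* hit: translation is cached */

    tlb[slot].valid = true;              /* miss: replace this entry   */
    tlb[slot].tag   = tag;
    return false;
}

int main(void)
{
    /* The reference stream of virtual page numbers from the text. */
    unsigned stream[] = { 0x621, 0x2145, 0x621, 0x2145,
                          0x321, 0x2145, 0x321, 0x621 };

    for (int i = 0; i < 8; i++)
        printf("vpn 0x%04x -> slot 0x%02x: %s\n", stream[i],
               stream[i] % TLB_ENTRIES,
               tlb_lookup(stream[i]) ? "hit" : "MISS");
    return 0;
}

The output answers the question above: 0x2145 settles into slot 0x45 and hits from then on, but 0x621 and 0x321 both map to slot 0x21 (their low 8 bits are identical) and keep evicting each other even though both are still needed: exactly the thrashing just described. In a two-way set associative TLB, both pages could occupy the two ways of set 0x21, and those conflict misses would disappear.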
14.2.2 How do we choose which item to replace?

For direct mapped, there is never any choice as to which item to replace. But for a set associative or fully associative cache, we have a choice. What should we do? Replace the least recently used entry? A random one? The most recently used? We defer this until the next lecture's topic.

In hardware, the item to replace is often chosen randomly, because that is simple and fast. In software (for example, for page replacement), we typically do something more sophisticated. The tradeoff: spend CPU cycles to try to improve the cache hit rate.

14.2.3 Consistency between TLB and page tables

What happens on a context switch? We have to invalidate the entire TLB contents; when the new program starts running, it will bring in its own translations. Alternatively, include a process ID tag in the TLB comparator.

We have to keep the TLB consistent with whatever the full translation would give. What if the translation tables change, for example to move a page from memory to disk, or vice versa? We have to invalidate the TLB entry.

14.3 Relationship between TLB and hardware memory caches

[Figure: the same translation diagram as in 14.2. The CPU sends a virtual address to the translation box (MMU); the TLB either remembers the translation or the full translation is performed and remembered; the physical address then goes to physical memory, and the data read or written is untranslated.]

We can put a cache of memory values anywhere in this process. If it sits between the translation box and memory, it is called a "physically addressed cache." We could also put a cache between the CPU and the translation box: a "virtually addressed cache."

[Figure: the full lookup path. The CPU first checks the virtually addressed cache using the virtual address; if there is no match, the TLB maps the virtual page number to a physical page number; on a TLB miss, the segment and page table lookup supplies the translation. The resulting physical address is checked against the physically addressed cache, then main memory, and a page fault goes to disk.]

Virtual memory is a kind of caching: we are going to talk about using the contents of main memory as a cache for the disk.

What about writes? What if the CPU modifies a location? Two options:
• Write-through: the update is immediately sent through to the next level in the memory hierarchy.
• Write-back (delayed write-through): the update is kept until the item is replaced from the cache, and only then sent to the next level.

Since write-back is faster, memory caches typically use write-back; file systems, which need to worry about whether the data being written will survive a machine crash, tend to use write-through.
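As a small illustration of these two policies, here is a C sketch; the single-line cache, the backing store, and all names are hypothetical, not from the lecture.

/* Hypothetical single-line cache in front of a backing store,
 * showing the difference between the two write policies. */
#include <stdio.h>
#include <stdbool.h>

static int backing_store[16];            /* the next level of the hierarchy */

struct cache_line {
    bool valid, dirty;
    int  addr, data;
};
static struct cache_line line;           /* a one-entry "cache" */

/* Write-through: every update is sent to the next level immediately.
 * No line is ever dirty, so replacement never needs a flush. */
static void write_through(int addr, int data)
{
    line = (struct cache_line){ true, false, addr, data };
    backing_store[addr] = data;
}

/* Write-back: the update stays in the cache until the line is replaced,
 * and only then is it sent to the next level. */
static void write_back(int addr, int data)
{
    if (line.valid && line.dirty && line.addr != addr)
        backing_store[line.addr] = line.data;     /* flush on replacement */
    line = (struct cache_line){ true, true, addr, data };
}

int main(void)
{
    write_through(5, 99);                /* visible at the next level at once */
    printf("backing[5]: %d\n", backing_store[5]);                  /* 99 */

    write_back(3, 10);
    write_back(3, 11);                   /* absorbed by the cache; no traffic */
    printf("backing[3] before eviction: %d\n", backing_store[3]);  /* 0  */

    write_back(7, 42);                   /* replaces line 3, flushing it */
    printf("backing[3] after eviction:  %d\n", backing_store[3]);  /* 11 */
    return 0;
}

Repeated writes to the same location cost write-back one transfer at replacement time instead of one transfer per write, which is where its speed advantage comes from; the flip side is that an update still buffered in the cache is lost if the machine crashes before the line is flushed, which is why file systems lean toward write-through.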


14.4 Memory Hierarchy

Two principles:
1. The smaller amount of memory needed,
