Berkeley COMPSCI 162 - Lecture 14 Caching and Demand Paging

CS162 Operating Systems and Systems Programming
Lecture 14: Caching and Demand Paging
March 4, 2010
Ion Stoica
http://inst.eecs.berkeley.edu/~cs162
©UCB Spring 2010

Lec 14.2: Review: Memory Hierarchy of a Modern Computer System
• Take advantage of the principle of locality to:
  – Present as much memory as the cheapest technology provides
  – Provide access at the speed offered by the fastest technology
[Figure: processor (control, datapath, registers, on-chip cache) backed by successively larger, slower levels:
  Registers: ~1 ns, 100s of bytes
  Second-Level Cache (SRAM): 10s-100s ns, Ks-Ms of bytes
  Main Memory (DRAM): 100s ns, Ms of bytes
  Secondary Storage (Disk): 10,000,000s ns (10s of ms), Gs of bytes
  Tertiary Storage (Tape): 10,000,000,000s ns (10s of sec), Ts of bytes]

Lec 14.3: Example
• Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)
• Data in memory, no cache: access time = 100 ns
• Data in memory, with a 10 ns cache:
  – HitRate + MissRate = 1
  – HitRate = 90%: Average Access Time = 19 ns
  – HitRate = 99%: Average Access Time = 10.9 ns

Lec 14.4: Review: A Summary on Sources of Cache Misses
• Compulsory (cold start): first reference to a block
  – "Cold" fact of life: not a whole lot you can do about it
  – Note: when running billions of instructions, compulsory misses are insignificant
• Capacity:
  – Cache cannot contain all blocks accessed by the program
  – Solution: increase cache size
• Conflict (collision):
  – Multiple memory locations map to the same cache location
  – Solutions: increase cache size, or increase associativity
• Two others:
  – Coherence (invalidation): another process (e.g., I/O) updates memory
  – Policy: due to a non-optimal replacement policy

Lec 14.5: Review: Set Associative Cache
• N-way set associative: N entries per cache index
  – N direct-mapped caches operate in parallel
• Example: two-way set associative cache
  – Cache Index selects a "set" from the cache
  – The two tags in the set are compared to the input tag in parallel
  – Data is selected based on the tag comparison result
[Figure: two-way set associative lookup; the address splits into Cache Tag (bits 31-8), Cache Index (bits 7-4), and Byte Select (bits 3-0); the index selects one valid/tag/data entry from each of the two ways, both tags are compared in parallel, a mux picks the matching cache block, and the OR of the comparators signals Hit]

Lec 14.6: Review: Where Does a Block Get Placed in a Cache?
• Example: block 12 of a 32-block address space placed in an 8-block cache
  – Direct mapped: block 12 (01100) can go only into block 4 (12 mod 8)
    » address split: tag 01, block 100
  – Set associative (4 sets of 2): block 12 can go anywhere in set 0 (12 mod 4)
    » address split: tag 011, set 00
  – Fully associative: block 12 can go anywhere
    » address split: tag 01100

Lec 14.7: Review: Which Block Should Be Replaced on a Miss?
• Easy for direct mapped: only one possibility
• Set associative or fully associative:
  – Random
  – LRU (Least Recently Used)
• Measured miss rates:
            2-way            4-way            8-way
  Size      LRU    Random    LRU    Random    LRU    Random
  16 KB     5.2%   5.7%      4.7%   5.3%      4.4%   5.0%
  64 KB     1.9%   2.0%      1.5%   1.7%      1.4%   1.5%
  256 KB    1.15%  1.17%     1.13%  1.13%     1.12%  1.12%

Lec 14.8: Goals for Today
• Finish discussion of Caching/TLBs
• Concept of Paging to Disk
• Page Faults and TLB Faults
• Precise Interrupts
• Page Replacement Policies
Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne. Many slides generated from my lecture notes by Kubiatowicz.

Lec 14.9: What Happens on a Write?
• Write through: the information is written both to the block in the cache and to the block in lower-level memory
• Write back: the information is written only to the block in the cache
  – The modified cache block is written to main memory only when it is replaced
  – Question: is the block clean or dirty?
• Pros and cons of each?
  – WT:
    » PRO: read misses cannot result in writes
    » CON: processor is held up on writes unless writes are buffered
  – WB:
    » PRO: repeated writes are not sent to DRAM; processor is not held up on writes
    » CON: more complex; a read miss may require writeback of dirty data

Lec 14.10: Caching Applied to Address Translation
• Question is one of page locality: does it exist?
  – Instruction accesses spend a lot of time on the same page (since accesses are sequential)
  – Stack accesses have definite locality of reference
  – Data accesses have less page locality, but still some
• Can we have a TLB hierarchy?
  – Sure: multiple levels at different sizes/speeds
[Figure: the CPU issues a virtual address; if the translation is cached in the TLB, the physical address goes straight to physical memory; otherwise the MMU translates, the result is saved in the TLB, and the data read or write proceeds untranslated]

Lec 14.11: What Actually Happens on a TLB Miss?
• Hardware-traversed page tables:
  – On a TLB miss, hardware in the MMU looks at the current page table to fill the TLB (may walk multiple levels)
    » If the PTE is valid, the hardware fills the TLB and the processor never knows
    » If the PTE is marked invalid, it causes a Page Fault, after which the kernel decides what to do
• Software-traversed page tables (like MIPS):
  – On a TLB miss, the processor receives a TLB fault
  – The kernel traverses the page table to find the PTE
    » If the PTE is valid, it fills the TLB and returns from the fault
    » If the PTE is marked invalid, it internally calls the Page Fault handler
• Most chipsets provide hardware traversal
  – Modern operating systems tend to have more TLB faults since they use translation for many things
  – Examples:
    » shared segments
    » user-level portions of an operating system

Lec 14.12: What Happens on a Context Switch?
• Need to do something, since TLBs map virtual addresses to physical addresses
  – The address space just changed, so TLB entries are no longer valid!
• Options?
  – Invalidate TLB: simple but
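The average-access-time numbers on slide Lec 14.3 can be reproduced with a few lines of Python (a sketch; the function name `amat` is my choice, not the lecture's):

```python
def amat(hit_rate, hit_time, miss_time):
    """Average Access Time = (Hit Rate x Hit Time) + (Miss Rate x Miss Time)."""
    return hit_rate * hit_time + (1 - hit_rate) * miss_time

# 10 ns cache in front of 100 ns DRAM (all times in ns):
print(round(amat(0.90, 10, 100), 2))  # 19.0  (90% hit rate)
print(round(amat(0.99, 10, 100), 2))  # 10.9  (99% hit rate)
```

Note how raising the hit rate from 90% to 99% nearly halves the average access time, which is why the miss-rate differences between replacement policies and associativities on slide Lec 14.7 are worth the hardware cost.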


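The block-placement rules on slide Lec 14.6 (direct mapped, set associative, fully associative) can be sketched as one function, since direct mapped is just 1-way and fully associative is N-way for an N-block cache (`placements` is a hypothetical helper, not from the lecture):

```python
def placements(block, num_blocks, ways):
    """Return the set of cache slots where `block` may be placed in a
    cache of `num_blocks` slots with the given associativity (`ways`)."""
    num_sets = num_blocks // ways
    s = block % num_sets                      # which set the block maps to
    return set(range(s * ways, s * ways + ways))  # all slots in that set

# Block 12 of a 32-block address space, 8-block cache:
print(sorted(placements(12, 8, 1)))  # direct mapped: [4]  (12 mod 8)
print(sorted(placements(12, 8, 2)))  # 2-way, 4 sets: set 0, slots [0, 1]
print(sorted(placements(12, 8, 8)))  # fully associative: any of the 8 slots
```

This matches the slide: with 4 sets, 12 mod 4 = 0, so block 12 may occupy either slot of set 0.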

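The LRU policy compared against Random on slide Lec 14.7 can be illustrated with a tiny miss-counting simulator for a fully associative cache (illustrative only; `simulate_lru` and the trace are mine, not the benchmark behind the slide's table):

```python
from collections import OrderedDict

def simulate_lru(blocks, capacity):
    """Count misses for a fully associative cache of `capacity` blocks
    with LRU replacement; OrderedDict insertion order tracks recency."""
    cache = OrderedDict()
    misses = 0
    for b in blocks:
        if b in cache:
            cache.move_to_end(b)           # hit: mark most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[b] = True
    return misses

# A reference string that reuses blocks 1 and 2 heavily:
trace = [1, 2, 3, 1, 2, 4, 1, 2]
print(simulate_lru(trace, 3))  # 4 misses: 3 cold + 1 capacity/conflict
```

Because LRU keeps the recently reused blocks 1 and 2 resident, only the cold misses and the first touch of block 4 miss; this reuse-tracking is why LRU edges out Random at the small cache sizes in the slide's table.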