Berkeley COMPSCI 162 - Lecture 14 Caching and Demand Paging

Lecture outline:
• Review: Memory Hierarchy of a Modern Computer System
• Review: A Summary on Sources of Cache Misses
• Review: Where does a Block Get Placed in a Cache?
• Review: Other Caching Questions
• Goals for Today
• Quick Aside: Protection without Hardware
• Caching Applied to Address Translation
• TLB organization
• Example: R3000 pipeline includes TLB “stages”
• Reducing translation time further
• Overlapping TLB & Cache Access
• Administrivia
• Demand Paging
• Illusion of Infinite Memory
• Demand Paging is Caching
• Review: What is in a PTE?
• Demand Paging Mechanisms
• Software-Loaded TLB
• Transparent Exceptions
• Consider weird things that can happen
• Precise Exceptions
• Page Replacement Policies
• Replacement Policies (Con’t)
• Summary

CS162 Operating Systems and Systems Programming
Lecture 14: Caching and Demand Paging
October 19, 2005
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs162

Review: Memory Hierarchy of a Modern Computer System
• Take advantage of the principle of locality to:
  – Present as much memory as in the cheapest technology
  – Provide access at the speed offered by the fastest technology
[Figure: the hierarchy runs from processor registers and on-chip cache (~1-100 ns, KBs-MBs) through a second-level SRAM cache, DRAM main memory (100s of ns, GBs), and secondary storage on disk (10s of ms) out to tertiary storage on tape (10s of seconds, TBs)]
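
The hierarchy's promise of cheapest-technology capacity at close to fastest-technology speed falls out of the standard average-memory-access-time recurrence. Below is a minimal back-of-the-envelope sketch; the hit times and miss rates are assumed round numbers, not figures from the lecture.

```c
#include <stdio.h>

/* AMAT = hit_time + miss_rate * miss_penalty, applied level by level.
 * All latencies and miss rates here are illustrative assumptions. */
int main(void) {
    double l1_ns = 1.0,  l1_miss = 0.05;   /* on-chip cache      */
    double l2_ns = 10.0, l2_miss = 0.02;   /* second-level SRAM  */
    double dram_ns = 100.0;                /* main memory (DRAM) */

    double l2_amat = l2_ns + l2_miss * dram_ns;   /* 12 ns  */
    double amat    = l1_ns + l1_miss * l2_amat;   /* 1.6 ns */
    printf("effective access time: %.1f ns\n", amat);
    return 0;
}
```

With 95% of accesses hitting the 1 ns cache, the effective access time stays close to the fastest level even though most bytes live in DRAM.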

Review: A Summary on Sources of Cache Misses
• Compulsory (cold start or process migration, first reference): first access to a block
  – “Cold” fact of life: not a whole lot you can do about it
  – Note: if you are going to run “billions” of instructions, compulsory misses are insignificant
• Capacity:
  – Cache cannot contain all blocks accessed by the program
  – Solution: increase cache size
• Conflict (collision):
  – Multiple memory locations mapped to the same cache location
  – Solution 1: increase cache size
  – Solution 2: increase associativity
• Coherence (invalidation): other process (e.g., I/O) updates memory

Review: Where does a Block Get Placed in a Cache?
• Example: block 12 placed in an 8-block cache
  – Direct mapped: block 12 can go only into block 4 (12 mod 8)
  – Set associative (2-way, four sets): block 12 can go anywhere in set 0 (12 mod 4)
  – Fully associative: block 12 can go anywhere
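
The three placement rules reduce to modular arithmetic on the block number. Here is a small sketch of the slide's block-12 example; the variable names are mine.

```c
#include <stdio.h>

/* Placement of block 12 in an 8-line cache under the three schemes. */
int main(void) {
    unsigned block = 12, lines = 8, ways = 2;
    unsigned sets = lines / ways;                 /* 4 sets when 2-way */

    printf("direct mapped:     line %u (12 mod 8)\n", block % lines);
    printf("2-way set assoc.:  set %u (12 mod 4), either way\n", block % sets);
    printf("fully associative: any of the %u lines\n", lines);
    return 0;
}
```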

Review: Other Caching Questions
• What line gets replaced on a cache miss?
  – Easy for direct mapped: only one possibility
  – Set associative or fully associative:
    » Random
    » LRU (Least Recently Used)
• What happens on a write?
  – Write through: the information is written both to the cache and to the block in the lower-level memory
  – Write back: the information is written only to the block in the cache
    » Modified cache block is written to main memory only when it is replaced
    » Question: is the block clean or dirty? (see the sketch below)
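
The write-through/write-back trade-off, and the clean-or-dirty question it raises, fits in a few lines of code. This is an illustrative sketch, not code from the course; lower_write is an invented stub standing in for the next level of the hierarchy.

```c
#include <stdbool.h>
#include <stdint.h>

struct line { uint32_t tag; uint8_t data[64]; bool valid, dirty; };

static void lower_write(uint32_t addr, const uint8_t *data) {
    (void)addr; (void)data;            /* stub for lower-level memory */
}

/* Write through: every store updates the cache AND the lower level. */
static void write_through(struct line *l, uint32_t addr, int off, uint8_t v) {
    l->data[off] = v;
    lower_write(addr, l->data);
}

/* Write back: update only the cache and remember it is dirty... */
static void write_back(struct line *l, int off, uint8_t v) {
    l->data[off] = v;
    l->dirty = true;
}

/* ...so the lower level is written once, at replacement time. */
static void evict(struct line *l, uint32_t addr) {
    if (l->valid && l->dirty)
        lower_write(addr, l->data);    /* the "dirty?" question answered */
    l->valid = l->dirty = false;
}

int main(void) {
    struct line l = { .tag = 0x12, .valid = true };
    write_through(&l, 0x1000, 0, 7);   /* reaches memory immediately     */
    write_back(&l, 1, 9);              /* reaches memory only at eviction */
    evict(&l, 0x1000);
    return 0;
}
```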

Goals for Today
• Finish discussion of TLBs
• Concept of Paging to Disk
• Page Faults and TLB Faults
• Precise Interrupts
• Page Replacement Policies
Note: some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne.

Quick Aside: Protection without Hardware
• Does protection require hardware support for translation and dual-mode behavior?
  – No: normally we use hardware, but anything you can do in hardware can also be done in software (possibly expensively)
• Protection via strong typing
  – Restrict the programming language so that you can’t express a program that would trash another program
  – Loader needs to make sure the program was produced by a valid compiler, or all bets are off
  – Example languages: LISP, Ada, Modula-3, and Java
• Protection via software fault isolation:
  – Language-independent approach: have the compiler generate object code that provably can’t step out of bounds
    » Compiler puts in checks for every “dangerous” operation (loads, stores, etc.); again, a special loader is needed
    » Alternative: the compiler generates a “proof” that the code cannot do certain things (Proof-Carrying Code)
  – Or: use a virtual machine to guarantee safe behavior (loads and stores recompiled on the fly to check bounds)
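
One concrete form of "object code that provably can't step out of bounds" is the classic sandboxing trick of masking every store address into the isolated region, so the inserted check is a single branch-free AND. A minimal sketch assuming a power-of-two-sized sandbox; the size and names are invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

enum { SANDBOX_SIZE = 1u << 16 };      /* assumed 64 KB, power of two */
static uint8_t sandbox[SANDBOX_SIZE];

/* The compiler would emit this mask before every store: whatever
 * address the program computes, the store lands inside the sandbox. */
static inline void sfi_store(uint32_t addr, uint8_t v) {
    sandbox[addr & (SANDBOX_SIZE - 1)] = v;
}

int main(void) {
    sfi_store(0x0042, 1);              /* in range: stores normally   */
    sfi_store(0xFFFF1234u, 2);         /* wild address: forced inside */
    printf("%d %d\n", sandbox[0x0042], sandbox[0x1234]);
    return 0;
}
```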

Caching Applied to Address Translation
• Question is one of page locality: does it exist?
  – Instruction accesses spend a lot of time on the same page (since accesses are sequential)
  – Stack accesses have definite locality of reference
  – Data accesses have less page locality, but still some…
• Can we have a TLB hierarchy?
  – Sure: multiple levels at different sizes/speeds
[Figure: for each access, the CPU’s virtual address is first checked against the TLB (“Cached?”); on a hit the saved physical address goes straight to physical memory, on a miss the MMU translates and the result is saved in the TLB]

TLB organization
• How big does the TLB actually have to be?
  – Usually small: 128-512 entries
  – Not very big; can support higher associativity
• TLB usually organized as a fully-associative cache
  – Lookup is by virtual address
  – Returns physical address + other info
• What happens when fully associative is too slow?
  – Put a small (4-16 entry) direct-mapped cache in front
  – Called a “TLB slice”
• Example for MIPS R3000:

  Virtual Address  Physical Address  Dirty  Ref  Valid  Access  ASID
  0xFA00           0x0003            Y      N    Y      R/W     34
  0x0040           0x0010            N      Y    Y      R       0
  0x0041           0x0011            N      Y    Y      R       0
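
In software terms, a fully-associative TLB lookup is a match over every entry on virtual page number plus ASID; hardware does all the compares in parallel, and the loop below is only the sequential stand-in. The entry layout mirrors the slide's example table, but the struct and function names are my assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct tlb_entry {
    uint32_t vpn, ppn;           /* virtual / physical page numbers    */
    uint8_t  asid;               /* which address space owns the entry */
    bool     valid, dirty, ref;  /* status bits from the slide's table */
};

#define TLB_ENTRIES 64           /* R3000-sized, fully associative     */
static struct tlb_entry tlb[TLB_ENTRIES];

/* Returns true on a hit and fills *ppn; a miss would trap to the
 * software TLB fault handler. */
static bool tlb_lookup(uint32_t vpn, uint8_t asid, uint32_t *ppn) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn && tlb[i].asid == asid) {
            *ppn = tlb[i].ppn;
            return true;
        }
    return false;
}

int main(void) {
    /* One row from the slide's example table. */
    tlb[0] = (struct tlb_entry){ .vpn = 0x0040, .ppn = 0x0010,
                                 .asid = 0, .valid = true, .ref = true };
    uint32_t ppn;
    if (tlb_lookup(0x0040, 0, &ppn))
        printf("hit: physical page 0x%04X\n", ppn);
    return 0;
}
```

Matching on the ASID as well as the page number is what lets many processes share the TLB without flushing it on every context switch.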

Example: R3000 pipeline includes TLB “stages”
• MIPS R3000 pipeline: Inst Fetch → Dcd/Reg → ALU/E.A. → Memory → Write Reg
  – The TLB sits in the pipeline twice: once for the instruction fetch (I-Cache access) and once for the effective address (D-Cache access)
• TLB: 64 entries, on-chip, fully associative, software TLB fault handler
• Virtual address space: ASID (6 bits) | virtual page number (20 bits) | offset (12 bits)
  – Top bits of the virtual page number select the segment:
    » 0xx: user segment (caching based on PT/TLB entry)
    » 100: kernel physical space, cached
    » 101: kernel physical space, uncached
    » 11x: kernel virtual space
  – Allows context switching among 64 user processes without a TLB flush
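
Splitting an R3000-style address into its fields is plain bit arithmetic: a 20-bit virtual page number over a 12-bit offset (4 KB pages), with the 6-bit ASID held separately by the hardware rather than inside the 32-bit address. A sketch with an assumed example address; the segment decode follows the slide's table.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t va  = 0x0040ABCu;              /* assumed example address   */
    uint32_t off = va & 0xFFFu;             /* low 12 bits: page offset  */
    uint32_t vpn = (va >> 12) & 0xFFFFFu;   /* next 20 bits: page number */
    uint32_t seg = vpn >> 17;               /* top 3 bits pick segment   */

    /* seg: 0-3 -> 0xx user; 4 -> 100 kernel cached;
     *      5   -> 101 kernel uncached; 6-7 -> 11x kernel virtual */
    printf("vpn=0x%05X offset=0x%03X segment bits=%u\n", vpn, off, seg);
    return 0;
}
```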

Reducing translation time further
• As described, TLB lookup is in serial with cache lookup
• Machines with TLBs go one step further: they overlap TLB lookup with cache access
  – Works because the offset is available early
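
"Offset available early" can be made precise: as long as the cache's line-offset and set-index bits fit entirely inside the 12-bit page offset, the set can be selected from the virtual address while the TLB translates the page number, leaving only the physical tag compare after translation. A sketch of the bit budget with assumed sizes (64 sets of 64-byte lines, exactly 12 bits).

```c
#include <stdint.h>
#include <stdio.h>

enum { PAGE_BITS = 12, LINE_BITS = 6, SET_BITS = 6 };  /* 6 + 6 == 12 */

int main(void) {
    uint32_t va  = 0x1234ABCDu;                     /* assumed address */
    uint32_t set = (va >> LINE_BITS) & ((1u << SET_BITS) - 1);
    uint32_t vpn = va >> PAGE_BITS;                 /* goes to the TLB */

    /* The set index uses only untranslated offset bits, so the cache
     * read and the TLB lookup proceed in parallel; the tag compare
     * then uses the physical page the TLB returns. */
    printf("set=%u (from offset bits), vpn=0x%05X (to TLB)\n", set, vpn);
    return 0;
}
```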