Introduction to Computer Systems
15-213, fall 2009
12th Lecture, Oct. 7th
Carnegie Mellon
Instructors: Majd Sakr and Khaled Harras

Slide titles:
  Last Time
  Cache Read
  Example: Direct Mapped Cache (E = 1)
  E-way Set Associative Cache (Here: E = 2)
  Strided Access Question
  The Strided Access Problem (Blackboard?)
  The Memory Mountain
  Pentium Blocked Matrix Multiply Performance
  Today
  Control Flow
  Altering the Control Flow
  Exceptional Control Flow
  Exceptions
  Interrupt Vectors
  Asynchronous Exceptions (Interrupts)
  Synchronous Exceptions
  Trap Example: Opening File
  Fault Example: Page Fault
  Fault Example: Invalid Memory Reference
  Exception Table IA32 (Excerpt)
  Today
  Processes
  Concurrent Processes
  User View of Concurrent Processes
  Context Switching
  fork: Creating New Processes
  Understanding fork
  Fork Example #1
  Fork Example #2
  Fork Example #3
  Fork Example #4
  Fork Example #4
  exit: Ending a process
  Zombies
  Zombie Example
  Nonterminating Child Example
  wait: Synchronizing with Children
  wait: Synchronizing with Children
  wait() Example
  waitpid(): Waiting for a Specific Process
  execve: Loading and Running Programs
  execve: Example
  execl and exec Family
  exec: Loading and Running Programs
  Summary
  Summary (cont.)

Last Time
• Cache organization
• Memory mountain
• Optimization for the memory hierarchy

Cache Read
[Figure: a cache with S = 2^s sets and E = 2^e lines per set. Each line holds a valid bit (v), a tag, and a cache block of B = 2^b bytes (bytes 0 to B-1, the data).]
Address of word: t bits (tag) | s bits (set index) | b bits (block offset); the data begins at this offset.
• Locate set
• Check if any line in set has matching tag
• Yes + line valid: hit
• Locate data starting at offset

Example: Direct Mapped Cache (E = 1)
• Direct mapped: one line per set
• Assume: cache block size 8 bytes
Address of int: t bits (tag) | 0…01 (set index) | 100 (block offset)
[Figure: S = 2^s sets, each with one line (valid bit v, tag, bytes 0 to 7). The set index bits are used to find the set.]

E-way Set Associative Cache (Here: E = 2)
• E = 2: two lines per set
• Assume: cache block size 8 bytes
Address of short int: t bits (tag) | 0…01 (set index) | 100 (block offset)
[Figure: sets of two lines each (valid bit v, tag, bytes 0 to 7). The set index bits are used to find the set.]

Strided Access Question
• What happens if arrays are accessed in two-power strides?
• Example on the next slide
[Figure: E = 2^e lines per set, S = 2^s sets. Address of word: t bits (tag) | s bits (set index) | b bits (block offset).]

The Strided Access Problem (Blackboard?)
• Example: L1 cache, Core 2 Duo
  • 32 KB, 8-way associative, 64-byte cache block size
  • What are S, E, B? Answer: B = 2^6, E = 2^3, S = 2^6
• Consider an array of ints accessed at stride 2^i, i ≥ 0
  • What is the smallest i such that only one set is used? Answer: i = 10
  • What happens if the stride is 2^9? Answer: two sets are used
• Source of two-power strides?
  • Example: column access of 2-D arrays (images!)

The Memory Mountain
[Figure: the memory mountain. Throughput (MB/sec, 0 to 1200) plotted against stride (words, s1 to s15) and working set size (bytes, 2k to 8m). Labeled features: ridges of temporal locality (L1, L2, mem) and slopes of spatial locality.]
Machine: Pentium III, 550 MHz, 16 KB on-chip L1 d-cache, 16 KB on-chip L1 i-cache, 512 KB off-chip unified L2 cache

Pentium Blocked Matrix Multiply Performance
• Blocking (bijk and bikj) improves performance by a factor of two over unblocked versions (ijk and jik)
  • relatively insensitive to array size
[Figure: cycles/iteration (0 to 60) versus array size n (25 to 400) for kji, jki, kij, ikj, jik, ijk, bijk (bsize = 25), and bikj (bsize = 25).]
• No blocking: (9/8) * n^3
• Blocking: 1/(4B) * n^3

Today
• Exceptional Control Flow
• Processes

Control Flow
• Processors do only one thing:
  • From startup to shutdown, a CPU simply reads and executes (interprets) a sequence of instructions, one at a time
  • This sequence is the CPU's control flow (or flow of control)
[Figure: physical control flow over time: <startup>, inst1, inst2, inst3, …, instn, <shutdown>]

Altering the Control Flow
• Up to now: two mechanisms for changing control flow:
  • Jumps and branches
  • Call and return
  Both react to changes in program state
• Insufficient for a useful system: difficult to react to changes in system state
  • data arrives from a disk or a network adapter
  • instruction divides by zero
  • user hits Ctrl-C at the keyboard
  • system timer expires
• System needs mechanisms for "exceptional control flow"

Exceptional Control Flow
• Exists at all levels of a computer system
• Low level mechanisms
  • Exceptions
    • change in control flow in response to a system event (i.e., change in system state)
  • Combination of hardware and OS software
• Higher level mechanisms
  • Process context switch
  • Signals
  • Nonlocal jumps: setjmp()/longjmp()
  • Implemented by either:
    • OS software (context switch and signals)
    • C language runtime library (nonlocal jumps)

Exceptions
• An exception is a
  transfer of control to the OS in response to some event (i.e., change in processor state)
• Examples: div by 0, arithmetic overflow, page fault, I/O request completes, Ctrl-C
[Figure: an event occurs while the user process is executing I_current. The exception transfers control to the OS, where the exception handler performs the exception processing and then does one of: return to I_current, return to I_next, or abort.]

Interrupt Vectors
• Each type of event has a unique exception number k
• k = index into exception table (a.k.a. interrupt vector)
• Handler k is called each time exception k occurs
[Figure: exception table with entries 0, 1, 2, ..., n-1 (the exception numbers); entry k points to the code for exception handler k.]

Asynchronous Exceptions (Interrupts)
• Caused by events external to the processor
  • Indicated by setting the processor's interrupt pin
  • Handler returns to "next" instruction
• Examples:
  • I/O interrupts
    • hitting Ctrl-C at the keyboard
    • arrival of a packet from a network
    • arrival of data from a disk
  • Hard reset interrupt
    • hitting the reset button
  • Soft reset interrupt
    • hitting Ctrl-Alt-Delete on a PC

Synchronous Exceptions
• Caused by events that occur as a result of executing an instruction:
  • Traps
    • Intentional
    • Examples: system calls, breakpoint traps, special instructions
    • Returns control to "next" instruction
  • Faults
    • Unintentional but possibly recoverable
    • Examples: page faults (recoverable), protection faults (unrecoverable), floating point exceptions
    • Either re-executes faulting ("current") instruction or aborts
  • Aborts