CS152 Computer Architecture and Engineering
Lecture 9: Performance
2004-09-28

Dave Patterson (www.cs.berkeley.edu/~patterson)
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/

CS 152 L09: Performance, UC Regents Fall 2004, UCB

Last Time: Microcode, Multi-Cycle

Today's Lecture: Performance
Measurement: what, why, how
The performance equation
Amdahl's law
How energy limits performance

Performance Measurement (as seen by the customer)

Who (sensibly) upgrades CPUs often?
A professional who turns CPU cycles into money, and who is cycle-limited.
Artist tool: animation, video special effects.

How to decide to buy a new machine?
Measure After Effects "execution time" on a representative render "workload".
Workload: "Night Flight" [Night Flight movie goes here]
City map and clouds computed on the fly with fractals.
CPU intensive. Trivial I/O.

Interpreting Execution Time
PowerBook G4 1.25 GHz: Execution Time = 1265 seconds
Performance = 1 / Execution Time = 2.85 renders/hour
1.5 GHz PB (Y) is N times faster than 1.25 GHz PB (X). N is?
N = Performance(Y) / Performance(X) = Execution Time(X) / Execution Time(Y) = 1.19
PB 1.5 GHz: 3.4 renders/hour
PB 1.25 GHz: 2.85 renders/hour

2 CPUs: Execution Time vs Throughput
Execution Time: time for 1 job to complete.
Throughput: jobs/hour completed (not serialized).
1.8x faster: 2 CPUs vs 1 CPU, otherwise similar. What does this imply?
Assume G5 MP execution time is faster because AE does not use both Opteron CPUs.
Could G5 and Opteron have similar throughput? Why?

Performance Measurement (as seen by a CPU designer)
Q. Why do we care about After Effects' performance?
A. We want the CPU we are designing to run it well.

Step 1: Analyze the right measurement
CPU Time: time the CPU spends running the program under measurement. Guides CPU design.
How to measure CPU time:
    % time <program name>
    25.77u 0.72s 0:29.17 90.8%
Response Time: total time = CPU Time + time spent waiting (for disk, I/O). Guides system design.
How do designers use these two numbers?

Administrivia: Adjust Class Time?
We have permission to stay in this room past 12:30.
Does anyone have a class that starts at 12:40?
Class time options (all "sharp" time):
A. Lecture from 11:10 to 12:30
B. Lecture from 11:15 to 12:35
C. Lecture from 11:20 to 12:40

Administrivia: Mid-Term is Coming
Mid-term: Tuesday 10/12, 5:30 to 8:30 PM, 101 Morgan.
No class on Tuesday.
After exam: pizza at LaVal's, on us.
Mid-term review session: Sunday 10/10, 7 to 9 PM, 306 Soda.

Administrivia: This Week's Deadlines
Homework 2 due 9/29 (tomorrow), 283 Soda, in CS 152 box, at 5 PM.
Lab 2: Xilinx demo on Friday 10/1.
Lab 2 due Monday 10/4, 11:59 PM.
On Tuesday 10/5, onto the Pipelining Lab.

CPU time: Proportional to Instruction Count
CPU time / Program is proportional to Machine Instructions / Program
Rationale: every additional instruction you execute takes time.
Q. Once the ISA is set, who can influence instruction count?
A. Compiler writer, application developer.
Q. Static count (lines of program printout)? Or dynamic count (trace of execution)?
A. Dynamic.
Q. What type of computer architect influences the number of instructions a given program needs?
A. Instruction set architect.

CPU time: Proportional to Clock Period
Time / Program is proportional to Time / One Clock Period
Rationale: we measure each instruction's execution time in "number of cycles". By shortening the period for each cycle, we shorten execution time.
Q. How can architects (not technologists) reduce clock period?
A. Shorten the machine's critical path.
Q. What ultimately limits an architect's ability to reduce clock period?
A. Clock-to-Q, setup times.

Completing the performance equation
Seconds / Program = (Instructions / Program) × (Cycles / Instruction) × (Seconds / Cycle)
We need all three terms, and only these terms, to compute CPU Time.
CPI: "The Average Number of Clock Cycles Per Instruction For the Program".
What factors make the CPI for a program differ from the underlying CPI of a CPU implementation?
Instruction mix varies.
Cache behavior varies.
Branch prediction varies.
When is it OK to compare clock …

CPI as an analytical tool to guide design
Machine CPI × Program Instruction Mix:
(5 × 30 + 1 × 20 + 2 × 20 + 2 × 10 + 2 × 20) / 100 = 2.7 cycles/instruction
[Chart: where program spends its time]
Q. We lower machine multiply CPI, but program runs slower.

Amdahl's Law (of Diminishing Returns)
[Chart: where program spends its time]
If enhancement "E" speeds up multiply, but other instructions are unchanged, what is the maximum speedup "S"?
Smax = 1 / (1 − % affected / 100) = 1 / (1 − 50 / 100) = 2
Attributed to Gene Amdahl: "Amdahl's Law".
What is the lesson of Amdahl's Law? Must enhance computers in a …

Peer Instruction: Amdahl's Law
The program spends 30% of its time running code that can not be recoded to run in parallel.
Program we wish to run on N CPUs. Compute speedup for N = 2, 3, 4, 5, and ∞.
    CPUs:     2     3     4     5     ∞
    Speedup:

Peer Instruction: Amdahl's Law
The program spends 30% of its time in serial code.
Program we wish to run on N CPUs. Compute speedup for N = 2, 3, 4, 5, and ∞.
S = 1 / ((30 + 70 / N) / 100)
    CPUs:     2     3     4     5     ∞
    Speedup:  1.54  1.88  2.1   2.3   3.3

Final thoughts: Performance Equation
Seconds / Program = (Instructions / Program) × (Cycles / Instruction) × (Seconds / Cycle)
Goal is to optimize execution time, not individual equation terms.
Instructions / Program: machines are optimized with respect to program workloads.
Cycles / Instruction: the CPI of the program. Reflects the program's instruction mix.
Seconds / Cycle: clock period. Optimize jointly with machine's CPI.

Energy and Performance
1 Joule of energy is dissipated by a 1 Amp current flowing through a 1 Ohm resistor for 1 second.
Also, 1 Watt for 1 second.
1 Watt: 1 Amp flowing through 1 Ohm.
1 Joule = 0.24 calories. 1 calorie raises 1 gram of water 1 °C.
Sad fact: computers turn electrical energy into heat. Computation is a byproduct.
Air or water carries …
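
The arithmetic on the performance slides above (reciprocal of execution time, the weighted-average CPI, the three-term performance equation, and the Amdahl's Law peer-instruction exercise) can be checked with a short script. This is a minimal sketch: the function names are hypothetical and not part of the lecture, and the input numbers are only the ones quoted on the slides (1265 seconds, the five-entry instruction mix, and a 30% serial fraction).

```python
# Illustrative sketch only: function names are assumptions, not lecture material.


def renders_per_hour(execution_time_s):
    """Performance as the reciprocal of execution time, scaled to renders/hour."""
    return 3600.0 / execution_time_s


def average_cpi(mix):
    """Weighted-average CPI from (cycles, percent_of_instructions) pairs."""
    total_percent = sum(percent for _, percent in mix)
    return sum(cycles * percent for cycles, percent in mix) / total_percent


def cpu_time_s(instruction_count, cpi, clock_period_s):
    """Seconds/Program = Instructions/Program x Cycles/Instruction x Seconds/Cycle."""
    return instruction_count * cpi * clock_period_s


def amdahl_speedup(serial_fraction, n_cpus):
    """Speedup when only the non-serial fraction of the time is divided by n_cpus."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)


if __name__ == "__main__":
    # PowerBook G4 1.25 GHz execution time from the slide.
    print("renders/hour:", round(renders_per_hour(1265), 2))

    # Machine CPI x program instruction mix (percent) from the CPI slide.
    slide_mix = [(5, 30), (1, 20), (2, 20), (2, 10), (2, 20)]
    print("average CPI:", average_cpi(slide_mix))

    # Peer-instruction exercise: 30% of the time is serial code.
    for n in (2, 3, 4, 5, float("inf")):
        print("N =", n, "speedup =", round(amdahl_speedup(0.30, n), 2))
```

Passing `float("inf")` for the CPU count makes the parallel term vanish, which gives the limiting speedup of one over the serial fraction.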