CS152 Computer Architecture and Engineering
Lecture 9: Performance
2004-09-28

Dave Patterson (www.cs.berkeley.edu/~patterson)
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
www-inst.eecs.berkeley.edu/~cs152/

CS 152 L09 Performance, UC Regents Fall 2004 © UCB


Last Time: Microcode, Multi-Cycle
(recap slides 27-30 from CS 152 L09 Multicycle)


Adding the Dispatch ROM

Sequencer-based control. The sequencer register is called the "microPC" or "µPC", as opposed to a state register.

  Control Value   Effect
  00              Next µaddress = 0
  01              Next µaddress = dispatch ROM
  10              Next µaddress = µaddress + 1

Dispatch ROM contents:

  Instruction   Opcode   Dispatch ROM output
  R-type        000000   0100
  BEQ           000100   0011
  ori           001101   0110
  LW            100011   1000
  SW            101011   1011

[Figure: the opcode indexes the dispatch ROM; an adder computes microPC + 1; a mux selects the next microPC value from 0, the dispatch ROM output, or microPC + 1.]


Microprogramming

[Figure: the "Code ROM" holds microinstructions. Each microinstruction supplies sequencer control and datapath control. The µ-sequencer (fetch, dispatch, sequential) uses select logic to choose the next micro-PC. Inputs are the opcode, via decode and the dispatch ROM. Outputs are decoded and sent to the datapath.]


Microprogramming

Microprogramming is a convenient method for implementing structured control state diagrams:
  Random logic is replaced by a microPC sequencer and a ROM.
  Each line of the ROM is called a microinstruction. It contains sequencer control plus values for control points.
  State transitions are limited: branch to zero, next sequential, or branch to the microinstruction address from the dispatch ROM.

Control design reduces to microprogramming.
  Part of the design process is to develop a "language" that describes control and is easy for humans to understand.


"Macroinstruction" Interpretation

To reduce confusion, a normal instruction (e.g. MIPS addu) is called a macroinstruction.

[Figure: main memory holds the user program (ADD, SUB, AND, ...) plus data; "this can change!". In the CPU, the execution unit is driven by control memory. One of these (a macroinstruction in main memory) is mapped into one of these (a microsequence in control memory). Example: the AND microsequence is Fetch, Calc Operand Addr, Fetch Operand(s), Calculate, Save Answer(s).]


Today's Lecture: Performance

  Performance measurement: what, why, how
  The performance equation
  Amdahl's law
  How energy limits performance


Performance Measurement (as seen by the customer)


Who (sensibly) upgrades CPUs often?

A professional who turns CPU cycles into money, and who is cycle-limited.
Artist tool: animation, video special effects.


How to decide to buy a new machine?

Measure After Effects "execution time" on a representative render "workload".
  "Night flight": city map and clouds computed "on the fly" with fractals.
  CPU intensive. Trivial I/O.


Interpreting Execution Time

PowerBook G4, 1.25 GHz. Execution time: 1265 seconds.
Performance = 1 / Execution Time = 2.85 renders/hour.

The 1.5 GHz PB (Y) is N times faster than the 1.25 GHz PB (X). N is?

  N = Performance(Y) / Performance(X) = Execution Time(X) / Execution Time(Y) = 1.19

  PB 1.5 GHz:   3.4 renders/hour
  PB 1.25 GHz:  2.85 renders/hour

Does artist productivity really increase?


2 CPUs: Execution Time vs Throughput

2 CPUs vs 1 CPU, otherwise similar: 1.8x faster. What does this imply?

  Execution Time: time for 1 job to complete.
  Throughput: number of jobs/hour completed (not serialized).

Assume G5 MP execution time is faster because AE does not use both Opteron CPUs. Could the G5 and the Opteron have similar throughput? Why?


Performance Measurement (as seen by a CPU designer)

Q. Why do we care about After Effects' performance?
A. We want the CPU we are designing to run it well!


Step 1: Analyze the right measurement!

  CPU Time: time the CPU spends running the program under measurement. Guides CPU design.
  Response Time: total time, i.e. CPU Time + time spent waiting (for disk, I/O, ...). Guides system design.

How to measure CPU time:

  % time <program name>
  25.77u 0.72s 0:29.17 90.8%

How do designers use these two numbers?


Administrivia: Adjust Class Time

We have permission to stay in this room past 12:30. Does anyone have a class that starts at 12:40?

Class time options (all sharp time):
  A: Lecture from 11:10 to 12:30
  B: Lecture from 11:15 to 12:35
  C: Lecture from 11:20 to 12:40


Administrivia: Mid-Term is Coming

  Mid-term: Tuesday 10/12, 5:30 - 8:30 PM, 101 Morgan. No class on Tuesday.
  After exam: pizza at LaVal's, on us.
  Mid-term review session: Sunday 10/10, 7 - 9 PM, 306 Soda.


Administrivia: This Week's Deadlines

  Homework 2 due 9/29 (tomorrow), 283 Soda in the CS 152 box, at 5 PM.
  Lab 2 Xilinx demo on Friday 10/1.
  Lab 2 due Monday 10/4, 11:59 PM.
  On Tuesday 10/5, onto the Pipelining Lab.


CPU time: Proportional to Instruction Count

  CPU time / Program  is proportional to  Machine Instructions / Program

Rationale: every additional instruction you execute takes time.

Q. Static count (lines of program printout)? Or dynamic count (trace of execution)?
A. Dynamic.

Q. Once the ISA is set, who can influence instruction count?
A. Compiler writer, application developer.

Q. What type of computer architect influences the number of instructions a given program needs?
A. Instruction set architect.


CPU time: Proportional to Clock Period

  Time / Program  is proportional to  Time / One Clock Period

Rationale: we measure each instruction's execution time in "number of cycles". By shortening the period of each cycle, we shorten execution time.

Q. How can architects (not technologists) reduce clock period?
A. Shorten the machine's critical path.

Q. What ultimately limits an architect's ability to reduce clock period?
A. Clock-to-Q, setup times.


Completing the performance equation

  Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)

We need all three terms, and only these terms, to compute CPU Time!

CPI: the average number of clock cycles per instruction, for the program.

What factors make the CPI for a program differ from the underlying CPI of a CPU implementation?
  Instruction mix varies.
  Cache behavior varies.
  Branch prediction varies.

When is it OK to compare clock rates?


CPI as an analytical tool to guide design

[Figure: three panels titled "Program Instruction Mix", "Machine CPI", and "Where program spends its time", with categories Multiply, Load, Other ALU, and Branch. The per-category values are not recoverable from this extraction, except that Multiply accounts for 50% of the time the program spends.]


Amdahl's Law (of Diminishing Returns)

Where the program spends its time: Multiply = 50%.

If enhancement "E" speeds up multiply, but other instructions are unchanged, what is the maximum speedup "S"?

  Smax = 1 / (1 - affected) = 1 / (1 - 50/100) = 2
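The maximum-speedup question above can be checked numerically. The sketch below is an illustration, not part of the original slides. It assumes the usual statement of Amdahl's law, in which a fraction of the original execution time is sped up by some factor and the rest is unchanged. The function names amdahl_speedup and amdahl_max_speedup and their parameters are hypothetical.

```python
def amdahl_speedup(fraction_affected, enhancement_speedup):
    """Overall speedup when `fraction_affected` of the original execution
    time runs `enhancement_speedup` times faster and the rest is unchanged.

    Both names are illustrative; they do not come from the lecture.
    """
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be between 0 and 1")
    if enhancement_speedup <= 0:
        raise ValueError("enhancement_speedup must be positive")
    # New time = unaffected part + affected part shrunk by the enhancement.
    new_time = (1.0 - fraction_affected) + fraction_affected / enhancement_speedup
    return 1.0 / new_time


def amdahl_max_speedup(fraction_affected):
    """Limit of the overall speedup as the enhancement becomes infinitely fast."""
    if not 0.0 <= fraction_affected <= 1.0:
        raise ValueError("fraction_affected must be between 0 and 1")
    if fraction_affected == 1.0:
        return float("inf")  # the whole program is affected, so no bound
    return 1.0 / (1.0 - fraction_affected)


if __name__ == "__main__":
    multiply_fraction = 0.50  # multiply takes 50% of execution time (from the slide)
    for factor in (2, 10, 100):
        print(factor, amdahl_speedup(multiply_fraction, factor))
    print("max", amdahl_max_speedup(multiply_fraction))
```

With multiply at 50% of execution time, no enhancement to multiply alone can make the program more than 2x faster, which is the "diminishing returns" in the slide title.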