Berkeley COMPSCI 152 - Lecture 7 – Performance



UC Regents Spring 2005 © UCB    CS 152 L7: Performance

2005-2-8
John Lazzaro (www.cs.berkeley.edu/~lazzaro)
CS 152 Computer Architecture and Engineering
Lecture 7 – Performance
www-inst.eecs.berkeley.edu/~cs152/
TAs: Ted Hong and David Marquardt

Last Time: Tips for Teamwork

Example: 3 members want to do the design one way; member number 4 does not agree.
Solution #1: Voting. "Fair". But, what if the "loser" was technically correct?
Solution #2: Consensus. Keeping in mind the goal (correctly working CPU on the board on schedule), what option brings the group closer to the goal?
Never lose sight of the goal!

Today's Lecture - Performance

Measurement: what, why, how
The performance equation
How energy limits performance
Amdahl's law
Also: news about PlayStation 3 "Cell" processor

Performance Measurement (as seen by the customer)

Who (sensibly) upgrades CPUs often?

A professional who turns CPU cycles into money, and who is cycle-limited.
Artist tool: animation, video special effects.

How to decide to buy a new machine?

Measure After Effects "execution time" on a representative render "workload".
"Night flight": city map and clouds computed "on the fly" with fractals. CPU intensive. Trivial I/O.

Interpreting Execution Time

PowerBook G4 1.25 GHz. Execution Time: 1265 seconds.

\[ \text{Performance} = \frac{1}{\text{Execution Time}} = 2.85\ \text{renders/hour} \]

1.5 GHz PB (Y) is N times faster than 1.25 GHz PB (X). N is ?

\[ N = \frac{\text{Performance}(Y)}{\text{Performance}(X)} = \frac{\text{Execution Time}(X)}{\text{Execution Time}(Y)} = 1.19 \]

PB 1.5 GHz: 3.4 renders/hour.
PB 1.25 GHz: 2.85 renders/hour.
Does artist productivity really increase?

2 CPUs: Execution Time vs Throughput

Execution Time: Time for 1 job to complete.
Throughput: # of parallel jobs/hour completed.
2 CPUs vs 1 CPU, otherwise similar: 1.8x faster. What does this imply?
Assume G5 MP execution time faster because AE does not use both Opteron CPUs. Could G5 and Opteron have similar Throughput? Why?

Performance Measurement (as seen by a CPU designer)

Q. Why do we care about After Effects' performance?
A. We want the CPU we are designing to run it well!

Step 1: Analyze the right measurement!

CPU Time: Time the CPU spends running program under measurement. Guides CPU design.
Response Time: Total time: CPU Time + time spent waiting (for disk, I/O, ...). Guides system design.

How to measure CPU time?

% time <program name>
25.77u 0.72s 0:29.17 90.8%

CPU time: Proportional to Instruction Count

\[ \frac{\text{CPU time}}{\text{Program}} \propto \frac{\text{Machine Instructions}}{\text{Program}} \]

Rationale: Every additional instruction you execute takes time.
Q. Static count? (lines of program printout) Or dynamic count? (trace of execution)
A. Dynamic.
Q. What type of computer architect influences the number of instructions a given program needs?
A. Instruction set architect.
Q. Once ISA is set, who can influence instruction count?
A. Compiler writer, application developer.

CPU time: Proportional to Clock Period

\[ \frac{\text{Time}}{\text{Program}} \propto \frac{\text{Time}}{\text{One Clock Period}} \]

Rationale: We measure each instruction's execution time in "number of cycles". By shortening the period for each cycle, we shorten execution time.
Q. What ultimately limits an architect's ability to reduce clock period?
A. Clock-to-Q, setup times.
Q. How can architects (not technologists) reduce clock period?
A. Shorten the machine critical path.

Completing the performance equation

\[ \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]

We need all three terms, and only these terms, to compute CPU Time!
"CPI" -- The Average Number of Clock Cycles Per Instruction For the Program.
When is it OK to compare clock rates?
What factors make the CPI for a program differ from the underlying CPI of a CPU implementation?
Instruction mix varies.
Cache behavior varies.
Branch prediction varies.

CPI as an analytical tool to guide design

Instruction   Machine CPI   Program instruction mix   Where program spends its time
Multiply      5             30%                       56%
Other ALU     1             20%                       7%
Load          2             20%                       15%
Store         2             10%                       7%
Branch        2             20%                       15%

\[ \text{CPI} = \frac{5 \times 30 + 1 \times 20 + 2 \times 20 + 2 \times 10 + 2 \times 20}{100} = 2.7\ \text{cycles/instruction} \]

Amdahl's Law (of Diminishing Returns)

If enhancement "E" speeds up multiply, but other instructions are unchanged, what is the maximum speedup S?

Where program spends its time: Multiply 52%, Other ALU 8%, Load 16%, Store 8%, Branch 16%.

\[ S_{\max} = \frac{1}{\text{un-enhanced \%} / 100\%} = \frac{1}{48\% / 100\%} = 2.08 \]

Attributed to Gene Amdahl -- "Amdahl's Law".
What is the lesson of Amdahl's Law? Must enhance computers in a balanced way!

Invented the "one ISA, many implementations" business model.

Amdahl's Law in Action

Program we wish to run on N CPUs: 30% serial, 70% parallel.
The program spends 30% of its time running code that cannot be recoded to run in parallel.
Compute speedup for N = 2, 3, 4, 5, and ∞.

A law of diminishing returns ...

\[ S = \frac{1}{(30\% + (70\% / N)) / 100\%} \]

CPUs      2      3      4     5     ∞
Speedup   1.54   1.88   2.1   2.3   3.3

(Plot: speedup vs. # CPUs, leveling off toward S(∞).)

Final thoughts: Performance Equation

\[ \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]

Goal is to optimize execution time, not individual equation terms.
Instructions/Program: Machines are optimized with respect to program workloads.
Cycles/Instruction: The CPI of the program. Reflects the program's instruction mix.
Seconds/Cycle: Clock period. Optimize jointly with machine CPI.

Administrivia: Upcoming deadlines ...

Thursday 2/17: At 11:59 PM via email: Lab 2 peer evaluations, and Lab 3 preliminary design document due. (More details on Lab 3 on Thursday)
Monday 2/14: Lab 2 final report due via the submit program, 11:59 PM.
Friday 2/11: "Xilinx Checkoff", 12-1, 119 Cory. For 61(c)
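
The CPI and Amdahl's Law arithmetic from the slides can be checked with a short Python sketch. It is an illustration only, not part of the lecture: the function names (`average_cpi`, `time_share`, `cpu_time`, `amdahl_speedup`) are made up here, and the instruction count and 1 ns clock period passed to `cpu_time` are assumed values chosen for the example. The per-class cycle counts and the 30/20/20/10/20 instruction mix are taken from the CPI slide.

```python
# Per-class cycle counts and instruction mix, copied from the CPI slide.
MACHINE_CPI = {"Multiply": 5, "Other ALU": 1, "Load": 2, "Store": 2, "Branch": 2}
INSTRUCTION_MIX = {
    "Multiply": 0.30,
    "Other ALU": 0.20,
    "Load": 0.20,
    "Store": 0.10,
    "Branch": 0.20,
}


def average_cpi(cpi, mix):
    """Average cycles per instruction, weighted by the instruction mix."""
    return sum(cpi[name] * mix[name] for name in mix)


def time_share(cpi, mix):
    """Fraction of all cycles spent in each instruction class."""
    total = average_cpi(cpi, mix)
    return {name: cpi[name] * mix[name] / total for name in mix}


def cpu_time(instructions, cpi, clock_period_s):
    """Seconds/Program = Instructions/Program * Cycles/Instruction * Seconds/Cycle."""
    return instructions * cpi * clock_period_s


def amdahl_speedup(enhanced_fraction, factor):
    """Overall speedup when `enhanced_fraction` of the run time is sped up by `factor`."""
    return 1.0 / ((1.0 - enhanced_fraction) + enhanced_fraction / factor)


if __name__ == "__main__":
    cpi = average_cpi(MACHINE_CPI, INSTRUCTION_MIX)
    print(f"average CPI: {cpi:.2f} cycles/instruction")

    for name, share in time_share(MACHINE_CPI, INSTRUCTION_MIX).items():
        print(f"  {name:<10} {share:.1%} of cycles")

    # Assumed example values: 1e6 dynamic instructions, 1 ns clock period.
    print(f"CPU time: {cpu_time(1_000_000, cpi, 1e-9):.6f} s")

    # 70% of the program is parallel; N CPUs speed that part up by N.
    for n in (2, 3, 4, 5, float("inf")):
        print(f"N = {n}: speedup {amdahl_speedup(0.70, n):.3f}")
```

Passing `float("inf")` as the factor gives the limiting case, where the enhanced part takes no time and the speedup is bounded by the un-enhanced fraction alone.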

