COSC 6385 – Computer Architecture
Performance Measurement
Edgar Gabriel
Spring 2011

Measuring performance (I)
• Response time: how long it takes to execute a certain application or a certain amount of work
• Given two platforms X and Y, X is n times faster than Y for a certain application if
    $n = \frac{\mathrm{Time}_Y}{\mathrm{Time}_X}$    (1)
• The performance of X is n times higher than the performance of Y if
    $n = \frac{\mathrm{Time}_Y}{\mathrm{Time}_X} = \frac{1/\mathrm{Perf}_Y}{1/\mathrm{Perf}_X} = \frac{\mathrm{Perf}_X}{\mathrm{Perf}_Y}$    (2)

Measuring performance (II)
• Timing how long an application takes
– Wall clock time/elapsed time: time to complete a task as seen by the user. Might include operating system overhead or the effect of other, interfering applications.
– CPU time: does not include time slices introduced by external sources (e.g. running other applications). CPU time can be further divided into
  • User CPU time: CPU time spent in the program
  • System CPU time: CPU time spent in the OS performing tasks requested by the program.

Measuring performance
• E.g. using the UNIX time command
[Figure: output of the UNIX time command, annotated with elapsed time, user CPU time, and system CPU time]

Amdahl’s Law
• Describes the performance gain obtained by enhancing one part of the overall system (code, computer)
• Amdahl’s Law depends on two factors:
– Fraction of the execution time affected by the enhancement
– The improvement gained by the enhancement for this fraction
    $\mathrm{Speedup} = \frac{\mathrm{Time}_{org}}{\mathrm{Time}_{enh}} = \frac{\mathrm{Perf}_{enh}}{\mathrm{Perf}_{org}}$    (3)
    $\mathrm{Time}_{enh} = \mathrm{Time}_{org} \left( (1 - \mathrm{Fraction}_{enh}) + \frac{\mathrm{Fraction}_{enh}}{\mathrm{Speedup}_{enh}} \right)$    (4)
    $\mathrm{Speedup}_{overall} = \frac{\mathrm{Time}_{org}}{\mathrm{Time}_{enh}} = \frac{1}{(1 - \mathrm{Fraction}_{enh}) + \frac{\mathrm{Fraction}_{enh}}{\mathrm{Speedup}_{enh}}}$    (5)

Amdahl’s Law (III)
[Figure: overall speedup (y-axis) as a function of the enhanced speedup (x-axis, 0 to 100), one curve each for an enhanced fraction of 20%, 40%, 60%, and 80%, according to formula (5)]

Amdahl’s Law (IV)
[Figure: "Speedup according to Amdahl's Law": overall speedup (y-axis) as a function of the enhanced fraction (x-axis, 0 to 1), one curve each for an enhanced speedup of 2, 4, and 10]

Amdahl’s Law - example
• Assume a new web-server with a CPU that is 10 times faster on computation than the previous web-server. I/O performance is not improved compared to the old machine. The web-server spends 40% of its time in computation and 60% in I/O. How much faster is the new machine overall?
• Using formula (5) with $\mathrm{Fraction}_{enh} = 0.4$ and $\mathrm{Speedup}_{enh} = 10$:
    $\mathrm{Speedup}_{overall} = \frac{1}{(1 - \mathrm{Fraction}_{enh}) + \frac{\mathrm{Fraction}_{enh}}{\mathrm{Speedup}_{enh}}} = \frac{1}{(1 - 0.4) + \frac{0.4}{10}} = \frac{1}{0.64} = 1.56$

Amdahl’s Law – example (II)
• Example: Consider a graphics card
– 50% of its total execution time is spent in floating point operations
– 20% of its total execution time is spent in floating point square root operations (FPSQR)
• Option 1: improve the FPSQR operation by a factor of 10.
• Option 2: improve all floating point operations by a factor of 1.6.
    $\mathrm{Speedup}_{FPSQR} = \frac{1}{(1 - 0.2) + \frac{0.2}{10}} = \frac{1}{0.82} = 1.22$
    $\mathrm{Speedup}_{FP} = \frac{1}{(1 - 0.5) + \frac{0.5}{1.6}} = \frac{1}{0.8125} = 1.23$
• Option 2 is slightly faster.

CPU Performance Equation
• Micro-processors are based on a clock running at a constant rate
• Clock cycle time $CC_{time}$: length of the discrete time event in ns
• Equivalent measure: rate $r_{CPU}$, expressed in MHz or GHz
    $CC_{time} = \frac{1}{r_{CPU}}$
• The CPU time of a program can then be expressed as
    $CPU_{time} = no_{cycles} \times CC_{time}$    (6)
  or
    $CPU_{time} = \frac{no_{cycles}}{r_{CPU}}$    (7)

CPU performance equation (II)
• CPI: average number of clock cycles per instruction
• IC: number of instructions
    $CPI = \frac{no_{cycles}}{IC}$    (8)
• Since the (average) CPI is often known, the CPU time is
    $CPU_{time} = IC \times CPI \times CC_{time}$    (9)
• Expanding formula (6) leads to
    $CPU_{time} = \frac{\mathrm{instructions}}{\mathrm{program}} \times \frac{\mathrm{cycles}}{\mathrm{instruction}} \times \frac{\mathrm{time}}{\mathrm{cycle}}$    (10)

CPU performance equation (III)
• According to (9), CPU performance depends on
– Clock cycle time → Hardware technology
– CPI → Organization and instruction set architecture
– Instruction count → ISA and compiler technology
• Note: on the last slide we used the average CPI over all instructions occurring in an application
• Different instructions can have strongly varying CPIs →
    $no_{cycles} = \sum_{i=1}^{n} IC_i \times CPI_i$    (11)
    $CPU_{time} = \sum_{i=1}^{n} IC_i \times CPI_i \times CC_{time}$    (12)

CPU performance equation (IV)
• The average CPI for an application can then be calculated as
    $CPI = \frac{\sum_{i=1}^{n} IC_i \times CPI_i}{IC_{total}} = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i$    (13)
• $\frac{IC_i}{IC_{total}}$: fraction of occurrence of that instruction in a program

Example (I)
• (Page 43 in the 4th Edition) Consider a graphics card, with
– FP operations (including FPSQR): frequency 25%, average CPI 4.0
– FPSQR operations only: frequency 2%, average CPI 20
– all other instructions: average CPI 1.3333333
• Design option 1: decrease CPI of FPSQR to 2
• Design option 2: decrease CPI of all FP operations to 2.5
• Using formula (13):
    $CPI_{org} = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i = (4 \times 0.25) + (1.333333 \times 0.75) = 2.0$
    $CPI_1 = CPI_{org} - 0.02 \times (20 - 2) = 2.0 - 0.36 = 1.64$
    $CPI_2 = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i = (2.5 \times 0.25) + (1.333333 \times 0.75) = 1.625$

Example (II)
• Slightly modified compared to the previous section: consider a graphics card, with
– FP operations (excluding FPSQR): frequency 25%, average CPI 4.0
– FPSQR operations: frequency 2%, average CPI 20
– all other instructions: average CPI 1.33
• Design option 1: decrease CPI of FPSQR to 2
• Design option 2: decrease CPI of all FP operations to 2.5
• Using formula (13):
    $CPI_{org} = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i = (4 \times 0.25) + (20 \times 0.02) + (1.33 \times 0.73) = 2.3709$
    $CPI_1 = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i = (4 \times 0.25) + (2 \times 0.02) + (1.33 \times 0.73) = 2.0109$
    $CPI_2 = \sum_{i=1}^{n} \frac{IC_i}{IC_{total}} \times CPI_i = (2.5 \times 0.25) + (20 \times 0.02) + (1.33 \times 0.73) = 1.9959$

Dependability
• Module reliability measures
– MTTF: mean time to failure
– FIT: failures in time
  • Often expressed as failures in 1,000,000,000 hours
– MTTR: mean time to repair
– MTBF: mean time between failures
• Module availability:
    $MA = \frac{MTTF}{MTTF + MTTR}$    (14)
    $MTBF = MTTF + MTTR$    (15)
    $FIT = \frac{1}{MTTF}$    (16)

Dependability - example
• Assume a disk subsystem with the following components and MTTFs:
– 10 disks, MTTF=1,000,000h
– 1 SCSI controller, MTTF=500,000h
– 1 power supply, MTTF=200,000h
– 1
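
The worked examples for Amdahl’s Law (formula (5)) and for the average CPI (formula (13)) can be cross-checked with a few lines of code. The following is a minimal Python sketch that is not part of the original slides; the function names `amdahl_speedup` and `average_cpi` are hypothetical and chosen only for illustration. It recomputes the web-server example, the two graphics-card options, and the three CPI values of Example (II).

```python
def amdahl_speedup(fraction_enh: float, speedup_enh: float) -> float:
    """Overall speedup according to formula (5).

    fraction_enh: fraction of the original execution time that is enhanced (0..1)
    speedup_enh:  speedup of the enhanced fraction (> 0)
    """
    if not 0.0 <= fraction_enh <= 1.0:
        raise ValueError("fraction_enh must lie in [0, 1]")
    if speedup_enh <= 0.0:
        raise ValueError("speedup_enh must be positive")
    return 1.0 / ((1.0 - fraction_enh) + fraction_enh / speedup_enh)


def average_cpi(mix: list[tuple[float, float]]) -> float:
    """Average CPI according to formula (13).

    mix: (frequency, cpi) pairs, where frequency is IC_i / IC_total.
    """
    total = sum(freq for freq, _ in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    return sum(freq * cpi for freq, cpi in mix)


# Amdahl's Law: web-server example and graphics-card options 1 and 2
web_server = amdahl_speedup(0.4, 10)
option_fpsqr = amdahl_speedup(0.2, 10)
option_fp = amdahl_speedup(0.5, 1.6)
print(f"web server: {web_server:.2f}")    # 1.56
print(f"FPSQR x10:  {option_fpsqr:.2f}")  # 1.22
print(f"FP x1.6:    {option_fp:.2f}")     # 1.23

# Average CPI, Example (II): FP 25%, FPSQR 2%, all other instructions 73%
cpi_org = average_cpi([(0.25, 4.0), (0.02, 20.0), (0.73, 1.33)])
cpi_1 = average_cpi([(0.25, 4.0), (0.02, 2.0), (0.73, 1.33)])
cpi_2 = average_cpi([(0.25, 2.5), (0.02, 20.0), (0.73, 1.33)])
print(f"CPI_org: {cpi_org:.4f}")  # 2.3709
print(f"CPI_1:   {cpi_1:.4f}")    # 2.0109
print(f"CPI_2:   {cpi_2:.4f}")    # 1.9959
```

Since instruction count and clock cycle time are the same for all three designs, formula (9) implies that the design with the lowest average CPI also has the lowest CPU time.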


UH COSC 6385 - Performance Measurement
