Using MIPS and MFLOPS as Performance Metrics

April 26, 2008

One alternative way to measure CPU performance is MIPS, or million instructions per second. For a given program, MIPS is given by

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} \tag{1}$$

Since

$$\text{Execution time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \tag{2}$$

Equation 1 becomes

$$\text{MIPS} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}} \tag{3}$$

Since MIPS is a rate of operations per unit time, CPU performance can be specified as the inverse of execution time, with faster machines having a higher MIPS rating. However, according to Patterson and Hennessy, there are problems with using MIPS as a performance metric.

• MIPS is dependent on the instruction set of the CPU, making it difficult to compare the MIPS ratings of processors with different instruction sets.
• MIPS can vary inversely to performance.

Consider the MIPS rating of a processor with an optional floating-point unit. Since it generally takes more clock cycles per floating-point instruction than per integer instruction, floating-point programs using the optional hardware instead of software floating-point routines take less time but have a lower MIPS rating. A software floating-point routine executes simpler instructions, resulting in a higher MIPS rating, but it executes so many more instructions that the overall execution time is longer.

We can see similar anomalies with optimizing compilers, as the following example demonstrates.

Example. Let us assume that you have profiled your code and the instruction mix is detailed in Table 1. We now want to build an optimizing compiler for the CPU. The compiler discards 50% of the ALU instructions, although it cannot reduce loads, stores, or branches. Assuming a 20-ns clock cycle time (or a 50-MHz clock), what is the MIPS rating for the optimized code versus the unoptimized code?
Does the MIPS rating agree with the ranking of execution time?

Table 1: The instruction mix and CPIs of individual instructions

Operation        Frequency   CPI
ALU operations   43%         1
Loads            21%         2
Stores           12%         2
Branches         24%         2

Answer. We use the CPU performance formula to compute the CPI of the unoptimized code as

$$\text{CPI}_{\text{unoptimized}} = 0.43 \times 1 + 0.21 \times 2 + 0.12 \times 2 + 0.24 \times 2 = 1.57$$

So,

$$\text{MIPS}_{\text{unoptimized}} = \frac{50\ \text{MHz}}{1.57 \times 10^{6}} = 31.85$$

The performance of the unoptimized code, in terms of execution time, is given by

$$\text{CPU time}_{\text{unoptimized}} = \text{Instruction count}_{\text{unoptimized}} \times 1.57 \times (20 \times 10^{-9}) = 31.4 \times 10^{-9} \times \text{Instruction count}_{\text{unoptimized}}$$

For the optimized code,

$$\text{CPI}_{\text{optimized}} = \frac{\frac{0.43}{2} \times 1 + 0.21 \times 2 + 0.12 \times 2 + 0.24 \times 2}{1 - \frac{0.43}{2}} = 1.73$$

since 50% of the ALU operations have been discarded (0.43/2) and the instruction count is reduced by the missing ALU instructions. Thus,

$$\text{MIPS}_{\text{optimized}} = \frac{50\ \text{MHz}}{1.73 \times 10^{6}} = 28.90$$

The performance of the optimized code, in terms of execution time, is

$$\text{CPU time}_{\text{optimized}} = (0.785 \times \text{Instruction count}_{\text{unoptimized}}) \times 1.73 \times (20 \times 10^{-9}) = 27.2 \times 10^{-9} \times \text{Instruction count}_{\text{unoptimized}}$$

The optimized code takes about 13% less time to run, but its MIPS rating is lower! As this example shows, MIPS can fail to give a true picture of performance in that it does not track execution time.

Another popular alternative to execution time as a performance measure is million floating-point operations per second, or MFLOPS (megaflops). The formula for MFLOPS is simply

$$\text{MFLOPS} = \frac{\text{Number of floating-point operations in a program}}{\text{Execution time} \times 10^{6}} \tag{4}$$

The MFLOPS rating is dependent on the machine and on the program, and since MFLOPS are intended to measure floating-point performance, they are not applicable outside that range. For example, compilers have an MFLOPS rating of nearly zero no matter how fast the CPU is, since compilers rarely use floating-point arithmetic. When comparing the performance of different machines, MFLOPS is not dependable because the set of floating-point operations is not consistent across
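The arithmetic in the optimizing-compiler example (Table 1) can be checked with a short script. The following is a minimal sketch, not part of the original handout: the names `MIX`, `summarize`, and `CLOCK_RATE_HZ` are illustrative assumptions, and execution time is expressed per unoptimized instruction so that the two versions can be compared directly.

```python
# Sketch of the Table 1 example. All names here are illustrative assumptions.

CLOCK_RATE_HZ = 50e6  # 50-MHz clock, i.e. a 20-ns clock cycle time

# Instruction mix from Table 1: operation -> (frequency, CPI)
MIX = {
    "alu": (0.43, 1),
    "load": (0.21, 2),
    "store": (0.12, 2),
    "branch": (0.24, 2),
}


def summarize(mix, clock_rate_hz):
    """Return (relative instruction count, CPI, MIPS, seconds per unoptimized instruction)."""
    rel_count = sum(freq for freq, _ in mix.values())       # fraction of original instructions kept
    cycles = sum(freq * cpi for freq, cpi in mix.values())  # cycles per original instruction
    cpi = cycles / rel_count                                 # average CPI of the remaining instructions
    mips = clock_rate_hz / (cpi * 1e6)                       # Equation 3
    time_per_instr = cycles / clock_rate_hz                  # Equation 2, per original instruction
    return rel_count, cpi, mips, time_per_instr


# The optimizer discards half of the ALU instructions and nothing else.
optimized_mix = dict(MIX)
optimized_mix["alu"] = (MIX["alu"][0] / 2, MIX["alu"][1])

unopt = summarize(MIX, CLOCK_RATE_HZ)
opt = summarize(optimized_mix, CLOCK_RATE_HZ)

for label, (count, cpi, mips, t) in (("unoptimized", unopt), ("optimized", opt)):
    print(f"{label}: count={count:.3f} CPI={cpi:.2f} MIPS={mips:.2f} time={t * 1e9:.1f} ns/instr")
```

Because the script does not round the optimized CPI to 1.73 before reusing it, it reports roughly 28.97 MIPS and 27.1 ns per unoptimized instruction rather than the 28.90 and 27.2 shown in the hand calculation. The conclusion is unchanged: the optimized code has the lower MIPS rating and the shorter execution time.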