ISU CPRE 381 - Book's Definition of Performance

Outline:
Book's Definition of Performance
Example
Now that we understand cycles
Performance
CPI Example
# of Instructions Example
MIPS example
Benchmarks
SPEC '89
SPEC '95
Slide 11
Amdahl's Law
Slide 13
Remember
Where we are headed
MIPS Instruction Format Again
Operation for Each Instruction
Multicycle Approach
Review: finite state machines
Multi-Cycle DataPath Operation
Five Execution Steps
Step 1: Instruction Fetch
Step 2: Instruction Decode and Register Fetch
Step 3 (instruction dependent)
Step 4 (R-type or memory-access)
Write-back step
Summary:
Instruction Format
Slide 29
Slide 30
LW Operation on Multi-Cycle Data Path: C1
LW Operation on Multi-Cycle Data Path: C2
LW Operation on Multi-Cycle Data Path: C3
LW Operation on Multi-Cycle Data Path: C4
LW Operation on Multi-Cycle Data Path: C5
SW Operation on Multi-Cycle Data Path: C1
SW Operation on Multi-Cycle Data Path: C2
SW Operation on Multi-Cycle Data Path: C3
SW Operation on Multi-Cycle Data Path: C4
R-TYPE Operation on Multi-Cycle Data Path: C1
R-TYPE Operation on Multi-Cycle Data Path: C2
R-TYPE Operation on Multi-Cycle Data Path: C3
R-TYPE Operation on Multi-Cycle Data Path: C4
BR Operation on Multi-Cycle Data Path: C1
BR Operation on Multi-Cycle Data Path: C2
BR Operation on Multi-Cycle Data Path: C3
JUMP Operation on Multi-Cycle Data Path: C1
JUMP Operation on Multi-Cycle Data Path: C2
Simple Questions
Implementing the Control
Deciding the Control
Graphical Specification of FSM
Finite State Machine: Control Implementation
PLA Implementation
ROM Implementation
Slide 56
ROM vs PLA
Another Implementation Style
Details-1
Details-2
Microprogramming: What is a “microinstruction”
Microprogramming
Microinstruction format
Maximally vs. Minimally Encoded
Microcode: Trade-offs
The Big Picture
Exceptions
How Exceptions are Handled
Two new states for the Multi-cycle CPU
Vectored Interrupts/Exceptions
Final Words on Single and Multi-Cycle Systems
Conclusions on Chapter 5

Book's Definition of Performance
• For some program running on machine X: $\text{Performance}_X = 1 / \text{Execution time}_X$
• "X is n times faster than Y": $\text{Performance}_X / \text{Performance}_Y = n$
• Problem:
  – machine A runs a program in 20 seconds
  – machine B runs the same program in 25 seconds

Example
• "Our favorite program runs in 10 seconds on computer A, which has a 400 MHz clock. We are trying to help a computer designer build a new machine B that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?"
• Don't panic: we can easily work this out from basic principles

Now that we understand cycles
• A given program will require
  – some number of instructions (machine instructions)
  – some number of cycles
  – some number of seconds
• We have a vocabulary that relates these quantities:
  – cycle time (seconds per cycle)
  – clock rate (cycles per second)
  – CPI (cycles per instruction): a floating point intensive application might have a higher CPI
  – MIPS (millions of instructions per second): this would be higher for a program using simple instructions

Performance
• Performance is determined by execution time
• Do any of the other variables equal performance?
  – # of cycles to execute program?
  – # of instructions in program?
  – # of cycles per second?
  – average # of cycles per instruction?
  – average # of instructions per second?
• Common pitfall: thinking one of the variables is indicative of performance when it really isn’t.

CPI Example
• Suppose we have two implementations of the same instruction set architecture (ISA). For some program,
    Machine A has a clock cycle time of 10 ns and a CPI of 2.0
    Machine B has a clock cycle time of 20 ns and a CPI of 1.2
  Which machine is faster for this program, and by how much?
• If two machines have the same ISA, which of our quantities (e.g., clock rate, CPI, execution time, # of instructions, MIPS) will always be identical?

# of Instructions Example
• A compiler designer is trying to decide between two code sequences for a particular machine. Based on the hardware implementation, there are three different classes of instructions: Class A, Class B, and Class C, and they require one, two, and three cycles (respectively).
    The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.
    The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
  Which sequence will be faster? By how much?
  What is the CPI for each sequence?

MIPS example
• Two different compilers are being tested for a 100 MHz machine with three different classes of instructions: Class A, Class B, and Class C, which require one, two, and three cycles (respectively). Both compilers are used to produce code for a large piece of software.
    The first compiler's code uses 5 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
    The second compiler's code uses 10 million Class A instructions, 1 million Class B instructions, and 1 million Class C instructions.
• Which sequence will be faster according to MIPS?
• Which sequence will be faster according to execution time?

Benchmarks
• Performance best determined by running a real application
  – Use programs typical of expected workload
  – Or, typical of expected class of applications, e.g., compilers/editors, scientific applications, graphics, etc.
• Small benchmarks
  – nice for architects and designers
  – easy to standardize
  – can be abused
• SPEC (System Performance Evaluation Cooperative)
  – companies have agreed on a set of real programs and inputs
  – can still be abused (Intel’s “other” bug)
  – valuable indicator of performance (and compiler technology)

SPEC '89
• Compiler “enhancements” and performance
[Figure: bar chart of SPEC performance ratio (axis 0 to 800) by benchmark (gcc, espresso, spice, doduc, nasa7, li, eqntott, matrix300, fpppp, tomcatv), comparing "Compiler" with "Enhanced compiler".]

SPEC '95
Benchmark   Description
go          Artificial intelligence; plays the game of Go
m88ksim     Motorola 88k chip simulator; runs test program
gcc         The Gnu C compiler generating SPARC code
compress    Compresses and decompresses file in memory
li          Lisp interpreter
ijpeg       Graphic compression and decompression
perl        Manipulates strings and prime numbers in the special-purpose programming language Perl
vortex      A database program
tomcatv     A mesh generation program
swim        Shallow water model with 513 x 513 grid
su2cor      quantum physics; Monte Carlo
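
The numeric exercises on the first seven slides (the 20 s vs. 25 s problem, the clock-rate target, the CPI example, the two code sequences, and the two compilers) can all be worked from one relation: execution time = instruction count × CPI × clock cycle time. The following is a minimal Python sketch under that assumption. The helper names (`speedup`, `total_cycles`, `cpi`, `exec_time`, `mips`) are made up for illustration and do not come from the slides.

```python
# Sketch only: helper names are hypothetical, not from the slide deck.
# Assumes: execution time = instruction count * CPI * clock cycle time.

CLASS_CYCLES = (1, 2, 3)  # cycles for Class A, B, C instructions (from the slides)


def speedup(time_slow, time_fast):
    """Return n such that the faster machine is 'n times faster'."""
    return time_slow / time_fast


def total_cycles(counts, cycles_per_class=CLASS_CYCLES):
    """Total clock cycles for a mix of instruction counts per class."""
    return sum(n * c for n, c in zip(counts, cycles_per_class))


def cpi(counts, cycles_per_class=CLASS_CYCLES):
    """Average cycles per instruction for the mix."""
    return total_cycles(counts, cycles_per_class) / sum(counts)


def exec_time(counts, clock_hz, cycles_per_class=CLASS_CYCLES):
    """Execution time in seconds at the given clock rate."""
    return total_cycles(counts, cycles_per_class) / clock_hz


def mips(counts, clock_hz, cycles_per_class=CLASS_CYCLES):
    """Millions of instructions per second for the mix."""
    return sum(counts) / exec_time(counts, clock_hz, cycles_per_class) / 1e6


# Book's Definition of Performance: A takes 20 s, B takes 25 s.
a_over_b = speedup(25, 20)

# Example: A runs 10 s at 400 MHz; B needs 1.2x the cycles and must finish in 6 s.
cycles_a = 10 * 400e6
target_clock_hz = 1.2 * cycles_a / 6

# CPI Example: time per instruction in ns (same ISA, same instruction count).
ns_per_instr_a = 2.0 * 10
ns_per_instr_b = 1.2 * 20

# # of Instructions Example: (Class A, Class B, Class C) counts.
seq1, seq2 = (2, 1, 2), (4, 1, 1)

# MIPS example: two compilers on a 100 MHz machine.
comp1, comp2 = (5e6, 1e6, 1e6), (10e6, 1e6, 1e6)
CLOCK_HZ = 100e6

if __name__ == "__main__":
    print(f"A is {a_over_b:.2f}x faster than B")
    print(f"Target clock rate for B: {target_clock_hz / 1e6:.0f} MHz")
    print(f"CPI example: A is {speedup(ns_per_instr_b, ns_per_instr_a):.2f}x faster than B")
    for name, seq in (("sequence 1", seq1), ("sequence 2", seq2)):
        print(f"{name}: {total_cycles(seq)} cycles, CPI {cpi(seq):.2f}")
    for name, comp in (("compiler 1", comp1), ("compiler 2", comp2)):
        print(f"{name}: {mips(comp, CLOCK_HZ):.0f} MIPS, {exec_time(comp, CLOCK_HZ):.2f} s")
```

Worked by hand with the same relation: machine A is 1.25 times faster than machine B in the first problem; the designer's target is 800 MHz; in the CPI example machine A is 1.2 times faster; the second code sequence needs 9 cycles (CPI 1.5) against 10 cycles (CPI 2.0) for the first; and the second compiler's code rates 80 MIPS against 70 MIPS yet takes 0.15 s against 0.10 s. The last pair illustrates the pitfall named on the Performance slide: the higher MIPS rating belongs to the slower program.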