UCLA COMSCI M151B - Lecture2

Week 1 - Wednesday 1_3

Defining Performance
  - We're always considering the performance bound
  - Using airplanes as an analogy for computer performance
  - Different types of factors are considered for an airplane's performance:
    - Passenger capacity (how many passengers it can carry)
    - Cruising range (how far it can go)
    - Cruising speed (how fast it can go)
    - Passengers x mph (measure of overall throughput)
      - Boeing 747 has the highest throughput

Response Time and Throughput
  - Response time is how long it takes to do a task
    - From start to finish
    - Different from throughput
  - Throughput is the total work done per unit time
    - It's usually the number of tasks/transactions/... per hour
    - How much bandwidth a processor can supply
    - More important than response time
  - Example:
    - Response time of a single instruction may be in nanoseconds
    - Throughput will be how many instructions are run per second
  - Processors in response time and throughput
    - Upgrading the processor could reduce response time and increase throughput
    - A processor that supports multi-threading (and similar features) may also improve both response time and throughput
    - Adding more processors won't help the response time of a single job (if the job is single-threaded), but it may improve the overall throughput of the system
    - Talk more about how they are related with processor modification

Relative Performance
  - We define Performance = 1 / Execution Time
  - "X is n times faster than Y" means:
    - Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X = n
  - Example: Time taken to run a program:
    - 10s on A
    - 15s on B
    - Therefore ExecutionTime_B / ExecutionTime_A = 15s / 10s = 1.5
    - A is 1.5 times faster than B

Measuring Execution Time
  - Elapsed time
    - Time taken to do a task from start to finish
    - Total response time, including all aspects
      - Processing, I/O, OS overhead, idle time
    - Determines system performance
  - CPU time
    - Time spent processing a given job
      - Discounts I/O time and other jobs' shares
      - Doesn't consider idle time
    - Comprises user CPU time and system CPU time
      - User CPU time: time spent in the actual program
      - System CPU time: time the operating system kernel spends servicing the program
  - Different programs are affected differently by CPU and system performance
    - If a program does a lot of I/O and not much number crunching or processing in general, upgrading the CPU will not help much, because only CPU time improves and CPU time doesn't include I/O time

CPU Clocking
  - Many circuit paths in a processor have different delays, and routing between different parts of the processor takes different amounts of time
  - To let the various paths stabilize and ensure all values are ready at the inputs/outputs of the processor, time is measured with a CPU clock
    - Regularizes/synchronizes the processor
  - Clock period
    - Duration of a clock cycle
  - Clock frequency (clock rate)
    - 1 / Clock Period
    - Number of cycles per second

CPU Time
  - CPU Time = CPU Clock Cycles * Clock Cycle Time = CPU Clock Cycles / Clock Rate
  - Performance can be improved by:
    - Reducing the number of clock cycles
      - How many cycles the program needs to execute
    - Increasing the clock rate
      - Shortens how long the processor takes to do a cycle
    - Hardware designer could trade off clock rate against cycle count
  - Example:
    - Computer A: 2GHz clock, 10s CPU time
    - Designing Computer B
      - Aim for 6s CPU time
      - Can use a faster clock, but this causes 1.2 * as many clock cycles as on A
    - How fast must Computer B's clock be?
      - ClockRate_B = ClockCycles_B / CPUTime_B = 1.2 * ClockCycles_A / 6s
      - ClockCycles_A = CPUTime_A * ClockRate_A = 10s * 2GHz = 20 * 10^9
      - ClockRate_B = 1.2 * 20 * 10^9 / 6s = 24 * 10^9 / 6s = 4 * 10^9 Hz = 4GHz
    - Summary: to finish the job in 4s less while absorbing the 1.2x increase in clock cycles, Computer B needs a 4GHz clock

Instruction Count and Cycles Per Instruction (CPI)
  - Clock Cycles = Instruction Count * Cycles per Instruction
  - CPU Time = Instruction Count * CPI * Clock Cycle Time = Instruction Count * CPI / Clock Rate
  - Instruction Count for a program
    - Determined by program, ISA, and compiler
    - How many instructions are executed in the course of executing a program
    - Static instruction count
      - How many instructions the program has when it is stored on disk/in memory
    - Dynamic instruction count
      - How many instructions are executed when running the program
      - Takes for/while/... loops into account
  - Average cycles per instruction
    - Instructions differ: some are arithmetic, some only access memory
    - Depending on complexity, the number of cycles differs
    - If different instructions have different CPI
      - Average CPI is affected by the instruction mix

CPI Example
  - Computer A: Cycle Time = 250ps, CPI = 2.0
  - Computer B: Cycle Time = 500ps, CPI = 1.2
  - Same ISA
  - Which is faster and by how much?
  - If we consider only Cycle Time, A is faster
  - If we consider only CPI, B is faster
  - So we have to consider both together by combining them
    - CPUTime_A = InstructionCount * CPI_A * CycleTime_A = I * 2.0 * 250ps = I * 500ps
    - CPUTime_B = InstructionCount * CPI_B * CycleTime_B = I * 1.2 * 500ps = I * 600ps
    - CPUTime_B / CPUTime_A = I * 600ps / (I * 500ps) = 1.2
    - A is 1.2 times faster than B

CPI in More Detail
  - If different instruction classes take different numbers of cycles
    - Clock Cycles = \sum_{i=1}^{n} (CPI_i * InstructionCount_i)
  - Weighted average CPI
    - CPI = Clock Cycles / Instruction Count = \sum_{i=1}^{n} (CPI_i * InstructionCount_i / InstructionCount)
  - Example: Alternative compiled code sequences using instructions in classes A, B, and C

    Class              A  B  C
    CPI for class      1  2  3
    IC in sequence 1   2  1  2
    IC in sequence 2   4  1  1

    - Sequence 1: IC = 5
      - Clock Cycles = 2*1 + 1*2 + 2*3 = 10
      - Avg. CPI = 10/5 = 2.0
    - Sequence 2: IC = 6
      - Clock Cycles = 4*1 + 1*2 + 1*3 = 9
      - Avg. CPI = 9/6 = 1.5

Performance Summary (Big Picture)
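
The formulas in these notes can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names `cpu_time` and `average_cpi` are illustrative assumptions, and the instruction count of one million is an arbitrary placeholder, since it cancels out of the ratio. The remaining numbers come from the three worked examples in the notes (the Computer B clock rate, the A-versus-B CPI comparison, and the two compiled code sequences).

```python
# Illustrative sketch of the lecture's formulas; names here are hypothetical.

def cpu_time(instruction_count, cpi, cycle_time):
    """CPU Time = Instruction Count * CPI * Clock Cycle Time (seconds)."""
    return instruction_count * cpi * cycle_time


def average_cpi(class_cpi, class_counts):
    """Weighted average CPI = total clock cycles / total instruction count."""
    total_cycles = sum(c * n for c, n in zip(class_cpi, class_counts))
    return total_cycles / sum(class_counts)


# Clock-rate example: A runs at 2 GHz for 10 s; B needs 1.2x the cycles in 6 s.
clock_cycles_a = 10 * 2e9                 # CPUTime_A * ClockRate_A
clock_rate_b = 1.2 * clock_cycles_a / 6   # ClockCycles_B / CPUTime_B, in Hz

# CPI example: same ISA, so both machines execute the same instruction count.
instruction_count = 1_000_000             # arbitrary; cancels in the ratio
time_a = cpu_time(instruction_count, 2.0, 250e-12)
time_b = cpu_time(instruction_count, 1.2, 500e-12)
speedup_a_over_b = time_b / time_a

# Weighted CPI example: per-class CPI is 1, 2, 3 for classes A, B, C.
seq1_cpi = average_cpi([1, 2, 3], [2, 1, 2])
seq2_cpi = average_cpi([1, 2, 3], [4, 1, 1])

print(f"Computer B clock rate: {clock_rate_b / 1e9:.1f} GHz")
print(f"A is {speedup_a_over_b:.1f} times faster than B")
print(f"Average CPI: sequence 1 = {seq1_cpi}, sequence 2 = {seq2_cpi}")
```

Sequence 2 executes more instructions (6 versus 5) but needs fewer clock cycles (9 versus 10), which is why instruction count alone is not a reliable measure of performance.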

