Berkeley COMPSCI 61C - Lecture 22 - Introduction to Performance


CS61C L221 Performance © UC Regents

CS61C - Machine Structures
Lecture 22 - Introduction to Performance
November 17, 2000
David Patterson
http://www-inst.eecs.berkeley.edu/~cs61c/

Review (1/2)
° Optimal Pipeline
  • Each stage is executing part of an instruction each clock cycle.
  • One instruction finishes during each clock cycle.
  • On average, execute far more quickly.
° What makes this work?
  • Similarities between instructions allow us to use same stages for all instructions (generally).
  • Each stage takes about the same amount of time as all others: little wasted time.

Review (2/2)
° Pipelining a Big Idea: widely used concept
° What makes it less than perfect?
  • Structural hazards: suppose we had only one cache? ⇒ Need more HW resources
  • Control hazards: need to worry about branch instructions? ⇒ Delayed branch
  • Data hazards: an instruction depends on a previous instruction?

Outline
° Performance Calculation
° Benchmarks
° Virtual Memory Review

Performance
° Purchasing Perspective: given a collection of machines, which has the
  - best performance?
  - least cost?
  - best performance / cost?
° Computer Designer Perspective: faced with design options, which has the
  - best performance improvement?
  - least cost?
  - best performance / cost?
° Both require: basis for comparison and metric for evaluation

Two Notions of “Performance”

Plane              Top Speed   DC to Paris   Passengers   Throughput (pmph)
Boeing 747         610 mph     6.5 hours     470          286,700
BAC/Sud Concorde   1350 mph    3 hours       132          178,200

• Which has higher performance?
• Time to deliver 1 passenger?
• Time to deliver 400 passengers?
• In a computer, time for 1 job called Response Time or Execution Time
• In a computer, jobs per day called Throughput or Bandwidth

Definitions
° Performance is in units of things per sec
  • bigger is better
° If we are primarily concerned with response time
  $$\text{performance}(x) = \frac{1}{\text{execution\_time}(x)}$$
° “X is n times faster than Y” means
  $$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$

Example of Response Time v. Throughput
• Time of Concorde vs. Boeing 747?
  • Concorde is 6.5 hours / 3 hours = 2.2 times faster
• Throughput of Boeing vs. Concorde?
  • Boeing 747: 286,700 pmph / 178,200 pmph = 1.6 times faster
• Boeing is 1.6 times (“60%”) faster in terms of throughput
• Concorde is 2.2 times (“120%”) faster in terms of flying time (response time)
We will focus primarily on execution time for a single job

Confusing Wording on Performance
° Will (try to) stick to “n times faster”; it's less confusing than “m % faster”
° As faster means both increased performance and decreased execution time, to reduce confusion will use “improve performance” or “improve execution time”

What is Time?
° Straightforward definition of time:
  • Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ...
  • “real time”, “response time” or “elapsed time”
° Alternative: just time processor (CPU) is working only on your program (since multiple processes running at same time)
  • “CPU execution time” or “CPU time”
  • Often divided into system CPU time (in OS) and user CPU time (in user program)

How to Measure Time?
° User Time ⇒ seconds
° CPU Time: Computers constructed using a clock that runs at a constant rate and determines when events take place in the hardware
  • These discrete time intervals called clock cycles (or informally clocks or cycles)
  • Length of clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns) and clock rate (e.g., 500 megahertz, or 500 MHz), which is the inverse of the clock period; use these!

Measuring Time using Clock Cycles (1/2)
° CPU execution time for program
  = Clock Cycles for a program × Clock Cycle Time
° or
  $$\text{CPU execution time for program} = \frac{\text{Clock Cycles for a program}}{\text{Clock Rate}}$$
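
The ratio definition of "n times faster" and the two equivalent forms of CPU execution time can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names are hypothetical, the plane figures are taken from the table above, and the cycle count of one billion is an assumed value chosen only to go with the slides' 2 ns / 500 MHz example.

```python
def times_faster(time_x, time_y):
    """Return n such that X is n times faster than Y, given execution times.

    Performance is 1 / execution time, so
    n = performance(X) / performance(Y) = time_y / time_x.
    """
    return time_y / time_x


def throughput_ratio(throughput_x, throughput_y):
    """Return how many times higher X's throughput is than Y's."""
    return throughput_x / throughput_y


def cpu_time_from_cycle_time(clock_cycles, clock_cycle_time_s):
    """CPU time = clock cycles x clock cycle time (seconds)."""
    return clock_cycles * clock_cycle_time_s


def cpu_time_from_clock_rate(clock_cycles, clock_rate_hz):
    """CPU time = clock cycles / clock rate (rate is 1 / cycle time)."""
    return clock_cycles / clock_rate_hz


# Figures from the slides: DC to Paris takes 3 hours (Concorde)
# and 6.5 hours (Boeing 747); throughput is in passenger-miles per hour.
concorde_vs_747 = times_faster(3.0, 6.5)
boeing_vs_concorde = throughput_ratio(286_700, 178_200)
print(f"Concorde is {concorde_vs_747:.1f} times faster (response time)")
print(f"Boeing 747 is {boeing_vs_concorde:.1f} times faster (throughput)")

# Assumed workload: 1 billion clock cycles (hypothetical, not from the slides).
cycles = 1_000_000_000
t_from_period = cpu_time_from_cycle_time(cycles, 2e-9)   # 2 ns cycle time
t_from_rate = cpu_time_from_clock_rate(cycles, 500e6)    # 500 MHz clock rate
print(f"CPU time via cycle time: {t_from_period:.2f} s")
print(f"CPU time via clock rate: {t_from_rate:.2f} s")
```

Because a 2 ns clock period and a 500 MHz clock rate are inverses of each other, both CPU-time functions should agree up to floating-point rounding.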

Measuring Time using Clock Cycles (2/2)
° One way to define clock cycles:
  Clock Cycles for program
  = Instructions for a program (called “Instruction Count”)
  × Average Clock cycles Per Instruction (abbreviated “CPI”)
° CPI one way to compare two machines with same instruction set, since Instruction Count would be the same

Performance Calculation (1/2)
° CPU execution time for program
  = Clock Cycles for program × Clock Cycle Time
° Substituting for clock cycles:
  CPU execution time for program
  = (Instruction Count × CPI) × Clock Cycle Time
  = Instruction Count × CPI × Clock Cycle Time

Performance Calculation (2/2)
$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$
• Product of all 3 terms: if missing a term, can’t predict time, the real measure of performance

Administrivia: Rest of 61C
• Rest of 61C slower pace
  • 1 project, 1 lab, no more homeworks
F 11/17     Performance; Cache Sim Project
W 11/24     X86, PC buzzwords and 61C; RAID Lab
W 11/29     Review: Pipelines; Feedback “lab”
F 12/1      Review: Caches/TLB/VM; Section 7.5
M 12/4      Deadline to correct your grade record
W 12/6      Review: Interrupts (A.7); Feedback lab
F 12/8      61C Summary / Your Cal heritage / HKN Course Evaluation
Sun 12/10   Final Review, 2PM (155 Dwinelle)
Tues 12/12  Final (5PM 1 Pimentel)

How to Calculate the 3 Components?
° Clock Cycle Time: in specification of computer (Clock Rate in advertisements)
° Instruction Count:
  • Count instructions in loop of small program
  • Use simulator to count instructions
  • Hardware counter in special register (Pentium II)
° CPI:
  • Calculate:
    $$\text{CPI} = \frac{\text{Execution Time} / \text{Clock Cycle Time}}{\text{Instruction Count}}$$
  • Hardware counter in special register (PII)

Calculating CPI Another Way
° First calculate CPI for each individual instruction (add, sub, and, etc.)
° Next calculate frequency of each individual instruction
° Finally multiply these two for each instruction and add them up to get final CPI
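
The frequency-weighted CPI calculation and the three-term CPU time product can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the slides: the function names, the instruction classes, their frequencies and per-class CPIs, the instruction count, and the 2 ns clock cycle time are all assumed values chosen for the example.

```python
def weighted_cpi(mix):
    """Return the average CPI for an instruction mix.

    `mix` maps an instruction class to (frequency, cycles per instruction).
    Frequencies are fractions of all executed instructions and must sum to 1.
    """
    total_frequency = sum(freq for freq, _ in mix.values())
    if abs(total_frequency - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    # Multiply frequency by CPI for each class, then add the products up.
    return sum(freq * cpi for freq, cpi in mix.values())


def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = Instruction Count x CPI x Clock Cycle Time (seconds)."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical instruction mix (assumed numbers, not measured data).
mix = {
    "alu":    (0.5, 1),
    "load":   (0.2, 2),
    "store":  (0.1, 2),
    "branch": (0.2, 2),
}

cpi = weighted_cpi(mix)
# Assumed: 1 billion instructions on a machine with a 2 ns clock cycle.
seconds = cpu_time(1_000_000_000, cpi, 2e-9)
print(f"average CPI = {cpi:.2f}")
print(f"CPU time    = {seconds:.2f} s")
```

With these assumed numbers the average CPI works out to about 1.5, giving roughly 3 seconds of CPU time. Dropping any one of the three terms (instruction count, CPI, or clock cycle time) leaves the time undetermined, which is the point of the product form.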

