Berkeley COMPSCI 61C - Lecture 29 Performance & Parallel Intro

CS61C L 29 Performance & Parallel (1) Beamer, Summer 2007 © UCB

CS61C : Machine Structures
Lecture #29 Performance & Parallel Intro
2007-8-14
Scott Beamer, Instructor
inst.eecs.berkeley.edu/~cs61c

“Paper” Battery Developed by Researchers at Rensselaer
www.bbc.co.uk

“Last time…”
• Magnetic Disks continue rapid advance: 60%/yr capacity, 40%/yr bandwidth, slow on seek, rotation improvements, MB/$ improving 100%/yr?
  • Designs to fit high volume form factor
  • PMR a fundamental new technology breaks through barrier
• RAID
  • Higher performance with more disk arms per $
  • Adds option for small # of extra disks
  • Can nest RAID levels
  • Today RAID is > tens-billion dollar industry, 80% nonPC disks sold in RAIDs, started at Cal

Peer Instruction
1. RAID 1 (mirror) and 5 (rotated parity) help with performance and availability
2. RAID 1 has higher cost than RAID 5
3. Small writes on RAID 5 are slower than on RAID 1

   ABC
0: FFF
1: FFT
2: FTF
3: FTT
4: TFF
5: TFT
6: TTF
7: TTT

Why Performance? Faster is better!
• Purchasing Perspective: given a collection of machines (or upgrade options), which has the
  • best performance?
  • least cost?
  • best performance / cost?
• Computer Designer Perspective: faced with design options, which has the
  • best performance improvement?
  • least cost?
  • best performance / cost?
• All require basis for comparison and metric for evaluation!
• Solid metrics lead to solid progress!

Two Notions of “Performance”

Plane              Top Speed   DC to Paris   Passengers   Throughput (pmph)
Boeing 747         610 mph     6.5 hours     470          286,700
BAC/Sud Concorde   1350 mph    3 hours       132          178,200

• Which has higher performance?
  • Interested in time to deliver 100 passengers?
  • Interested in delivering as many passengers per day as possible?
• In a computer, time for one task is called Response Time or Execution Time
• In a computer, tasks per unit time is called Throughput or Bandwidth

Definitions
• Performance is in units of things per sec
  • bigger is better
• If we are primarily concerned with response time:

$$\text{performance}(x) = \frac{1}{\text{execution\_time}(x)}$$

• “F(ast) is n times faster than S(low)” means:

$$n = \frac{\text{performance}(F)}{\text{performance}(S)} = \frac{\text{execution\_time}(S)}{\text{execution\_time}(F)}$$

Example of Response Time v. Throughput
• Time of Concorde vs. Boeing 747?
  • Concorde is 6.5 hours / 3 hours = 2.2 times faster
• Throughput of Boeing vs. Concorde?
  • Boeing 747: 286,700 pmph / 178,200 pmph = 1.6 times faster
• Boeing is 1.6 times (“60%”) faster in terms of throughput
• Concorde is 2.2 times (“120%”) faster in terms of flying time (response time)
We will focus primarily on response time.

Words, Words, Words…
• Will (try to) stick to “n times faster”; it's less confusing than “m % faster”
• As faster means both decreased execution time and increased performance, to reduce confusion we will (and you should) use “improve execution time” or “improve performance”

What is Time?
• Straightforward definition of time:
  • Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ...
  • “real time”, “response time” or “elapsed time”
• Alternative: just the time the processor (CPU) is working only on your program (since multiple processes are running at the same time)
  • “CPU execution time” or “CPU time”
  • Often divided into system CPU time (in OS) and user CPU time (in user program)

How to Measure Time?
• Real Time ⇒ Actual time elapsed
• CPU Time: Computers are constructed using a clock that runs at a constant rate and determines when events take place in the hardware
  • These discrete time intervals are called clock cycles (or informally clocks or cycles)
  • Length of clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns) and clock rate (e.g., 500 megahertz, or 500 MHz), which is the inverse of the clock period; use these!

Measuring Time using Clock Cycles (1/2)
• CPU execution time for a program:

$$\text{CPU execution time} = \text{Clock Cycles for a program} \times \text{Clock Period} = \frac{\text{Clock Cycles for a program}}{\text{Clock Rate}}$$

Measuring Time using Clock Cycles (2/2)
• One way to define clock cycles:
  Clock Cycles for program = Instructions for a program (called “Instruction Count”) x Average Clock cycles Per Instruction (abbreviated “CPI”)
• CPI is one way to compare two machines with the same instruction set, since Instruction Count would be the same

Performance Calculation (1/2)
• CPU execution time for program
  = Clock Cycles for program x Clock Cycle Time
• Substituting for clock cycles:
  CPU execution time for program
  = (Instruction Count x CPI) x Clock Cycle Time
  = Instruction Count x CPI x Clock Cycle Time

Performance Calculation (2/2)

$$\text{CPU time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} = \frac{\text{Seconds}}{\text{Program}}$$

• Product of all 3 terms: if missing a term, can’t predict time, the real measure of performance

How to Calculate the 3 Components?
• Clock Cycle Time: in specification of computer (Clock Rate in advertisements)
• Instruction Count:
  • Count instructions in loop of small program
  • Use simulator to count instructions
  • Hardware counter in special register (Pentium II, III, 4)
• CPI:
  • Calculate:

$$\text{CPI} = \frac{\text{Execution Time} / \text{Clock Cycle Time}}{\text{Instruction Count}}$$

  • Hardware counter in special register (Pentium II, III, 4)

Calculating CPI Another Way
• First calculate CPI for each individual instruction (add, sub, and, etc.)
• Next calculate frequency of each individual
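
The relations in these slides ("n times faster" as a ratio of execution times, CPU time = Instruction Count x CPI x Clock Cycle Time, and CPI recovered from a measured execution time) can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function names and the sample program (10^9 instructions, a CPI of 2) are assumptions made up for the example. The flight times come from the Concorde/Boeing comparison, and the 500 MHz clock is the example rate mentioned under "How to Measure Time?".

```python
def times_faster(slow_time, fast_time):
    """n = execution_time(S) / execution_time(F)."""
    return slow_time / fast_time


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = Instruction Count x CPI x Clock Cycle Time."""
    clock_cycle_time = 1.0 / clock_rate_hz  # seconds per cycle
    return instruction_count * cpi * clock_cycle_time


def cpi_from_measurement(execution_time_s, clock_rate_hz, instruction_count):
    """CPI = (Execution Time / Clock Cycle Time) / Instruction Count."""
    clock_cycles = execution_time_s * clock_rate_hz  # same as time / cycle time
    return clock_cycles / instruction_count


# Response time: Concorde (3 hours) vs. Boeing 747 (6.5 hours).
concorde_speedup = times_faster(slow_time=6.5, fast_time=3.0)  # roughly 2.2

# Hypothetical program (assumed values): 1e9 instructions, CPI of 2, 500 MHz clock.
seconds = cpu_time(instruction_count=1e9, cpi=2.0, clock_rate_hz=500e6)  # about 4 s

# Going the other way: recover CPI from the execution time computed above.
recovered_cpi = cpi_from_measurement(seconds, clock_rate_hz=500e6,
                                     instruction_count=1e9)

print(f"Concorde is {concorde_speedup:.1f} times faster (response time)")
print(f"CPU time: {seconds:.2f} s, recovered CPI: {recovered_cpi:.2f}")
```

Dropping any one of the three factors in `cpu_time` makes the result meaningless as a time, which is the point of the "product of all 3 terms" remark: a higher clock rate alone says nothing if Instruction Count or CPI also changes.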

