Berkeley COMPSCI 61C - Performance & Parallel Intro


Outline:
PowerPoint Presentation
“Last time…”
Peer Instruction
Peer Instruction Answer
Why Performance? Faster is better!
Two Notions of “Performance”
Definitions
Example of Response Time v. Throughput
Words, Words, Words…
What is Time?
How to Measure Time?
Measuring Time using Clock Cycles (1/2)
Measuring Time using Clock Cycles (2/2)
Performance Calculation (1/2)
Performance Calculation (2/2)
How Calculate the 3 Components?
Calculating CPI Another Way
Example (RISC processor)
What Programs Measure for Comparison?
Benchmarks
Example Standardized Benchmarks (1/2)
Example Standardized Benchmarks (2/2)
Another Benchmark
Performance Evaluation: An Aside Demo
Slide 25
Peer Instruction Answers
“And in conclusion…”
Administrivia
Big Problems Show Need for Parallel
What Can We Do?
Let’s Put Many CPUs Together!
Performance Requirements
Recent History of Parallel Computing
Current Champions (June 2007)
The Future of Parallelism
Distributed Computing Themes
Distributed Computing Challenges
Things to Worry About: Parallelizing Code
But… What About Overhead?
Peer Instruction of Assumptions
Slide 41
Summary
A New Hope: Google’s MapReduce
MapReduce Programming Model
MapReduce Code Example
MapReduce Example Diagram
MapReduce Advantages/Disadvantages

CS61C L29 Performance & Parallel (1) Beamer, Summer 2007 © UCB
Scott Beamer, Instructor
inst.eecs.berkeley.edu/~cs61c
CS61C : Machine Structures
Lecture #29 Performance & Parallel Intro
2007-8-14
“Paper” Battery Developed by Researchers at Rensselaer
www.bbc.co.uk

“Last time…”
•Magnetic Disks continue rapid advance: 60%/yr capacity, 40%/yr bandwidth, slow on seek, rotation improvements, MB/$ improving 100%/yr?
•Designs to fit high volume form factor
•PMR a fundamental new technology
  •breaks through barrier
•RAID
  •Higher performance with more disk arms per $
  •Adds option for small # of extra disks
  •Can nest RAID levels
  •Today RAID is > tens-billion dollar industry, 80% nonPC disks sold in RAIDs, started at Cal

Peer Instruction
1. RAID 1 (mirror) and 5 (rotated parity) help with performance and availability
2. RAID 1 has higher cost than RAID 5
3. Small writes on RAID 5 are slower than on RAID 1
   ABC
0: FFF
1: FFT
2: FTF
3: FTT
4: TFF
5: TFT
6: TTF
7: TTT

Peer Instruction Answer
1. All RAID (0-5) helps with performance, only RAID0 doesn’t help availability. TRUE
2. Surely! Must buy 2x disks rather than 1.25x (from diagram, in practice even less). TRUE
3. RAID5 (2R,2W) vs. RAID1 (2W). Latency worse, throughput (|| writes) better. TRUE

Why Performance? Faster is better!
•Purchasing Perspective: given a collection of machines (or upgrade options), which has the
  •best performance?
  •least cost?
  •best performance / cost?
•Computer Designer Perspective: faced with design options, which has the
  •best performance improvement?
  •least cost?
  •best performance / cost?
•All require basis for comparison and metric for evaluation!
•Solid metrics lead to solid progress!

Two Notions of “Performance”
Plane              Top Speed   DC to Paris   Passengers   Throughput (pmph)
Boeing 747         610 mph     6.5 hours     470          286,700
BAC/Sud Concorde   1350 mph    3 hours       132          178,200
•Which has higher performance?
  •Interested in time to deliver 100 passengers?
  •Interested in delivering as many passengers per day as possible?
•In a computer, time for one task called Response Time or Execution Time
•In a computer, tasks per unit time called Throughput or Bandwidth

Definitions
•Performance is in units of things per sec
  •bigger is better
•If we are primarily concerned with response time
  •performance(x) = 1 / execution_time(x)
•"F(ast) is n times faster than S(low)" means…
  n = performance(F) / performance(S) = execution_time(S) / execution_time(F)

Example of Response Time v. Throughput
•Time of Concorde vs. Boeing 747?
  •Concorde is 6.5 hours / 3 hours = 2.2 times faster
•Throughput of Boeing vs. Concorde?
  •Boeing 747: 286,700 pmph / 178,200 pmph = 1.6 times faster
•Boeing is 1.6 times (“60%”) faster in terms of throughput
•Concorde is 2.2 times (“120%”) faster in terms of flying time (response time)
We will focus primarily on response time.

Words, Words, Words…
•Will (try to) stick to “n times faster”; it’s less confusing than “m % faster”
•As faster means both decreased execution time and increased performance, to reduce confusion we will (and you should) use “improve execution time” or “improve performance”

What is Time?
•Straightforward definition of time:
  •Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ...
  •“real time”, “response time” or “elapsed time”
•Alternative: just the time the processor (CPU) is working only on your program (since multiple processes are running at the same time)
  •“CPU execution time” or “CPU time”
  •Often divided into system CPU time (in OS) and user CPU time (in user program)

How to Measure Time?
•Real Time: actual time elapsed
•CPU Time: Computers constructed using a clock that runs at a constant rate and determines when events take place in the hardware
  •These discrete time intervals called clock cycles (or informally clocks or cycles)
  •Length of clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns) and clock rate (e.g., 500 megahertz, or 500 MHz), which is the inverse of the clock period; use these!

Measuring Time using Clock Cycles (1/2)
•CPU execution time for a program = Clock Cycles for a program x Clock Period
•or = Clock Cycles for a program / Clock Rate

Measuring Time
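
The "n times faster" definition and the clock-cycle formulas above can be worked through numerically. The following is a minimal Python sketch, not part of the original slides: the function names are made up for illustration, and the cycle count of 1e9 is an assumed value. The plane figures and the 500 MHz / 2 ns clock come from the slides.

```python
import math


def times_faster(slow_time, fast_time):
    """n = execution_time(S) / execution_time(F) (hypothetical helper name)."""
    return slow_time / fast_time


def cpu_time_from_rate(clock_cycles, clock_rate_hz):
    """CPU execution time = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


def cpu_time_from_period(clock_cycles, clock_period_s):
    """CPU execution time = clock cycles x clock period."""
    return clock_cycles * clock_period_s


# Response time: Concorde (3 hours) vs. Boeing 747 (6.5 hours), DC to Paris.
flying_ratio = times_faster(slow_time=6.5, fast_time=3.0)
print(f"Concorde is {flying_ratio:.1f} times faster in flying time")

# Throughput in passenger-miles per hour = passengers x top speed.
boeing_pmph = 470 * 610      # 286,700 pmph
concorde_pmph = 132 * 1350   # 178,200 pmph
throughput_ratio = boeing_pmph / concorde_pmph
print(f"Boeing 747 is {throughput_ratio:.1f} times faster in throughput")

# CPU time for an ASSUMED program of 1e9 clock cycles on the slides'
# example clock: 500 MHz rate, i.e. a 2 ns period.
cycles = 1e9                 # assumption, not from the slides
by_rate = cpu_time_from_rate(cycles, 500e6)
by_period = cpu_time_from_period(cycles, 2e-9)
print(f"CPU time: {by_rate} s (via rate), about {by_period:.3f} s (via period)")

# Both forms agree because clock rate is the inverse of clock period.
print("forms agree:", math.isclose(by_rate, by_period))
```

The two CPU-time functions are deliberately kept separate to mirror the two forms of the formula; the comparison uses math.isclose because 2 ns is not exactly representable in floating point.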

