Berkeley COMPSCI 61C - Great Ideas in Computer Architecture

Slide 1
Slide 2
New-School Machine Structures (It’s a bit more complicated!)
Agenda
Agenda
What is Performance?
Running Systems to 100% Utilization
The Iron Law of Queues (aka Little’s Law)
Cloud Performance: Why Application Latency Matters
Google Instant Search “Instant Efficiency”
Defining CPU Performance
Defining Relative CPU Performance
Measuring CPU Performance
CPU Performance Factors
CPU Performance Factors
Restating Performance Equation
What Affects Each Component? Instruction Count, CPI, Clock Rate
Peer Instruction Question
Agenda
Administrivia
Agenda
Workload and Benchmark
SPEC (System Performance Evaluation Cooperative)
SPECINT2006 on AMD Barcelona
Summarizing Performance …
… Depends Who’s Selling
Summarizing SPEC Performance
Energy and Power (Energy = Power x Time)
Peak Power vs. Lower Energy (Power x Time = Energy)
Energy Proportional Computing
SPECPower
SPECPower on Barcelona
Which is Better? (1 Red Machine vs. 5 Green Machines)
Other Benchmark Attempts
Dhrystone Shortcomings
Agenda
Slide 42
Agenda
Compiler Optimization and Dhrystone
Detailed –O1, -O2 Optimizations
Measuring Time
How to get RDTSC access in C?
gcc Optimization Experiment
Where Do You Spend the Time in Your Program?
gprof
gprof example
Test Program to Profile with Saturn
Slide 54
Cautionary Tale
And In Conclusion, …

CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Performance
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11
01/14/2019 Spring 2011 -- Lecture #10

New-School Machine Structures (It’s a bit more complicated!)
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates @ one time
[Figure: software and hardware levels, from Warehouse Scale Computer and SmartPhone, through Computer (Cores, Memory (Cache), Input/Output, Main Memory) and Core (Instruction Unit(s), Functional Unit(s) computing A0+B0 through A3+B3), down to Logic Gates. Labels: “Harness Parallelism & Achieve High Performance” and “How do we know?”]

Agenda
• Defining Performance
• Administrivia
• Workloads and Benchmarks
• Technology Break
• Measuring Performance
• Summary

What is Performance?
• Latency (or response time or execution time)
  – Time to complete one task
• Bandwidth (or throughput)
  – Tasks completed per unit time

Running Systems to 100% Utilization
• Implication of the graph at the right?
• Can you explain why this happens?
[Figure: service time (aka latency or responsiveness) plotted against utilization up to 100%, with a “knee” marked on the curve]
Student Roulette?

The Iron Law of Queues (aka Little’s Law)
$L = \lambda W$
Average number of customers in system ($L$) = average arrival rate ($\lambda$) × average service time ($W$)

Cloud Performance: Why Application Latency Matters
• Key figure of merit: application responsiveness
  – The longer the delay, the fewer the user clicks, the less the user happiness, and the lower the revenue per user

Google Instant Search “Instant Efficiency”
Typical search takes 24 seconds; Google’s search algorithm is only 300 ms of this
“It’s not search ‘as you type’, but ‘search before you type’!”
“We can predict what you are likely to type and give you those results in real time”

Defining CPU Performance
• What does it mean to say X is faster than Y?
• Ferrari vs. School Bus?
• 2009 Ferrari 599 GTB
  – 2 passengers, 11.1 secs in quarter mile
• 2009 Type D school bus
  – 54 passengers, quarter mile time? http://www.youtube.com/watch?v=KwyCoQuhUNA
• Response Time/Latency: e.g., time to travel ¼ mile
• Throughput/Bandwidth: e.g., passenger-mi in 1 hour

Defining Relative CPU Performance
• $\text{Performance}_X = 1 / \text{Program Execution Time}_X$
• $\text{Performance}_X > \text{Performance}_Y \Rightarrow 1/\text{Execution Time}_X > 1/\text{Execution Time}_Y \Rightarrow \text{Execution Time}_Y > \text{Execution Time}_X$
• Computer X is N times faster than Computer Y:
  $\text{Performance}_X / \text{Performance}_Y = N$, or $\text{Execution Time}_Y / \text{Execution Time}_X = N$
• Bus is to Ferrari as 12 is to 11.1: Ferrari is 1.08 times faster than the bus!

Measuring CPU Performance
• Computers use a clock to determine when events take place within hardware
• Clock cycles: discrete time intervals
  – aka clocks, cycles, clock periods, clock ticks
• Clock rate or clock frequency: clock cycles per second (inverse of clock cycle time)
• 3 GigaHertz clock rate => clock cycle time = $1/(3 \times 10^9)$ seconds ≈ 333 picoseconds (ps)

CPU Performance Factors
• To distinguish between processor time and I/O, CPU time is time spent in processor
• CPU Time/Program = Clock Cycles/Program × Clock Cycle Time
• Or CPU Time/Program = Clock Cycles/Program ÷ Clock Rate

CPU Performance Factors
• But a program executes instructions
• CPU Time/Program = Clock Cycles/Program × Clock Cycle Time
  = Instructions/Program × Average Clock Cycles/Instruction × Clock Cycle Time
• 1st term called Instruction Count
• 2nd term abbreviated CPI for average Clock Cycles Per Instruction
• 3rd term is 1 / Clock rate

Restating Performance Equation
• $\text{Time} = \dfrac{\text{Seconds}}{\text{Program}} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Clock cycle}}$

What Affects Each Component? Instruction Count, CPI, Clock Rate
Hardware or software component?    Affects What?
Algorithm
Programming Language
Compiler
Instruction Set Architecture
Student Roulette?

Peer Instruction Question
• Computer A clock cycle time 250 ps, CPI_A = 2
• Computer B clock cycle time 500 ps, CPI_B = 1.2
• Assume A and B have same instruction set
• Which statement is true?
Red. Computer A is ~1.2 times faster than B
Orange. Computer A is ~4.0 times faster than B
Green. Computer B is ~1.7 times faster than A
Yellow.
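
The CPU performance equation from the slides (CPU time = instruction count × CPI × clock cycle time) can be tried out numerically. The following is a minimal sketch in Python, not part of the lecture. The function names and the instruction count of 1,000,000 are assumptions made only for illustration; since A and B share an instruction set, the sketch assumes both run the same program with the same instruction count, so that count cancels in the ratio.

```python
def clock_cycle_time_ps(clock_rate_hz):
    """Clock cycle time in picoseconds: the inverse of the clock rate."""
    return 1e12 / clock_rate_hz


def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time = instruction count x CPI x clock cycle time (picoseconds)."""
    return instruction_count * cpi * cycle_time_ps


def times_faster(execution_time_y, execution_time_x):
    """N such that X is N times faster than Y: Execution Time_Y / Execution Time_X."""
    return execution_time_y / execution_time_x


# Hypothetical instruction count; any positive value gives the same ratio.
INSTRUCTION_COUNT = 1_000_000

# Values from the Peer Instruction Question slide.
time_a = cpu_time_ps(INSTRUCTION_COUNT, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(INSTRUCTION_COUNT, cpi=1.2, cycle_time_ps=500)
a_over_b = times_faster(time_b, time_a)

# Values from the Ferrari vs. school bus slide (quarter-mile times in seconds).
ferrari_over_bus = times_faster(12.0, 11.1)

# 3 GHz clock rate from the Measuring CPU Performance slide.
cycle_3ghz = clock_cycle_time_ps(3e9)

if __name__ == "__main__":
    print(f"A is {a_over_b:.2f} times faster than B")
    print(f"Ferrari is {ferrari_over_bus:.2f} times faster than the bus")
    print(f"3 GHz cycle time: {cycle_3ghz:.0f} ps")
```

Under these assumptions, A averages 2 × 250 = 500 ps per instruction and B averages 1.2 × 500 = 600 ps, so A comes out about 1.2 times faster than B, which corresponds to the "Red" statement.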
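
Little's Law from the Iron Law of Queues slide can be illustrated the same way. This is a small sketch under stated assumptions: the function names and the example numbers (100 requests per second, 0.5 seconds per request) are hypothetical and do not come from the lecture. The slide calls W the average service time; Little's Law is usually stated with W as the average time a customer spends in the system, which is how the sketch treats it.

```python
def customers_in_system(arrival_rate_per_s, time_in_system_s):
    """Little's Law: L = lambda * W."""
    return arrival_rate_per_s * time_in_system_s


def time_in_system(customers, arrival_rate_per_s):
    """Little's Law rearranged: W = L / lambda."""
    return customers / arrival_rate_per_s


# Hypothetical workload: 100 requests arrive per second,
# and each spends 0.5 seconds in the system on average.
average_in_flight = customers_in_system(arrival_rate_per_s=100.0, time_in_system_s=0.5)

# Rearranged form: if 50 requests are in flight at 100 requests/s,
# each must be spending 0.5 s in the system on average.
average_latency_s = time_in_system(customers=50.0, arrival_rate_per_s=100.0)

if __name__ == "__main__":
    print(f"Average requests in the system: {average_in_flight}")
    print(f"Average time in the system: {average_latency_s} s")
```

The rearranged form is often the more useful one in practice: given a measured arrival rate and the number of requests in flight, it gives the average latency without timing individual requests.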

