Berkeley COMPSCI 61C - Lecture Notes (9 pages)


Pages: 9
School: University of California, Berkeley
Course: Compsci 61c - Machine Structures

CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Performance
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11
2/14/11, Spring 2011, Lecture #10

New-School Machine Structures (It's a bit more complicated!)
  Harness parallelism and achieve high performance. How do we know?
  Parallel Requests: assigned to computer, e.g., search "Katz"
  Parallel Threads: assigned to core, e.g., lookup, ads
  Parallel Instructions: >1 instruction at one time, e.g., 5 pipelined instructions
  Parallel Data: >1 data item at one time, e.g., add of 4 pairs of words
  Hardware descriptions: all gates at one time
  [Figure: software/hardware stack running from smart phone and warehouse-scale computer through computer, core, memory (cache), input/output, instruction unit(s), functional unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3, main memory, and logic gates]

Agenda
  Defining Performance
  Administrivia
  Workloads and Benchmarks
  Technology Break
  Measuring Performance
  Summary

What is Performance?
  Latency (or response time or execution time): time to complete one task
  Bandwidth (or throughput): tasks completed per unit time

Running Systems to 100% Utilization
  Implication of the graph at the right?
  Can you explain why this happens? (Student Roulette)
  [Figure: service time (aka latency or responsiveness) plotted against utilization, with points labeled "Knee" and "100% Utilization"]

The Iron Law of Queues (aka Little's Law)
  $L = \lambda W$
  Average number of customers in system ($L$) = average arrival rate ($\lambda$) × average service time ($W$)

Cloud Performance: Why Application Latency Matters
  Key figure of merit: application responsiveness
  The longer the delay, the fewer the user clicks, the less the user happiness, and the lower the revenue per user

Google Instant Search: "Instant Efficiency"
  Typical search takes 24 seconds; Google's search algorithm is only 300 ms of this
  "It's not search as you type, but search before you type!"
  "We can predict what you are likely to type and give you those results in real time"

Defining CPU Performance
  What does it mean to say X is faster than Y?
  Ferrari vs. School Bus
    2009 Ferrari 599 GTB: 2 passengers, 11.1 secs in quarter mile
    2009 Type D school bus: 54 passengers, quarter mile time? (http://www.youtube.com/watch?v=KwyCoQuhUNA)
  Response Time/Latency: e.g., time to travel a quarter mile
  Throughput/Bandwidth: e.g., passenger-mi in 1 hour

Defining Relative CPU Performance
  $\text{Performance}_X = \frac{1}{\text{Program Execution Time}_X}$
  $\text{Performance}_X > \text{Performance}_Y \Rightarrow \frac{1}{\text{Execution Time}_X} > \frac{1}{\text{Execution Time}_Y} \Rightarrow \text{Execution Time}_Y > \text{Execution Time}_X$
  Computer X is N times faster than Computer Y:
  $\frac{\text{Performance}_X}{\text{Performance}_Y} = N$ or $\frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = N$
  Bus is to Ferrari as 12 is to 11.1: the Ferrari is 1.08 times faster than the bus (12 / 11.1 ≈ 1.08)

Measuring CPU Performance
  Computers use a clock to determine when events take place within hardware
  Clock cycles: discrete time intervals (aka clocks, cycles, clock periods, clock ticks)
  Clock rate or clock frequency: clock cycles per second (inverse of clock cycle time)
  3 GigaHertz clock rate => clock cycle time = $\frac{1}{3 \times 10^{9}}$ seconds ≈ 333 picoseconds (ps)

CPU Performance Factors
  To distinguish between processor time and I/O, CPU time is the time spent in the processor
  $\frac{\text{CPU Time}}{\text{Program}} = \frac{\text{Clock Cycles}}{\text{Program}} \times \text{Clock Cycle Time}$
  Or
  $\frac{\text{CPU Time}}{\text{Program}} = \frac{\text{Clock Cycles}}{\text{Program}} \div \text{Clock Rate}$
  But a program executes instructions:
  $\frac{\text{CPU Time}}{\text{Program}} = \frac{\text{Clock Cycles}}{\text{Program}} \times \text{Clock Cycle Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Average Clock Cycles}}{\text{Instruction}} \times \text{Clock Cycle Time}$
  1st term is called Instruction Count
  2nd term is abbreviated CPI, for average Clock Cycles Per Instruction
  3rd term is 1 / Clock rate

Restating Performance Equation
  $\text{Time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock Cycle}}$

What Affects Each Component? (Instruction Count, CPI, Clock Rate) (Student Roulette)
  Hardware or software component?    Affects what?
  Algorithm
  Programming Language
  Compiler
  Instruction Set Architecture

Peer Instruction Question
  Computer A: clock cycle time 250 ps, CPI_A = 2
  Computer B: clock cycle time 500 ps, CPI_B = 1.2
  Assume A and B have the same instruction set. Which statement is true?
  Red: Computer A is 1.2 times faster than B
  Orange: Computer A is 4.0 times faster than B
  Green: Computer B is 1.7 times faster than A
  Yellow: Computer B is 3.4 times faster than A
  Pink: None of the above

Administrivia
  Lab 5 posted
  Project 2.1 due Sunday at 11:59:59
  HW 4 due Sunday at 11:59:59
  Midterm in less than three weeks
    No discussion during exam week
    TA Review: Su, Mar 6, 2-5 PM, 2050 VLSB
    Exam: Tu, Mar 8, 6-9 PM, 145/155 Dwinelle
    Small number of special consideration cases (class conflicts, etc.): contact Dave or Randy

Workloads and Benchmarks
  Workload: set of programs run on a computer
    Actual collection of applications run, or made from real programs to approximate such a mix
    Specifies both programs and relative frequencies
  Benchmark: program selected for use in comparing computer performance
    Benchmarks form a workload
    Usually standardized so that many use them

SPEC (System Performance Evaluation Cooperative)
  Computer vendor cooperative for benchmarks, started in 1989
  SPECCPU2006
  Often turned into a number where bigger is faster
  SPECratio: reference execution time on old reference computer divided by execution time on new computer, giving an effective speed-up

  Description                    Instr. count (x10^9)  CPI    Clock cycle time (ps)  Execution time (s)  Reference time (s)  SPECratio
  Interpreted string processing                               400                    637                 9,770               15.3
  Block sorting compression      2,389                 0.85   400                    817                 9,650               11.8
                                 1,050                 1.72   400                    724                 8,050               11.1
                                 336                   10.0   400                    1,345
  Go game                        1,658                 1.09   400                    721
  Search gene sequence           2,783                 0.80   400                    890                 9,330               10.5
  Chess game                     2,176                 0.96   400                    837                 12,100              14.5
  Quantum computer simulation    1,623                 1.61   400                    1,047               20,720              19.8
                                 3,102                 0.80   400                    993                 22,130              22.3
                                 587                   2.94   …
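As a small illustration of Little's Law from earlier in the lecture (L = λW), the sketch below computes the average number of customers in a system. The function name and the request numbers are hypothetical, chosen only to make the arithmetic visible; they do not come from the slides.

```python
def average_customers_in_system(arrival_rate: float, service_time: float) -> float:
    """Little's Law: L = lambda * W.

    arrival_rate -- average arrivals per second (lambda)
    service_time -- average time each customer spends in the system, in seconds (W)
    """
    if arrival_rate < 0 or service_time < 0:
        raise ValueError("arrival_rate and service_time must be non-negative")
    return arrival_rate * service_time


# Hypothetical example: 200 requests arrive per second and each one
# spends 0.25 s in the system on average.
requests_in_flight = average_customers_in_system(200.0, 0.25)
print(requests_in_flight)  # 50.0
```

Read the other way, the same relation says that if the number of requests in flight is capped, a higher arrival rate can only be absorbed by a shorter time in the system.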
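The CPU performance equation (instruction count × CPI × clock cycle time) and the "N times faster" definition can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the instruction count of one billion is an assumption (the Peer Instruction Question only says both computers run the same instruction set, so any shared count gives the same ratio).

```python
PICOSECOND = 1e-12  # seconds


def cpu_time(instruction_count: float, cpi: float, clock_cycle_time: float) -> float:
    """CPU time in seconds = instructions x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi * clock_cycle_time


def times_faster(execution_time_y: float, execution_time_x: float) -> float:
    """Return N such that X is N times faster than Y (N = time_Y / time_X)."""
    return execution_time_y / execution_time_x


# Assumed instruction count, identical for both machines.
INSTRUCTIONS = 1_000_000_000

# Values from the Peer Instruction Question.
time_a = cpu_time(INSTRUCTIONS, 2.0, 250 * PICOSECOND)  # about 0.5 s
time_b = cpu_time(INSTRUCTIONS, 1.2, 500 * PICOSECOND)  # about 0.6 s
print(f"A is {times_faster(time_b, time_a):.2f} times faster than B")

# Ferrari vs. school bus, using the 12 : 11.1 quarter-mile times from the slide.
print(f"Ferrari is {times_faster(12.0, 11.1):.2f} times faster than the bus")

# A 3 GHz clock has a cycle time of 1 / (3 x 10^9) seconds.
print(f"3 GHz clock cycle time: {1 / 3e9 / PICOSECOND:.0f} ps")
```

Under these assumptions A takes 500 ps per instruction on average and B takes 600 ps, which is where the ratio of about 1.2 comes from.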
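The SPECratio definition above (reference execution time divided by measured execution time) can be sketched as follows. The rows reuse execution-time and reference-time pairs that survive in the preview table; the function name and the pairing of each number with a description are assumptions made for illustration.

```python
def spec_ratio(reference_time_s: float, execution_time_s: float) -> float:
    """SPECratio = reference execution time / execution time on the new computer."""
    if execution_time_s <= 0:
        raise ValueError("execution_time_s must be positive")
    return reference_time_s / execution_time_s


# (description, execution time in s, reference time in s), taken from rows
# of the preview table whose cells are all present.
rows = [
    ("Block sorting compression", 817, 9650),
    ("Search gene sequence", 890, 9330),
    ("Chess game", 837, 12100),
    ("Quantum computer simulation", 1047, 20720),
]

for description, execution_time, reference_time in rows:
    ratio = spec_ratio(reference_time, execution_time)
    print(f"{description}: SPECratio = {ratio:.1f}")
```

A larger ratio means the new computer finishes the benchmark in a smaller fraction of the reference time, which matches the "bigger is faster" convention.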

