inst.eecs.berkeley.edu/~cs61c
CS61C: Machine Structures
Lecture 41: Performance II
Lecturer PSOE Dan Garcia
www.cs.berkeley.edu/~ddgarcia

UWB: Ultra Wide Band! The FCC moved one step closer to approving a standard for this technology, which uses spread-spectrum pulses to send its information. Imagine no data wires to ANY of your devices!
www.nytimes.com/2005/05/04/technology/techspecial/04markoff.html

CS61C L41 Performance II (1) Garcia, UCB

Review
- RAID
  - Motivation: In the 1980s there were 2 classes of drives: expensive, big ones for enterprises and small ones for PCs. They thought "make one big out of many small!"
  - Higher performance with more disk arms per $
  - RAID 0 through 5 are solutions with tradeoffs
  - $32 B industry. Started at Cal by CS Profs Katz & Patterson
- Latency v. Throughput
  - Time for one job vs. aggregate time for many

What is Time?
- Straightforward definition of time:
  - Total time to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, ...
  - "real time", "response time" or "elapsed time"
- Alternative: just the time the processor (CPU) is working only on your program (since multiple processes are running at the same time)
  - "CPU execution time" or "CPU time"
  - Often divided into system CPU time (in OS) and user CPU time (in user program)

How to Measure Time?
- User Time: seconds
- CPU Time: Computers are constructed using a clock that runs at a constant rate and determines when events take place in the hardware
  - These discrete time intervals are called clock cycles (or informally clocks or cycles)
  - Length of clock period: clock cycle time (e.g., 2 nanoseconds or 2 ns) and clock rate (e.g., 500 megahertz, or 500 MHz), which is the inverse of the clock period; use these!

Measuring Time using Clock Cycles (1/2)
- CPU execution time for a program
    = Clock Cycles for a program x Clock Cycle Time
- or
    = Clock Cycles for a program / Clock Rate

Measuring Time using Clock Cycles (2/2)
- One way to define clock cycles:
  Clock Cycles for program
    = Instructions for a program (called "Instruction Count")
      x Average Clock cycles Per Instruction (abbreviated "CPI")
- CPI is one way to compare two machines with the same instruction set, since Instruction Count would be the same

Performance Calculation (1/2)
- CPU execution time for program
    = Clock Cycles for program x Clock Cycle Time
- Substituting for clock cycles:
  CPU execution time for program
    = (Instruction Count x CPI) x Clock Cycle Time
    = Instruction Count x CPI x Clock Cycle Time

Performance Calculation (2/2)
  CPU time = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
           = Seconds / Program
- Product of all 3 terms: if missing a term, can't predict time, the real measure of performance

How Calculate the 3 Components?
- Clock Cycle Time: in specification of computer (Clock Rate in advertisements)
- Instruction Count:
  - Count instructions in loop of small program
  - Use simulator to count instructions
  - Hardware counter in special register (Pentium II, III, 4)
- CPI:
  - Calculate: CPI = Execution Time / (Clock cycle time x Instruction Count)
  - Hardware counter in special register (PII, III, 4)

Calculating CPI Another Way
- First calculate CPI for each individual instruction (add, sub, and, etc.)
- Next calculate frequency of each individual instruction
- Finally multiply these two for each instruction and add them up to get final CPI (the weighted sum)

Example (RISC processor)
  Op       Freq_i   CPI_i   Prod   (% Time)
  ALU      50%      1       0.5    (23%)
  Load     20%      5       1.0    (45%)
  Store    10%      3       0.3    (14%)
  Branch   20%      2       0.4    (18%)
  Total                     2.2
  (Freq_i column: instruction mix. % Time column: where time is spent.)
- What if Branch instructions twice as fast?

What Programs Measure for Comparison?
- Ideally run typical programs with typical input before purchase, or before even building the machine
  - Called a "workload"
  - For example:
    - Engineer uses compiler, spreadsheet
    - Author uses word processor, drawing program, compression software
- In some situations it's hard to do
  - Don't have access to machine to "benchmark" before purchase
  - Don't know workload in future
- Next: benchmarks & PC-Mac showdown!

Benchmarks
- Obviously, apparent speed of processor depends on code used to test it
- Need industry standards so that different processors can be fairly compared
- Companies exist that create these benchmarks: "typical" code used to evaluate systems
- Need to be changed every 2 or 3 years since designers could (and do!) target these standard benchmarks

Example Standardized Benchmarks (1/2)
- Standard Performance Evaluation Corporation (SPEC): SPEC CPU2000
  - CINT2000: 12 integer (gzip, gcc, crafty, perl, ...)
  - CFP2000: 14 floating-point (swim, mesa, art, ...)
  - All relative to base machine, Sun 300 MHz 256 MB RAM Ultra5_10, which gets score of 100
  - www.spec.org/osg/cpu2000/
  - They measure:
    - System speed (SPECint2000)
    - System throughput (SPECint_rate2000)

Example Standardized Benchmarks (2/2)
- SPEC
  - Benchmarks distributed in source code
  - Members of consortium select workload: 30 companies, 40 universities
  - Compiler, machine designers target benchmarks, so try to change every 3 years
  - The last benchmark released was SPEC 2000. They are still finalizing SPEC 2005.

  CINT2000
  Benchmark   Language    Description
  gzip        C           Compression
  vpr         C           FPGA Circuit Placement and Routing
  gcc         C           C Programming Language Compiler
  mcf         C           Combinatorial Optimization
  crafty      C           Game Playing: Chess
  parser      C           Word Processing
  eon         C++         Computer Visualization
  perlbmk     C           PERL Programming Language
  gap         C           Group Theory, Interpreter
  vortex      C           Object-oriented Database
  bzip2       C           Compression
  twolf       C           Place and Route Simulator

  CFP2000
  Benchmark   Language    Description
  wupwise     Fortran77   Physics / Quantum Chromodynamics
  swim        Fortran77   Shallow Water Modeling
  mgrid       Fortran77   Multi-grid Solver: 3D Potential Field
  applu       Fortran77   Parabolic / Elliptic Partial Differential Equations
  mesa        C           3-D Graphics Library
  galgel      Fortran90   Computational Fluid Dynamics
  art         C           Image Recognition / Neural Networks
  equake      C           Seismic Wave Propagation Simulation
  facerec     Fortran90   Image Processing: Face Recognition
  ammp        C           Computational Chemistry
  lucas       Fortran90   Number Theory / Primality Testing
  fma3d       Fortran90   Finite-element Crash Simulation
  sixtrack    Fortran77   High Energy Nuclear Physics Accelerator Design
  apsi        Fortran77   Meteorology: Pollutant Distribution

Example PC
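
Looking back at the performance equation (CPU time = Instruction Count x CPI x Clock Cycle Time) and the RISC processor example, the weighted-sum CPI can be checked with a short Python sketch. The function names, the instruction count of one billion, and the 2 ns clock are assumptions chosen for illustration, not values from the lecture. "Branch twice as fast" is read here as halving the branch CPI from 2 to 1, which is one possible interpretation.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    """CPU time in seconds = Instruction Count x CPI x Clock Cycle Time."""
    return instruction_count * cpi * clock_cycle_time


def weighted_cpi(mix):
    """Overall CPI as the weighted sum of per-class CPI.

    mix maps an instruction class to (frequency as a fraction, CPI).
    """
    return sum(freq * cpi for freq, cpi in mix.values())


def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    total = weighted_cpi(mix)
    return {op: freq * cpi / total for op, (freq, cpi) in mix.items()}


# Instruction mix from the RISC processor example.
RISC_MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 5),
    "Store": (0.10, 3),
    "Branch": (0.20, 2),
}

# Assumption: "twice as fast" means the branch CPI drops from 2 to 1.
FASTER_BRANCH_MIX = dict(RISC_MIX, Branch=(0.20, 1))

# Hypothetical program size and clock, for illustration only.
INSTRUCTION_COUNT = 1_000_000_000
CLOCK_CYCLE_TIME = 2e-9  # 2 ns, i.e. a 500 MHz clock

base_cpi = weighted_cpi(RISC_MIX)           # about 2.2
fast_cpi = weighted_cpi(FASTER_BRANCH_MIX)  # about 2.0

if __name__ == "__main__":
    print(f"Base CPI: {base_cpi:.2f}")
    for op, share in time_share(RISC_MIX).items():
        print(f"  {op:<6} {share:.0%} of time")
    print(f"CPI with faster branches: {fast_cpi:.2f}")
    print(f"Speedup: {base_cpi / fast_cpi:.2f}x")
    base_seconds = cpu_time(INSTRUCTION_COUNT, base_cpi, CLOCK_CYCLE_TIME)
    print(f"Base CPU time: {base_seconds:.2f} s")
```

Under these assumptions the overall CPI drops from about 2.2 to about 2.0, a speedup of roughly 1.1x even though branches became 2x faster: branches account for only about 18% of execution time, so the overall gain is limited.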