CPE 619
Workloads: Types, Selection, Characterization

Aleksandar Milenković
The LaCASA Laboratory
Electrical and Computer Engineering Department
The University of Alabama in Huntsville
http://www.ece.uah.edu/~milenka
http://www.ece.uah.edu/~lacasa

Part II: Measurement Techniques and Tools

"Measurements are not to provide numbers but insight" - Ingrid Bucher

- Measure computer system performance
  - Monitor the system that is being subjected to a particular workload
  - How to select an appropriate workload
- In general, a performance analyst should know
  1. What are the different types of workloads?
  2. Which workloads are commonly used by other analysts?
  3. How are the appropriate workload types selected?
  4. How is the measured workload data summarized?
  5. How is the system performance monitored?
  6. How can the desired workload be placed on the system in a controlled manner?
  7. How are the results of the evaluation presented?

Types of Workloads

- Test workload: any workload used in a performance study
- Real workload: one observed on a system while it is being used
  - Cannot be repeated (easily)
  - May not even exist (proposed system)
- Synthetic workload: has characteristics similar to those of a real workload
  - Can be applied repeatedly
  - Relatively easy to port; relatively easy to modify without affecting operation
  - No large real-world data files; no sensitive data
  - May have built-in measurement capabilities
- Benchmark == Workload
  - Benchmarking is the process of comparing two or more systems using workloads

"benchmark v. trans. To subject (a system) to a series of tests in order to obtain prearranged results not available on competitive systems." - S. Kelly-Bootle, The Devil’s DP Dictionary

Test Workloads for Computer Systems

- Addition instructions
- Instruction mixes
- Kernels
- Synthetic programs
- Application benchmarks

Addition Instructions

- In early computers, the CPU was the most expensive component
  - System performance == processor performance
- CPUs supported few operations; the most frequent one was addition
  - A computer with a faster addition instruction performed better
- Run many addition operations as the test workload
- Problem
  - More operations, not only addition
  - Some are more complicated than others

Instruction Mixes

- The number and complexity of instructions increased
  - Additions were no longer sufficient
- Could measure instructions individually, but they are used in different amounts
  => Measure relative frequencies of various instructions on real systems
- Use the frequencies as weighting factors to get the average instruction time
- Instruction mix: specification of various instructions coupled with their usage frequency
- Use the average instruction time to compare different processors
- Often use the inverse of the average instruction time
  - MIPS: Millions of Instructions Per Second
  - MFLOPS: Millions of Floating-Point Operations Per Second
- Gibson mix: developed by Jack C. Gibson in 1959 for IBM 704 systems

Example: Gibson Instruction Mix

  No.  Instruction class               Percent
   1   Load and Store                     31.2
   2   Fixed-Point Add/Sub                 6.1
   3   Compares                            3.8
   4   Branches                           16.6
   5   Float Add/Sub                       6.9
   6   Float Multiply                      3.8
   7   Float Divide                        1.5
   8   Fixed-Point Multiply                0.6
   9   Fixed-Point Divide                  0.2
  10   Shifting                            4.4
  11   Logical And/Or                      1.6
  12   Instructions not using regs         5.3
  13   Indexing                           18.0
       Total                             100.0

(1959; IBM 650, IBM 704)

Problems with Instruction Mixes

- In modern systems, instruction time is variable, depending upon
  - Addressing modes, cache hit rates, pipelining
  - Interference with other devices during processor-memory access
  - Distribution of zeros in the multiplier
  - Number of times a conditional branch is taken
- Mixes do not reflect special hardware such as page table lookups
- Mixes represent only the speed of the processor
  - The bottleneck may be in other parts of the system

Kernels

- Pipelining, caching, address translation, ... made computer instruction times highly variable
  - Cannot use individual instructions in isolation
- Instead, use higher-level functions
  - Kernel = the most frequent function (kernel = nucleus)
- Commonly used kernels: Sieve, Puzzle, Tree Searching, Ackermann's Function, Matrix Inversion, and Sorting
- Disadvantages
  - Do not make use of I/O devices
  - Ad hoc selection of kernels (not based on real measurements)

Synthetic Programs

- Proliferation
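The weighting idea from the Instruction Mixes slide can be sketched in a few lines of Python. This is only an illustration: the percentages are the Gibson mix (with Load and Store taken as 31.2 so that the mix totals 100), while the per-instruction times, the dictionary keys, and the function names are hypothetical and do not describe any real processor.

```python
# Gibson mix usage frequencies in percent (Load and Store taken as 31.2).
GIBSON_MIX = {
    "load_store": 31.2,
    "fixed_add_sub": 6.1,
    "compare": 3.8,
    "branch": 16.6,
    "float_add_sub": 6.9,
    "float_multiply": 3.8,
    "float_divide": 1.5,
    "fixed_multiply": 0.6,
    "fixed_divide": 0.2,
    "shift": 4.4,
    "logical": 1.6,
    "no_register": 5.3,
    "indexing": 18.0,
}

# Hypothetical instruction times in microseconds for an imaginary processor.
HYPOTHETICAL_TIMES_US = {
    "load_store": 2.0,
    "fixed_add_sub": 1.0,
    "compare": 1.0,
    "branch": 1.5,
    "float_add_sub": 4.0,
    "float_multiply": 8.0,
    "float_divide": 16.0,
    "fixed_multiply": 6.0,
    "fixed_divide": 12.0,
    "shift": 1.0,
    "logical": 1.0,
    "no_register": 1.0,
    "indexing": 1.5,
}


def average_instruction_time(mix_percent, times_us):
    """Frequency-weighted average instruction time, in microseconds."""
    total_weight = sum(mix_percent.values())
    weighted_sum = sum(mix_percent[name] * times_us[name] for name in mix_percent)
    return weighted_sum / total_weight


def mips(avg_time_us):
    """Millions of instructions per second: the inverse of the average time in microseconds."""
    return 1.0 / avg_time_us


if __name__ == "__main__":
    avg = average_instruction_time(GIBSON_MIX, HYPOTHETICAL_TIMES_US)
    print(f"average instruction time: {avg:.3f} us, rating: {mips(avg):.3f} MIPS")
```

Two processors could be compared by calling `average_instruction_time` with the same mix and each processor's own timing table. As the slides note, such a figure reflects processor speed only.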
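As an illustration of using a kernel as a test workload, the sketch below times the Sieve kernel named on the Kernels slide. It is a minimal example under stated assumptions: the function names `sieve` and `time_kernel`, the problem size, and the best-of-N timing policy are choices made here, not something prescribed by the slides.

```python
import time


def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def time_kernel(kernel, arg, repeats=5):
    """Run kernel(arg) `repeats` times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(arg)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Hypothetical problem size; any value large enough to be measurable works.
    elapsed = time_kernel(sieve, 100_000)
    print(f"sieve(100000): best of 5 runs = {elapsed:.6f} s")
```

The kernel performs no I/O, which matches the disadvantage listed on the Kernels slide: a measurement like this exercises only the processor and memory.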