UMD CMSC 411 - Lecture 2 Computer Design and Evaluation


CMSC 411 Computer Systems Architecture
Lecture 2: Computer Design and Evaluation
Alan Sussman, als@cs.umd.edu

Administrivia
  - Unit 1 Homework due Thursday 2/12
      - posted on homework/project web page later today
      - as stated on syllabus, 2 problems will be graded
  - Finish reading Ch. 1 of H&P

Manufacture of DRAM and other chips

Wafers and dies
  - Chips are manufactured on wafers: circular disks containing many dies (chips).
  - The wafer is tested and chopped into dies.
  - Fig. 1.12 in H&P: 117 AMD Opterons

To find the cost of a die
  - Number of dies per wafer is at most the area of the wafer divided by the area of the die.
  - The cost of the wafer divided by the number of working dies per wafer is the cost of each die.
  - The fraction of working dies is called the die yield, which decreases as the area of the die increases (p. 20).
  - Rule of thumb: cost of die is proportional to the square of the die area.

Comparing performance of two machines
  - Definition: Performance is equal to 1 divided by execution time.
  - Problem: How to measure execution time?

What is time?
  - Unix time command example: 90.7u 12.9s 2:39 65%
      - The user used the CPU for 90.7 seconds (user CPU time).
      - The system used it for 12.9 seconds (system CPU time).
      - Elapsed time from the user's request to completion of the task was 2 minutes 39 seconds (159 seconds).
      - And (90.7 + 12.9) / 159 ≈ 65%; the rest of the time was spent waiting for I/O or running other programs.

Time (cont.)
  - Usual measurements of time:
      - system performance measures the elapsed time on an unloaded (single user) system
      - CPU performance measures user CPU time on an unloaded system

How to measure CPU performance
  - Benchmark: a program used to measure performance
      - real programs (what is reality?)
      - kernels: loops in which most of the time is spent in a real program
      - toy programs
      - synthetic programs
  - Fact: Computer manufacturers tune their products to the popular benchmarks.
      - "Your results may vary," unless you run benchmark programs and nothing else.
      - See Figure 1.13, listing programs in the SPEC CPU2006 benchmark suite.

Reproducibility
  - Benchmarking is a laboratory experiment, and needs to be documented as fully as a well-run chemistry experiment.
      - Identify each variable hardware component.
      - Identify compiler flags and measure variability.
      - Verify reproducibility and provide data for others to reproduce the benchmarking results.

Reporting results
  - Example of performance for the SPEC CFP2000 benchmark is in H&P Figure 1.14, for Sun Ultra5, AMD Opteron, Intel Itanium 2.
  - Need to look on the SPEC web site for all the parameters of the machines, including:
      - data on hardware (CPU, how many primary and secondary caches, memory, disk)
      - data on software (OS, compilers, file system type) and compiler flags used (very important)
  - How to summarize results?

Statistical reporting
  - "It is rare indeed when advertisers, politicians, pop economists, and drumbeaters for medical programs offer a statistical argument that is not either misleading or downright deceptive." - Martin Gardner
  - "In mathematics you don't understand things, you just get used to them." - John von Neumann
  - "If you torture your data long enough, they will tell you what you want to hear." - James L. Mills, M.D.

Sample results
  - From Figure 1.15 in H&P 3/e: which machine is fastest?

                        Computer A   Computer B   Computer C
      Program P1 (sec)           1           10           20
      Program P2 (sec)        1000          100           20
      Program P3 (sec)        1001          110           40

Statistical reporting (cont.)
  - The average execution time is the arithmetic mean:
      - sum the execution times for each program and divide by the number of programs n
  - The harmonic mean is n divided by the sum of the reciprocals of the rates, where each rate is some function of the execution time.
  - Often both of these means are modified to give more weight to more important programs.

Statistical reporting (cont.)
  - Arithmetic mean is good if the weights represent your own workload; then you can compare how each machine will perform for you.
      - Don't set the weights based on performance on any particular machine; you can get confusing results by doing that.
  - Harmonic mean gives the same sort of information, but is used when rates (e.g., instructions per second) are measured instead of times.

Geometric mean
  - To measure relative performance, use the geometric mean:
      - choose a reference machine
      - divide all execution times by the corresponding times on the reference machine
      - multiply those ratios together and take the nth root of the product
  - Geometric means have the nice property that you get consistent results (relative performance) regardless of which machine is used to normalize (Figure 1.14).

How to make computers faster
  - Make the common case faster.
      - Example: Put more effort and funds into optimizing the hardware for addition than into optimizing square root.
  - Amdahl's law quantifies this principle.
      - Define speedup as the time the task took originally divided by the time the task takes after improvement.

Amdahl's Law
  - Amdahl tells us what the speedup of a particular task is, given:
      - the fraction f of the original execution time that the task could use the improvement
      - the speedup s of a task that always uses the improvement

Amdahl's Law (cont.)
  - Suppose that the original task runs for 1 second, so it takes f seconds in the critical piece and 1 - f seconds in other things.
  - Then the task on the improved machine will take only f/s seconds in the critical piece, but will still take 1 - f seconds in other things.
  - Then what is the speedup of the task?

      $$\text{speedup} = \frac{\text{old time}}{\text{new time}} = \frac{1}{(1 - f) + f/s}$$

Example 1
  - Suppose we work very hard improving the square root hardware, and that our task originally spends 1% of its time doing square roots.
  - Even if the improvement reduces the square root time to zero, the speedup is no better than

      $$\text{speedup} = \frac{1}{1 - f} = \frac{1}{0.99} \approx 1.01$$

Example 2
  - Suppose that, for the same cost, we can speed up integer arithmetic by a factor of 20 or speed up floating point arithmetic by a factor of 2.
  - If our task spends 10% of its time in integer arithmetic and 40% of its time in floating point arithmetic, which should we do?

…
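As a minimal sketch, the Amdahl's Law formula from the slides can be evaluated for both examples. The function name `amdahl_speedup` and the variable names are illustrative choices rather than anything from the lecture, and the zero-time square root of Example 1 is modeled by letting s go to infinity.

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the original run time is sped up by s.

    Implements speedup = 1 / ((1 - f) + f / s) from the slides.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - f) + f / s)


# Example 1: 1% of the time is in square roots; an infinite s models
# "reduces the square root time to zero".
sqrt_limit = amdahl_speedup(0.01, float("inf"))

# Example 2: the two options available for the same cost.
integer_option = amdahl_speedup(0.10, 20.0)  # integer arithmetic, 10% of time
float_option = amdahl_speedup(0.40, 2.0)     # floating point, 40% of time

print(f"square root, s -> infinity: {sqrt_limit:.3f}")
print(f"integer x20 on 10%:         {integer_option:.3f}")
print(f"floating point x2 on 40%:   {float_option:.3f}")
```

Under these inputs the larger fraction f outweighs the larger factor s, which is the "make the common case faster" principle in numeric form.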
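Returning to the three-machine table from the statistical reporting slides, the following sketch computes the summaries discussed there. It uses only the P1 and P2 rows (the figures in the third row equal their column sums), and the helper names are hypothetical, chosen for illustration.

```python
import math

# Execution times in seconds for programs P1 and P2, taken from the table.
times = {
    "A": [1.0, 1000.0],
    "B": [10.0, 100.0],
    "C": [20.0, 20.0],
}


def arithmetic_mean(xs):
    """Sum the execution times and divide by the number of programs n."""
    return sum(xs) / len(xs)


def geometric_mean_ratio(xs, ref):
    """Divide each time by the reference machine's time, multiply the
    ratios together, and take the nth root of the product."""
    ratios = [x / r for x, r in zip(xs, ref)]
    return math.prod(ratios) ** (1.0 / len(ratios))


totals = {m: sum(t) for m, t in times.items()}
means = {m: arithmetic_mean(t) for m, t in times.items()}

# Normalize to two different reference machines to compare the rankings.
gm_vs_a = {m: geometric_mean_ratio(t, times["A"]) for m, t in times.items()}
gm_vs_b = {m: geometric_mean_ratio(t, times["B"]) for m, t in times.items()}

for m in times:
    print(f"{m}: total={totals[m]:.0f}s  mean={means[m]:.1f}s  "
          f"gm/A={gm_vs_a[m]:.3f}  gm/B={gm_vs_b[m]:.3f}")
```

Switching the reference machine from A to B leaves the ratio between any two machines' geometric means unchanged, which is the consistency property the geometric mean slide describes.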

