Introduction to Computer SystemsIntroduction to Computer SystemsTopics:Topics:ThemeFive great realities of computer systemsHow this fits within CS curriculumStaff, text, and policiesLecture topics and assignmentsLab rationaleCS 213 F ’02class01a.ppt15-213“The Class That Gives CMU Its Zip!”Seth Goldstein & Bruce MaggsJanuary 14, 2003–2–15-213, S’03Course ThemeCourse ThemeAbstraction is good, but don’t forget reality!Courses to date emphasize abstractionCourses to date emphasize abstractionAbstract data typesAsymptotic analysisThese abstractions have limitsThese abstractions have limitsEspecially in the presence of bugsNeed to understand underlying implementationsUseful outcomesUseful outcomesBecome more effective programmerszAble to find and eliminate bugs efficientlyzAble to tune program performancePrepare for later “systems” classes in CS & ECEzCompilers, Operating Systems, Networks, Computer Architecture, Embedded Systems–3–15-213, S’03Great Reality #1Great Reality #1Int’sInt’sare not Integers, Float’s are not are not Integers, Float’s are not RealsRealsExamplesExamplesIs x2≥ 0?zFloat’s: Yes!zInt’s:» 40000 * 40000 --> 1600000000» 50000 * 50000 --> ??Is (x + y) + z = x + (y + z)?zUnsigned & Signed Int’s: Yes!zFloat’s:» (1e20 + -1e20) + 3.14 --> 3.14» 1e20 + (-1e20 + 3.14) --> ??–4–15-213, S’03Computer ArithmeticComputer ArithmeticDoes not generate random valuesDoes not generate random valuesArithmetic operations have important mathematical propertiesCannot assume “usual” propertiesCannot assume “usual” propertiesDue to finiteness of representationsInteger operations satisfy “ring” propertieszCommutativity, associativity, distributivityFloating point operations satisfy “ordering” propertieszMonotonicity, values of signsObservationObservationNeed to understand which abstractions apply in which contextsImportant issues for compiler writers and serious application programmers–5–15-213, S’03Great Reality #2Great Reality #2You’ve got to know assemblyYou’ve got to know assemblyChances are, you’ll never write program in assemblyChances are, you’ll never write program in assemblyCompilers are much better & more patient than you areUnderstanding assembly key to machineUnderstanding assembly key to machine--level level execution modelexecution modelBehavior of programs in presence of bugszHigh-level language model breaks downTuning program performancezUnderstanding sources of program inefficiencyImplementing system softwarezCompiler has machine code as targetzOperating systems must manage process state–6–15-213, S’03Assembly Code ExampleAssembly Code ExampleTime Stamp CounterTime Stamp CounterSpecial 64-bit register in Intel-compatible machinesIncremented every clock cycleRead with rdtsc instructionApplicationApplicationMeasure time required by procedurezIn units of clock cyclesdouble t;start_counter();P();t = get_counter();printf("P required %f clock cycles\n", t);–7–15-213, S’03Code to Read CounterCode to Read CounterWrite small amount of assembly code using GCC’s asmfacilityInserts assembly code into machine code generated by compilerstatic unsigned cyc_hi = 0;static unsigned cyc_lo = 0;/* Set *hi and *lo to the high and low order bitsof the cycle counter. */void access_counter(unsigned *hi, unsigned *lo){asm("rdtsc; movl %%edx,%0; movl %%eax,%1": "=r" (*hi), "=r" (*lo) :: "%edx", "%eax");}–8–15-213, S’03Code to Read CounterCode to Read Counter/* Record the current value of the cycle counter. */void start_counter(){access_counter(&cyc_hi, &cyc_lo);}/* Number of cycles since the last call to start_counter. */double get_counter(){unsigned ncyc_hi, ncyc_lo;unsigned hi, lo, borrow;/* Get cycle counter */access_counter(&ncyc_hi, &ncyc_lo);/* Do double precision subtraction */lo = ncyc_lo - cyc_lo;borrow = lo > ncyc_lo;hi = ncyc_hi - cyc_hi - borrow;return (double) hi * (1 << 30) * 4 + lo;}–9–15-213, S’03Measuring TimeMeasuring TimeTrickier than it Might LookTrickier than it Might LookMany sources of variationExampleExampleSum integers from 1 to nn Cycles Cycles/n100 961 9.611,000 8,407 8.411,000 8,426 8.4310,000 82,861 8.2910,000 82,876 8.291,000,000 8,419,907 8.421,000,000 8,425,181 8.431,000,000,000 8,371,2305,591 8.37–10–15-213, S’03main(int argc, char** argv){...for (i=0; i<t; i++) {start_counter();count(n);times[i] = get_counter();}...}int count(int n){int i;int sum = 0;for (i=0; i<n; i++) {sum += i;}return sum;}Timing System PerformanceTiming System Performanceint count(int n){int i;int sum = 0;for (i=0; i<n; i++) {sum += i;}return sum;}main(int argc, char** argv){...for (i=0; i<t; i++) {start_counter();count(n);times[i] = get_counter();}...}–11–15-213, S’03Timing System PerformanceTiming System Performanceint count(int n){...}main(int argc, char** argv){...}main(int argc, char** argv){...}int count(int n){...}Experiment n cycles/n1 10 1649.22 10 17.23 1000 24.34 1000 6.1Experiment n cycles/n1 10 1657.6210261a 10 202a 10 16.43a 1000 1.74a 1000 1.6It’s the system, stupid!It’s the system, stupid!–12–15-213, S’03Great Reality #3Great Reality #3Memory MattersMemory MattersMemory is not unboundedMemory is not unboundedIt must be allocated and managedMany applications are memory dominatedMemory performance is not uniformMemory performance is not uniformCache and virtual memory effects can greatly affect program performanceAdapting program to characteristics of memory system can lead to major speed improvementsMemory referencing bugs especially perniciousMemory referencing bugs especially perniciousEffects are distant in both time and space–13–15-213, S’03Hardware Organization (Naïve)Hardware Organization (Naïve)–14–15-213, S’03Memory Performance ExampleMemory Performance ExampleImplementations of Matrix MultiplicationImplementations of Matrix MultiplicationMultiple ways to nest loops/* ijk */for (i=0; i<n; i++) {for (j=0; j<n; j++) {sum = 0.0;for (k=0; k<n; k++) sum += a[i][k] * b[k][j];c[i][j] = sum;}} /* ikj */for (i=0; i<n; i++) {for (k=0; k<n; k++) {sum = 0.0;for (j=0; j<n; j++)sum += a[i][k] * b[k][j];c[i][j] = sum}}–15–15-213, S’03020406080100120140160matrix size (n)ijkikjjikjkikijkjiMatmult Performance (Alpha 21164)Matmult Performance (Alpha 21164)Too big for L1 Cache Too big for L2
View Full Document