COSC 6385 – Computer Architecture
Introduction and Organizational Issues
Edgar Gabriel
Spring 2011

Organizational issues (I)
• Classes:
  – Monday, 2.30pm – 4.00pm, AH AUD2
  – Wednesday, 2.30pm – 4.00pm, AH AUD2
• Evaluation as planned right now
  – 1 homework: 25%
  – 3 quizzes: 75% (25% each)
• About the pop-up quizzes:
  – Not planned right now. However, they can be introduced at any time during the semester, depending on student participation
    • unannounced, unspecified number of mini quizzes
    • at the beginning of a lecture, 10 minutes only
    • open book, focusing on the last lecture only

Organizational issues (II)
• In case of questions:
  – email: [email protected]
  – Tel: (713) 743 3358
  – Office hours: PGH 524, Tue, 11am-12pm or by appointment
• All slides available on the website:
  – http://www.cs.uh.edu/~gabriel/cosc6385_s11/
  – Videos of some lectures will be posted on the course web page

Organizational issues (III)
• TA for the course:
  – Vishwanath Venkatesan, PGH 526, [email protected]
• Tentative dates for the quizzes:
  – 1st quiz: Wednesday, Feb. 23
  – 2nd quiz: Wednesday, March 30
  – 3rd quiz: Monday, May 2
• Homework
  – Announced: Monday, Feb. 28
  – Due on: Friday, March 11

Contents
• Textbook:
  John L. Hennessy, David A. Patterson
  “Computer Architecture – A Quantitative Approach”
  4th Edition
  Morgan Kaufmann Publishers

Contents (II)
• Most of chapters 1 to 5
• Appendix A, B, C
• Selected sections regarding
  – Storage systems
  – Vector Processors
• Selected literature on multi-core processors
• Selected literature on virtualization

Contents (III)
Jan. 19          cancelled
Jan. 24          Overview, Motivation, Organization
Jan. 26          Performance Measurement
Jan. 31          Instruction Set Architectures
Feb. 2           Memory Hierarchy (I)
Feb. 7           Memory Hierarchy (II)
Feb. 9           Pipelining (I)
Feb. 14          Pipelining (II)
Feb. 16          Recap for 1st quiz, exercises (I)
Feb. 21          Tomasulo's algorithm (I)
Feb. 23          1st quiz
Feb. 28          Homework announcement
Mar. 2           Tomasulo's algorithm (II)
Mar. 7           Discussion of midterm quiz
Mar. 9           ILP with software approaches
Mar. 14 and 16   Spring break, no lecture
Mar. 21          Vector processors
Mar. 23          Exercises (II); homework discussion
Mar. 28          Multi-processor systems (I): Flynn's taxonomy, cache coherence protocols
Mar. 30          2nd quiz
Apr. 4           Multi-processor systems (II): distributed shared memory and directory protocols
Apr. 6           Multi-processor systems (III): synchronization problems
Apr. 11          Multi-processor systems (IV): multi-core and multi-threading
Apr. 13          Multi-processor systems (V): Intel Larrabee and Nvidia GPGPU
Apr. 18          Virtualization
Apr. 20          File I/O
Apr. 25          Recap for final quiz
Apr. 27          History of Computers
May 2            Final quiz

Why learn about Computer Architecture?
    for ( i=0; i<n; i++ ) {
        c[i] = a[i] + b[i];
    }
• Every loop iteration requires 3 memory operations
  – 2 loads
  – 1 store
• For a micro-processor having a frequency of 2 GHz this loop requires
    $3 \cdot 4\,\text{Bytes} \cdot 2 \cdot 10^{9}\,\text{s}^{-1} = 24\,\text{GBytes/s}$
  to satisfy one Floating Point Unit (FPU), assuming 4-byte operands and one loop iteration per cycle
• Most modern processors have 2 FPUs and two or more Integer Units which could work in parallel

Memory technology (www.kingston.com/newtech)
• Memory bandwidth
    $SB_{max} = SB_{BUS} \cdot f_{BUS} \cdot Op/Cycle$
  with
  – $SB_{max}$: max. memory bandwidth
  – $SB_{BUS}$: bandwidth of the memory bus (64 Bit = 8 Bytes)
  – $f_{BUS}$: frequency of the memory bus
  – $Op/Cycle$: operations per bus cycle (consistent with the table below: 1 for SDRAM, 2 for DDR)

Memory bandwidth
Name          Frequency of memory bus [MHz]   Max. bandwidth
PC100 SDRAM   100                             800 MB/s
PC133 SDRAM   133                             1.1 GB/s
PC1600 DDR    100                             1.6 GB/s
PC2100 DDR    133                             2.1 GB/s
PC2700 DDR    166                             2.7 GB/s
PC3200 DDR    200                             3.2 GB/s
PC3700 DDR    233                             3.7 GB/s
PC4200 DDR    266                             4.2 GB/s

Memory modules (cont.)
• Dual Channel Memory: 2 I/O channels between memory controller and memory module
• DDR2 and DDR3: further evolution of the DDR technology
Name        Frequency of memory bus   Bandwidth of a module   Dual Channel bandwidth
PC2-3200    400 MHz                   3.2 GB/s                6.4 GB/s
PC2-4200    533 MHz                   4.2 GB/s                8.4 GB/s
PC2-5300    667 MHz                   5.3 GB/s                10.6 GB/s
PC2-6400    800 MHz                   6.4 GB/s                12.8 GB/s
PC3-8500    1066 MHz                  8.5 GB/s                17.0 GB/s
PC3-10600   1333 MHz                  10.6 GB/s               21.2 GB/s
PC3-12800   1600 MHz                  12.8 GB/s               25.6 GB/s

Memory hierarchies
Level                         Size          Access time [cycles]
Backup (tape)                 TB, PB
Primary data storage (disk)   ~ 100 GB      > 10^6
Main memory                   ~ 1-4 GB      100 - 1000
Caches                        ~ 1-4 MB      2 - 50
Register                      < 256 Words   1 - 2

Memory hierarchies
• Do I have to care about memory hierarchies?
• Example: Matrix-multiply of two dense matrices
  – “Trivial” code
    for ( i=0; i<dim; i++ ) {
        for ( j=0; j<dim; j++ ) {
            for ( k=0; k<dim; k++ ) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }

Matrix-multiply
• Performance of the trivial implementation on a 2.2 GHz AMD Opteron with 2 GB main memory and a 1 MB 2nd-level cache
Matrix dimension   Execution time [sec]   Performance [MFLOPS]
256x256            0.118                  284
512x512            2.05                   130

Matrix-multiply (II)
• Peak floating point performance of the processor
    $2 \cdot (2.2 \cdot 10^{9})$ floating point operations/sec $= 4.4 \cdot 10^{9}$ FLOPS $= 4.4$ GFLOPS
  – 2: number of floating point units
  – $2.2 \cdot 10^{9}$: frequency of the processor
  – 4.4 GFLOPS: theoretical floating point peak performance of the processor
  → assuming that each FPU can finish an operation per cycle
• Where are the missing FLOPS between theoretical peak and achieved performance?
  – Memory wait time
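The peak figure and the MFLOPS column of the trivial matrix-multiply can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the code used to produce the slides: the function names `peak_flops`, `matmul_mflops` and `report` are made up for illustration, and the count of 2 * dim^3 floating point operations is an assumption that follows from the triple loop (one multiply and one add per innermost iteration).

```c
#include <assert.h>
#include <stdio.h>

/* Theoretical peak in FLOP/s: number of FPUs times clock frequency,
 * assuming each FPU finishes one operation per cycle (as on the slide). */
double peak_flops(int num_fpus, double freq_hz)
{
    assert(num_fpus > 0 && freq_hz > 0.0);
    return num_fpus * freq_hz;
}

/* Achieved MFLOPS of a dim x dim matrix multiply.
 * Assumption: the triple loop performs dim^3 multiply-add pairs,
 * i.e. 2 * dim^3 floating point operations in total. */
double matmul_mflops(int dim, double seconds)
{
    assert(dim > 0 && seconds > 0.0);
    double flop = 2.0 * dim * dim * dim;
    return flop / seconds / 1.0e6;
}

/* Hypothetical helper: prints the achieved rate and the fraction of the
 * 2-FPU, 2.2 GHz peak that it represents. */
void report(int dim, double seconds)
{
    double peak = peak_flops(2, 2.2e9);
    double mflops = matmul_mflops(dim, seconds);
    printf("%dx%d: %.0f MFLOPS (%.1f%% of peak)\n",
           dim, dim, mflops, 100.0 * mflops * 1.0e6 / peak);
}
```

Calling `report(256, 0.118)` and `report(512, 2.05)` with the measured times from the table shows that the trivial code reaches only a few percent of the 4.4 GFLOPS peak, which is the gap attributed to memory wait time.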
Blocked code
    for ( i=0; i<dim; i+=block ) {
        for ( j=0; j<dim; j+=block ) {
            for ( k=0; k<dim; k+=block ) {
                for ( ii=i; ii<(i+block); ii++ ) {
                    for ( jj=j; jj<(j+block); jj++ ) {
                        for ( kk=k; kk<(k+block); kk++ ) {
                            c[ii][jj] += a[ii][kk] * b[kk][jj];
                        }
                    }
                }
            }
        }
    }

Performance of the blocked code
Matrix dimension   Block   Execution time [sec]   Performance [MFLOPS]   “Trivial” [MFLOPS]
256x256            4       0.065                  513                    284
                   8       0.046                  726
                   16      0.051                  657
                   32      0.043                  777
                   64      0.049                  677
                   128     0.113                  296
512x512            4       0.686                  391                    130
                   8       0.422                  635
                   16      0.447                  599
                   32      0.501                  535
                   64      1.00                   266
                   128     0.994                  269
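The blocked loop nest can be checked against the trivial one before timing it. The sketch below makes several assumptions that are not on the slides: matrices are stored as flat row-major arrays of `double`, the names `matmul_trivial`, `matmul_blocked` and `blocked_matches_trivial` are hypothetical, and the block bounds are clamped to `dim` so that the code also works when `dim` is not a multiple of `block` (the slide version assumes it is).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* c += a * b for row-major dim x dim matrices, plain triple loop. */
void matmul_trivial(int dim, const double *a, const double *b, double *c)
{
    for (int i = 0; i < dim; i++)
        for (int j = 0; j < dim; j++)
            for (int k = 0; k < dim; k++)
                c[i * dim + j] += a[i * dim + k] * b[k * dim + j];
}

static int min_int(int x, int y) { return x < y ? x : y; }

/* Same computation, traversed in block x block tiles so that the tiles of
 * a, b and c being worked on have a better chance of staying in cache. */
void matmul_blocked(int dim, int block,
                    const double *a, const double *b, double *c)
{
    assert(dim > 0 && block > 0);
    for (int i = 0; i < dim; i += block) {
        int i_end = min_int(i + block, dim);   /* clamp: an addition to the slide code */
        for (int j = 0; j < dim; j += block) {
            int j_end = min_int(j + block, dim);
            for (int k = 0; k < dim; k += block) {
                int k_end = min_int(k + block, dim);
                for (int ii = i; ii < i_end; ii++)
                    for (int jj = j; jj < j_end; jj++)
                        for (int kk = k; kk < k_end; kk++)
                            c[ii * dim + jj] += a[ii * dim + kk] * b[kk * dim + jj];
            }
        }
    }
}

/* Hypothetical check: fills a and b with small integers, runs both
 * versions on zeroed outputs and returns 1 if the results are identical. */
int blocked_matches_trivial(int dim, int block)
{
    size_t n = (size_t)dim * (size_t)dim;
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c1 = calloc(n, sizeof *c1);
    double *c2 = calloc(n, sizeof *c2);
    int ok = (a && b && c1 && c2);
    if (ok) {
        for (size_t x = 0; x < n; x++) {
            a[x] = (double)(x % 7);
            b[x] = (double)(x % 5);
        }
        matmul_trivial(dim, a, b, c1);
        matmul_blocked(dim, block, a, b, c2);
        for (size_t x = 0; x < n; x++)
            if (c1[x] != c2[x])
                ok = 0;
    }
    free(a); free(b); free(c1); free(c2);
    return ok;
}
```

Blocking only changes the order in which the (i, j, k) index triples are visited; each `c` element still accumulates its products in ascending `k`, so the two versions should agree exactly. Which block size is fastest depends on the cache sizes of the machine, as the table above suggests, so it is worth measuring several values rather than assuming one.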