GT CS 4803 - CS 4803 Quiz 1 - D2394671

School of Computer Science
Georgia Institute of Technology
CS4803DGC, Spring 2011
Prof. Hyesoon Kim

Sample Quiz

Name :
GTID :

Problem 1 (10 points):
Problem 2 (10 points):
Problem 3 (20 points):
Problem 4 (10 points):
Problem 5 (10 points):
Problem 6 (10 points):
Total (70 points):

Note: Please be sure that your answers to all questions (and all supporting work that is required) are contained in the space provided.
Note: Please be sure your name is recorded on each sheet of the exam.

GOOD LUCK!


Problem 1 (10 points):

How many cycles would it take to execute the following code segments in the following pipeline design? Assume that a register write and a register read can be performed in the same cycle.

[Figure 1: case c. Pipeline diagram with FE_stage, ID_stage, EX_stage, MEM_stage, and WB_stage; labeled blocks: PC, I-cache, REG, SIGN EXT, Branch Unit, + size.]

1.    ADD R0, R1, R2
      XOR R2, R1, R0

2.    ADD R0, R1, R2
      AND R0, R3, R4

3.    AND R2, R1, x0
      ADD R1, R6, R1

4.    ADD R7, R1, R2
      BRz X              // This branch is taken.
    X XOR R2, R3, R0


Problem 2 (10 points):

Part a. (5 pts) List at least 2 hardware structures that must be replicated in a data path to support SMT architectures.

Part b. (5 pts) Discuss at least two major differences between designing game console architectures and desktop processors.


Problem 3 (20 points):

Part a. (5 pts) The Xbox 360 employs several write merge buffers (store gathering buffers). Discuss the benefits of these buffers.

Part b. (5 pts) If the cache block size is 4B instead of 128B, is the write merge buffer still useful? Explain the reason.

Part c. (5 pts) What is the cache-set-locking mechanism, and what is the benefit of using it on the Xbox 360?

Part d. (5 pts) Discuss the negative effects when prefetching requests are not accurate.


Problem 4 (10 points):

Part a. (5 pts) A GPU has 8 SMs and each SM has 512 floating point units. The latency of an ADD/MUL operation is 1 cycle each and the latency of DIV is 4 cycles. The frequency of the SM is 1GHz. What is the peak flop/s?

Part b. (5 pts) Discuss the differences between superscalar processors and SIMD processors.


Problem 5 (10 points):

Describe how you would implement the following code in CUDA.

    for (ii = 1; ii < 200000; ii=ii+2) {
        sum += X[ii-1] + X[ii];
    }


Problem 6 (10 points):

A new processor has 5-wide SIMD units. It provides SIMADD, SIMLDB, and SIMLDW, along with ADD, LDB (Load Byte), LDW (Load Word), and BR. The source code in (a) is translated into a RISC ISA as shown in (b). Convert the code into a SIMD style using the above instructions.

    for (ii = 1; ii < 200000; ii=ii+2) {
        sum += X[ii-1] + X[ii];
    }

(a) original source code (X is double word type)

         MOV R0, #1
    LOOP ADD R1, R3, R0
         ADD R2, R1, -1
         LDW R5 MEM[R1]
         LDW R6 MEM[R2]
         ADD R0, R0, #2
         BR.LESS R0, #200000, LOOP

(b) RISC code

ADD R1, R3, R0 means R1=R3+R0. BR.LESS R0, #200000, LOOP means: if R0 is less than 200000, jump to LOOP. MOV R0, #1 means R0=#1, LD R4 MEM[R1] means
