GT CS 4803 - Sample Quiz
School name Georgia Tech
Pages 7


School of Computer Science
Georgia Institute of Technology
CS4803DGC, Spring 2010
Prof. Hyesoon Kim
Sample Quiz, Feb 8, 2010

Name :
GTID :

Problem 1 (10 points):
Problem 2 (10 points):
Problem 3 (20 points):
Problem 4 (10 points):
Problem 5 (10 points):
Problem 6 (10 points):
Total (70 points):

Note: Please be sure that your answers to all questions (and all supporting work that is required) are contained in the space provided.
Note: Please be sure your name is recorded on each sheet of the exam.

GOOD LUCK!

Problem 1 (10 points):
How many cycles would it take to execute the following code segments in the following pipeline design? Assume that a register write and a register read can be performed in the same cycle.

[Figure 1: case c. Pipeline diagram with FE_stage, ID_stage, EX_stage, MEM_stage, and WB_stage; labeled blocks: PC, I-cache, + size, REG, SIGN EXT, Branch Unit.]

1. ADD R0, R1, R2
   XOR R2, R1, R0

2. ADD R0, R1, R2
   AND R0, R3, R4

3. AND R2, R1, x0
   ADD R1, R6, R1

4. ADD R7, R1, R2
   BRz X // This branch is taken.
   X XOR R2, R3, R0

Problem 2 (10 points):
Part a. (5 pts) List at least two hardware structures that must be replicated in a data path to support SMT architectures.
Part b. (5 pts) Discuss at least two major differences between designing game console architectures and desktop processors.

Problem 3 (20 points):
Part a. (5 pts) The Xbox 360 employs several write merge buffers (store-gathering buffers). Discuss the benefits of these buffers.
Part b. (5 pts) If the cache block size is 4B instead of 128B, is the write merge buffer still useful? Explain the reason.
Part c. (5 pts) What is the cache-set-locking mechanism, and what is the benefit of using it in the Xbox 360?
Part d. (5 pts) Discuss the negative effects when prefetching requests are not accurate.

Problem 4 (10 points):
Part a. (5 pts) A GPU has 8 SMs and each SM has 512 floating point units. The latency of an ADD/MUL operation is 1 cycle each and the latency of DIV is 4 cycles. The frequency of the SM is 1GHz. What is the peak flop/s?
Part b. (5 pts) Discuss differences between superscalar processors and SIMD processors.

Problem 5 (10 points):
Describe how you would implement the following code in CUDA.

for (ii = 1; ii < 200000; ii=ii+2) {
    sum += X[ii-1] + X[ii];
}

Problem 6 (10 points):
A new processor has 5-wide SIMD units. Its ISA provides SIMADD, SIMLDB, and SIMLDW, along with ADD, LDB (Load Byte), LDW (Load Word), and BR. The source code in (a) is translated into the RISC code in (b). Convert the code into a SIMD style using the above instructions.

for (ii = 1; ii < 200000; ii=ii+2) {
    sum += X[ii-1] + X[ii];
}
(a) original source code (X is double word type)

     MOV R0, #1
LOOP ADD R1, R3, R0
     ADD R2, R1, -1
     LDW R5 MEM[R1]
     LDW R6 MEM[R2]
     ADD R0, R0, #2
     BR.LESS R0, #200000, LOOP
(b) RISC code

ADD R1, R3, R0 means R1=R3+R0. BR.LESS R0, #200000, LOOP means: if R0 is less than 200000, jump to LOOP. MOV R0, #1 means R0=#1. LD R4 MEM[R1] means
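
The loop shared by Problems 5 and 6 adds X[ii-1] and X[ii] for every odd ii from 1 to 199999, so it visits each element of X[0..199999] exactly once. The sketch below is not an official solution and is written in plain C rather than CUDA or the quiz's SIMD ISA. It illustrates one possible way to expose the parallelism: assign the pairs to a fixed number of workers, let each worker accumulate a private partial sum, then combine the partial sums. The names N, NUM_WORKERS, sum_pairs_serial, and sum_pairs_chunked are hypothetical, X is assumed to hold doubles, and the worker loop runs sequentially here as a stand-in for parallel threads.

```c
#include <assert.h>

#define N 200000         /* loop bound taken from the quiz */
#define NUM_WORKERS 256  /* hypothetical worker (thread) count */

/* Reference version: the loop exactly as written in the quiz. */
static double sum_pairs_serial(const double *X) {
    double sum = 0.0;
    for (int ii = 1; ii < N; ii = ii + 2) {
        sum += X[ii - 1] + X[ii];
    }
    return sum;
}

/* Chunked version: worker w handles pairs w, w + NUM_WORKERS, ...
 * and keeps a private partial sum; the partial sums are then
 * combined in a final pass. */
static double sum_pairs_chunked(const double *X) {
    assert(N % 2 == 0);  /* the pair decomposition assumes an even bound */
    double partial[NUM_WORKERS] = {0.0};
    int num_pairs = N / 2;

    for (int w = 0; w < NUM_WORKERS; w++) {
        for (int p = w; p < num_pairs; p += NUM_WORKERS) {
            int ii = 2 * p + 1;  /* same odd index the quiz loop uses */
            partial[w] += X[ii - 1] + X[ii];
        }
    }

    double sum = 0.0;
    for (int w = 0; w < NUM_WORKERS; w++) {
        sum += partial[w];
    }
    return sum;
}
```

Calling both functions on the same array should give matching totals when the values are exactly representable (for example, an array filled with 1.0). For general floating-point data the two results may differ slightly, because the additions happen in a different order.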

