DOC PREVIEW
CORNELL CS 3410 - 3410_2016sp_prelim2_soln

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Prelim 2Computer Science 3410, Cornell UniversitySpring 2016, Prof. Bracy 5 May 2016Solutions1. Data and Control Hazards [16 pts](a) [8 pts] Forwarding. In the pipeline below, the register file is read in the 1sthalf of the cycle and written in the 2nd half of the cycle. Bypasses #1-#5 areshaded. In the D/X latch, the 1st operand to an ALU instruction is stored in theA slot of the latch. The 2nd operand is stored in the B slot of the latch.RegisterFileSXrA rBrDDataMemadinstABinstDBinstDMinstF D X M W#1#2#3#5instmem#4lw r5,0(r2) sw r3,0(r2) addi r2,r1,1sub r4,r2,r3For each input specified below, state where the instruction gets its data. 1-5 refer to the bypasses shown in the above figure. RF stands for register file.ERROR means that the instruction cannot possibly get its data in order to executecorrectly given the processor above and the timing specified in the diagram.Circle Oner2 of the sw #1 #2 #3 #4 #5 RF ERRORr2 of the sub #1 #2 #3 #4 #5 RF ERRORr3 of the sub #1 #2 #3 #4 #5 RF ERRORr2 of the lw #1 #2 #3 #4 #5 RF ERRORAnswer: #4, #3, RF, ERROR(b) [8 pts] Control Dependences. Suppose we had asked you to implementa 2-instruction control delay slot for Project 2. In other words, the next twoinstructions immediately following a control instruction will always be executed.Which of the following statements about this concept are true? Check all thatapply.12 delay slots would be harder to implement in Logisim than 1.It will be harder for a compiler to fill the 2nd delay slot than to fill the 1st.Processors with 1 delay slot benefit more from branch prediction than onewith 2 delay slots.A processor with 2 delay slots might support a faster clock than one withonly 1.Answer: Items 2-4 should be checked. (Item 1 is not not true.)2. ISAs [8 pts]For each multiple choice question there is only one correct answer.(a) [2 pts] What is one advantage of a CISC ISA?A It naturally supports a faster clock.B Instructions are easier to decode.C The static footprint of the code will be smaller.D The code is easier for a compiler to optimize.E You have a lot of registers to use.Correct Answer: C(b) [2 pts] What processor feature does a compiler not need to know about?A There is a control delay slot.B The number of inputs each instruction can have.C The processor performs statically-scheduled multiple issue execution.D The processor performs dynamically-scheduled multiple issue execution.E Whether the processor supports predication.Correct Answer: D(c) [2 pts] What is not a good reason to limit the number of instructions offeredby a particular ISA?A Offering more instructions leads to executables with more static instructions.B More complicated instructions are difficult to pipeline.C More instructions may lead to needing more bits in the opcode field whichcould make static instructions larger.D It is very difficult to ever stop supporting a particular instruction once it hasbeen added to the ISA.E More instructions will complicate the decode logic of the processor.Correct Answer: A2(d) [2 pts] Which of the following statements about ISAs is true?A Because clock frequencies are no longer increasing, a return to CISC ISAsmakes sense.B Which cache coherence protocol is implemented needs to needs to be specifiedin the ISA.C Atomic operations need to be specified in the ISA.D ISA design has stagnated; no new ISAs of any commercial significance haveappeared in the past 10 years.E A system call is OS-specific and not an ISA-level feature.Correct Answer: C3. Calling Conventions [16 pts]The following assembly is the body of a function compiled for a MIPS processor witha control delay slot:Body: addiu $s3, $a0, 0addiu $s5, $a1, 0addiu $s6, $a2, 0Loop: sll $t1,$s3,2add $t1,$t1,$s6lw $t0,0($t1)bne $t0,$s5, Exitnopaddi $s3,$s3,1j LoopnopExit: addiu $a0, $s3, 0jal recordnopaddiu $v0, $s3, 0Write the prologue and epilogue for this body. Use the class calling conventions.PROLOGUE: EPILOGUE:ADDIU $sp, $sp, -36 LW $s6, 16($sp)SW $ra, 32($sp) LW $s5, 20($sp)SW $fp, 28($sp) LW $s3, 24($sp)SW $s3, 24($sp) LW $fp, 28($sp)SW $s5, 20($sp) LW $ra, 32($sp)SW $s6, 16($sp) ADDIU $sp, $sp, 36ADDIU $fp, $sp, 32 JR $raNOP34. Linkers and Loaders. [10 pts]Below is a modified version of heaptest.c from Project 4:----------------------------------------------------------------#include <stdio.h>#include heaplib.h#define HEAP SIZE 16static int ARR SIZE = 4;int main() {char heap[HEAP SIZE];hl init(heap, HEAP SIZE * sizeof(char));char* ptr = (char *) hl alloc(heap, ARR SIZE * sizeof(char));ptr[0] = ’h’;ptr[1] = ’i’;ptr[2] = ’\0’;printf(%s\n, ptr);return 0;}----------------------------------------------------------------Where does the assembler place each of the following symbols from the above programin the object file that it creates?Choose one:(a) Text Segment(b) Data Segment(c) Exported reference in the symbol table(d) Imported reference in the symbol table(e) None of the aboveCircle One:(1) HEAP SIZE E(2) ARR SIZE B(3) heap[] E(4) hl init D(5) ’h’ B45. Caches [24 pts] (parts a–d)(a) [9 pts] For each workload, choose the best block size for your cache among thechoices given. Assume that integers and pointers are all 4 bytes each. Use theintuitions you have developed in this class to decide what best means for a cache.Choose one: (a) 1 byte (b) 4 bytes (c) 8 bytes (d) 16 bytes (e) 32 bytesBEST BLOCK SIZE (Circle One)int scores[NUM STUDENTS] = 0; A B C D Eint sum = 0;for (i = 0; i < NUM STUDENTS; i++) {sum += scores[i];}------------------------------------------------------------------------typedef struct item t {int value; A B C D Eitem t *next;char *name;} item t;int sum = 0;item t *curr = list head;while (curr != NULL) {sum += curr->value;curr = curr->next;}------------------------------------------------------------------------// H = 16, W = 16 Assume a 256 byte cache.int A[H][W];for(x=0; x < W; x++) A B C D Efor(y=0; y < H; y++)sum += A[y][x];Answers: E, C, B(b) [9 pts] Memory Access Time. Processor A is a 1 GHz processor (1 cycle =1 ns). There is an L1 cache with a 1 cycle access time and a 50% hit rate. Thereis an L2 cache with a 10 cycle access time and a 90% hit rate. It takes 100 ns toaccess memory.(3 points) What is the average memory access time of Processor A in ns?51 + 50% (10 + 10% x 100) =1 + 5 + 5 = 11 nsProcessor B attempts to improve upon the hit rate of Processor A by making theL1 cache larger. The new hit rate is 90%. In order to maintain a 1 cycle accesstime, Processor B can only run at 500 MHz


View Full Document

CORNELL CS 3410 - 3410_2016sp_prelim2_soln

Download 3410_2016sp_prelim2_soln
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view 3410_2016sp_prelim2_soln and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view 3410_2016sp_prelim2_soln 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?