CS152 Quiz 1 Answer Guide
Distributed 2/25/2008

Problem Q1: Microprogramming Bus-Based Architectures (28 points)

In this problem, we explore microprogramming by writing microcode for the bus-based implementation of the MIPS machine described in Handout 1 (Bus-Based MIPS Implementation), which we have included at the end of this quiz for your reference. In order to further simplify this problem, ignore the busy signal, and assume that the memory is as fast as the register file. The final solution should be elegant and efficient.

You are to implement, in microcode, a double indirect addressing mode, as described below. In this addressing mode, the source register contains a pointer to a location in memory whose value is a pointer to the location in memory whose value is to be loaded.

The instruction has the following format:

    LWmm rd, rs

LWmm performs the following operation:

    rd <- M[M[rs]]

Fill in Worksheet Q1-1 with the microcode for LWmm. Use don't cares for fields where it is safe to use don't cares. Study the hardware description well, and make sure all your microinstructions are legal.

Please comment your code clearly. If the pseudo code for a line does not fit in the space provided, or if you have additional comments, you may write in the margins as long as you do it neatly. Your code should exhibit "clean" behavior and not modify any registers except rd in the course of executing the instruction.

Finally, make sure that the instruction fetches the next instruction (i.e., by doing a microbranch to FETCH0 as discussed in the Handout).

Worksheet Q1-1

Control signal columns, left to right: ld IR, Reg Sel, Reg W, en Reg, ld A, ld B, ALUOp, en ALU, ld MA, Mem W, en Mem, Ex Sel, en Imm. The control values in each row below are listed left to right.

State    PseudoCode                                  Control values               μBr   Next State
FETCH0   MA <- PC; A <- PC                           0 PC 0 1 1 0 1 0 0           N
         IR <- Mem                                   1 0 0 0 0 0 1 0              N
         PC <- A+4                                   0 PC 1 1 0 INC_A_4 1 0 0     D
NOP0     microbranch back to FETCH0                  0 0 0 0 0                    J     FETCH0
LWMM0    MA <- R[rs]                                 rs 0 1 0 1 0 0               N
         MA <- Mem                                   0 0 1 0 1 0                  N
         R[rd] <- Mem; ubranch back to fetch         rd 1 1 0 0 1 0               J     FETCH0

Problem Q2: Dual ALU Pipeline (33 points)

Problem Q2.A: ALU Usage

Instruction           ALU1 or ALU2
add r1, r2, r3        ALU1
lw  r4, 0(r1)
add r5, r4, r6        ALU2
add r7, r5, r8        ALU2
add r1, r2, r3        ALU1
lw  r4, 0(r1)
add r5, r1, r6        ALU1

The following timeline diagrams the execution of the instructions, with the stage where each instruction produces its result highlighted in bold, and the bypassing between instructions shown by arrows.

add1   IF   ID   EX1  EX2  WB
lw1         IF   ID   EX1  MEM  WB
add2             IF   ID   EX1  EX2  WB
add3                  IF   ID   EX1  EX2  WB
add4                       IF   ID   EX1  EX2  WB
lw2                             IF   ID   EX1  MEM  WB
add5                                 IF   ID   EX1  EX2  WB

The pipeline is initially idle, so the first add reads its operands from the register file in ID and uses ALU1.

The second add uses the result of the lw, which is not available by the end of ID; therefore, the add uses ALU2, and the load data is bypassed to it at the end of EX1.

The third add uses the result of the second, so its data is not available by the end of ID; it also uses ALU2, allowing the data to be bypassed to it at the end of EX1.

The fourth add has no dependencies on the previous instructions; it reads its operands from the register file in ID and uses ALU1.

The fifth add uses the result of the fourth add. This value is bypassed to it at the end of ID (from EX2/MEM), and it uses ALU1.

Problem Q2.B: Instruction Sequences Causing Stalls

Sequence 1:
    add r1, r2, r3
    lw  r4, 0(r1)
Stall? No.
Explanation: The add (in EX1) uses ALU1 and bypasses its result to the LW (in ID).

Sequence 2:
    lw  r1, 0(r2)
    add r3, r1, r4
    lw  r5, 0(r1)
Stall? No.
Explanation: The first LW (in EX2/MEM) bypasses its result to the add (in EX1), which will use ALU2, and also to the second LW (in ID).

Sequence 3:
    lw  r1, 0(r2)
    lw  r3, 0(r1)
Stall? Yes.
Explanation: The result of the first LW (in EX1) is not available in time for the second LW (in ID), so the second LW must stall.

Sequence 4:
    lw  r1, 0(r2)
    sw  r1, 0(r3)
Stall? No.
Explanation: The LW (in EX2/MEM) bypasses its result to the SW (in EX1) in time for it to store the data (in EX2/MEM).

Sequence 5:
    lw  r1, 0(r2)
    add r3, r1, r4
    sw  r5, 0(r3)
Stall? Yes.
Explanation: The LW (in EX2/MEM) bypasses its result to the add (in EX1), which will use ALU2. But the result of the add (in EX1) is not available in time for the SW (in ID), so the SW must stall.

Sequence 6:
    lw  r1, 0(r2)
    add r3, r1, r4
Stall? No.
Explanation: The LW (in EX2/MEM) bypasses its result to the add (in EX1), which will use ALU2.

Note that the base address operand for both LW and SW must be available by the end of ID, but the data operand for SW must only be available by the end of EX1.

Problem Q3: Processor Design, Short Yes/No Questions (10 points)

The following questions describe two variants of a processor which are otherwise identical. In each case, circle "Yes" if the variants might generate different results from the same compiled program, and circle "No" otherwise. You must also briefly explain your reasoning. Ignore differences in the time each machine takes to execute the program.

Problem Q3.A: Interlock vs. Bypassing

No. Data dependencies are preserved with either interlocks or bypassing, so the processors always generate the same results. Bypassing improves performance by eliminating stalls.

Problem Q3.B: Delay Slot

Yes. The instruction following a taken branch is executed on processor A but killed on processor B, so the processors can generate different results.

Problem Q3.C: Structural Hazard

No. Both processors retrieve the same data values. There is only a performance difference, because processor A must stall an instruction fetch to allow a load instruction to access memory.

Problem Q3.D: Microcode Size

No. A wide variety of possible microcoded machines can implement the same user-level ISA semantics and generate the same results for all programs.

Problem Q3.E: Register Size

Either answer, depending on assumptions about microcode/ISA changes.

No. With appropriate microcode, both machines could generate identical results for a 32-bit ISA. Also, machine A could implement a 64-bit ISA using two 32-bit registers for each 64-bit value and carefully handling overflow conditions.

Yes. Assuming microcode was literally unchanged, the machines would generate different results due to the different overflow properties of 32-bit and 64-bit registers. For example, if a value is shifted left, bits are lost using 32-bit registers that are retained with 64-bit registers.

Problem Q4 …
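The ALU selection and stall reasoning in Problem Q2 can be checked mechanically. The following is a minimal sketch, not part of the original answer guide, and it rests on assumptions read off the explanations above: an add uses ALU1 when all its operands are available by the end of ID and ALU2 when they are available by the end of EX1; an ALU1 result exists at the end of EX1; an ALU2 result and load data exist at the end of EX2/MEM; LW and SW need their base register by the end of ID; SW needs its data by the end of EX1. The names `Instr`, `schedule`, and `q2a` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    op: str                      # "add", "lw", or "sw"
    dst: Optional[str]           # destination register; None for sw
    srcs: Tuple[str, ...]        # add: both sources; lw/sw: base register only
    data: Optional[str] = None   # sw only: register holding the store data


def schedule(program):
    """Return one (op, alu, stalls) tuple per instruction.

    Cycles are numbered by the ID stage: an instruction in ID during cycle c
    is in EX1 during c + 1 and in EX2/MEM during c + 2 (assumed timing).
    """
    ready = {}      # register -> cycle at whose end its newest value exists
    results = []
    id_cycle = 0
    for ins in program:
        if ins.op not in ("add", "lw", "sw"):
            raise ValueError(f"unsupported op: {ins.op}")
        id_cycle += 1            # earliest ID cycle if there is no stall
        stalls = 0
        while True:
            by_id = all(ready.get(r, 0) <= id_cycle for r in ins.srcs)
            by_ex1 = all(ready.get(r, 0) <= id_cycle + 1 for r in ins.srcs)
            if ins.op == "add" and by_id:
                alu, done = "ALU1", id_cycle + 1   # result at end of EX1
                break
            if ins.op == "add" and by_ex1:
                alu, done = "ALU2", id_cycle + 2   # result at end of EX2
                break
            if ins.op == "lw" and by_id:
                alu, done = None, id_cycle + 2     # data at end of MEM
                break
            if (ins.op == "sw" and by_id
                    and ready.get(ins.data, 0) <= id_cycle + 1):
                alu, done = None, None             # sw writes no register
                break
            id_cycle += 1        # hold the instruction in ID for one cycle
            stalls += 1
        if ins.dst is not None:
            ready[ins.dst] = done
        results.append((ins.op, alu, stalls))
    return results


# The instruction sequence from Problem Q2.A.
q2a = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("lw", "r4", ("r1",)),
    Instr("add", "r5", ("r4", "r6")),
    Instr("add", "r7", ("r5", "r8")),
    Instr("add", "r1", ("r2", "r3")),
    Instr("lw", "r4", ("r1",)),
    Instr("add", "r5", ("r1", "r6")),
]

if __name__ == "__main__":
    for op, alu, stalls in schedule(q2a):
        print(op, alu, stalls)
```

Registers that no earlier instruction writes are treated as already available from the register file. The model tracks only the timing rules listed above; it does not model instruction fetch, memory contents, or structural hazards.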