Brown EN 164 - Homework Assignment #1 - D2633506

School: Brown University

Course: EN 164 - Design of Computing Systems

Pages 3

This preview shows page 1 out of 3 pages.

EN164: Design of Computing Systems
Homework Assignment #1
Due Friday, February 23, 2007, in class

1) As mentioned in class, assembly languages sometimes use “pseudoinstructions” to simplify translation and programming. Given the MIPS pseudoinstructions below, produce a minimal sequence of actual MIPS instructions that accomplishes the same thing. If you need another register, you can use $at for this purpose. In the following table, big refers to a specific number that requires 32 bits to represent and small to a number that fits in 16 bits.

    Pseudoinstruction        Function
    move $t1, $t2            $t1 = $t2
    clear $t0                $t0 = 0
    beq $t1, small, L        if ($t1 = small) goto L
    beq $t2, big, L          if ($t2 = big) goto L
    li $t1, small            $t1 = small
    li $t2, big              $t2 = big
    ble $t3, $t5, L          if ($t3 <= $t5) goto L
    bgt $t4, $t5, L          if ($t4 > $t5) goto L
    bge $t5, $t3, L          if ($t5 >= $t3) goto L
    addi $t0, $t2, big       $t0 = $t2 + big
    lw $t5, big($t2)         $t5 = memory[$t2 + big]

What would be the implications for the hardware if these pseudoinstructions were implemented directly in hardware?

2) In this exercise, assume that we are considering enhancing a machine by adding vector hardware to it. When a computation is run in vector mode on the vector hardware, it is 10 times faster than the normal mode of execution. We call the percentage of time that could be spent using vector mode the percentage of vectorization.

a. What percentage of vectorization is needed to achieve a speedup of 2?

b. What percentage of the computation run time is spent in vector mode if a speedup of 2 is achieved?

c. What percentage of vectorization is needed to achieve one-half the maximum speedup attainable from using vector mode?

d. Suppose you have measured the percentage of vectorization for programs to be 70%. The hardware design group says they can double the speed of the vector hardware with a significant additional engineering investment. You wonder whether the compiler crew could increase the use of vector mode as another approach to increasing performance. How much of an increase in the percentage of vectorization (relative to current usage) would you need to obtain the same performance gain as doubling vector hardware speed? Which investment would you recommend?

3) Assume that we make an enhancement to a computer that improves some mode of execution by a factor of 10. Enhanced mode is used 50% of the time, measured as a percentage of the execution time when the enhanced mode is in use. Recall that Amdahl’s Law depends on the fraction of the original, unenhanced execution time that could make use of enhanced mode. Thus, we cannot directly use this 50% measurement to compute speedup with Amdahl’s Law.

a. What is the speedup we have obtained from fast mode?

b. What percentage of the original execution time has been converted to fast mode?

4) Consider the following code fragment:

    Loop: LD    R1, 0(R2)     ; load R1 from address 0+R2
          DADDI R1, R1, #1    ; R1 = R1+1
          SD    0(R2), R1     ; store R1 at address 0+R2
          DADDI R2, R2, #4    ; R2 = R2+4
          DSUB  R4, R3, R2    ; R4 = R3-R2
          BNEZ  R4, Loop      ; branch to Loop if R4 != 0

Assume that the initial value of R3 is R2 + 396. Throughout this exercise use the classic RISC 5-stage integer pipeline (see Figure A.1) and assume all memory accesses take 1 clock cycle.

a. Show the timing of this instruction sequence for the RISC pipeline without any forwarding or bypassing hardware, but assuming a register read and write in the same clock cycle “forwards” through the register file, as in Figure A.6. Assume that the branch is handled by flushing the pipeline. How many cycles does this loop take to execute? Use a pipeline timing chart as in Figure A.1 or A.6 to show your work.

b. Show the timing of this instruction sequence for the RISC pipeline with normal forwarding and bypassing hardware. Assume that the branch is handled by predicting it as not taken. How many cycles does this loop take to execute?

c. Assume the RISC pipeline with a single-cycle delayed branch and normal forwarding and bypassing hardware. Schedule the instructions in the loop including the branch delay slot. You may reorder instructions and modify the individual instruction operands, but do not undertake other loop transformations that change the number or opcode of the instructions in the loop. Now how many cycles does this loop take to execute?

5) Loop unrolling is a common compiler optimization technique used to improve program performance. The body of the loop is repeated some number of times (using different data), while the end-of-loop test only needs to be scheduled once. The compiler then statically reorders the instructions to optimize performance through the pipeline. The following code has been unrolled once but not yet scheduled. Assume the loop index is a multiple of two (i.e. $r10 is a multiple of 8):

    Loop: lw   $r2, 0($r10)
          sub  $r4, $r2, $r3
          sw   $r4, 0($r10)
          lw   $r5, 4($r10)
          sub  $r6, $r5, $r3
          sw   $r6, 4($r10)
          addi $r10, $r10, 8
          bne  $r10, $r30, Loop

Schedule this code for fast execution on the standard MIPS pipeline. Assume initially that $r10 is 0 and $r30 is 400 and that branches are resolved in the MEM stage. How does the scheduled code compare against the original unscheduled code?

6) Consider adding a new index addressing mode to MIPS. The addressing mode adds two registers and an 11-bit signed offset to get the effective address. Our compiler will be changed so that code sequences of the form

    ADD R1, R1, R2
    LW  Rd, 100(R1)    (or store)

will be replaced with a load (or store) using the new addressing mode. Use the overall average instruction frequencies from Figure B.27 in the class textbook in evaluating this addition.

a. Assume that the addressing mode can be used for 10% of the displacement loads and stores (accounting for both the frequency of this type of address calculation and the shorter offset). What is the ratio of instruction count on the enhanced MIPS compared to the original MIPS?

b. If the new addressing mode lengthens the clock cycle by 5%, which machine will be faster and by how
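
Questions 2, 3, and 6 lean on two standard relations: Amdahl's Law and the CPU time equation (CPU time = instruction count x CPI x clock cycle time). As a reminder, here is a minimal Python sketch of both. The function names and the sample numbers are illustrative assumptions, not part of the assignment, and the second helper assumes CPI is unchanged unless a ratio is supplied.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup according to Amdahl's Law.

    fraction_enhanced: fraction (0.0 to 1.0) of the ORIGINAL, unenhanced
        execution time that can use the enhancement.
    speedup_enhanced: how many times faster the enhanced mode runs.
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # New time = untouched part + enhanced part shrunk by its speedup.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


def relative_cpu_time(ic_ratio: float, cycle_time_ratio: float,
                      cpi_ratio: float = 1.0) -> float:
    """New CPU time divided by old CPU time.

    Each argument is a new/old ratio. A result below 1.0 means the new
    machine is faster. cpi_ratio defaults to 1.0 (assumed unchanged CPI).
    """
    return ic_ratio * cpi_ratio * cycle_time_ratio


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the helpers.
    print(amdahl_speedup(0.4, 5.0))
    print(relative_cpu_time(0.9, 1.1))
```

Note that `amdahl_speedup` expects the fraction measured against the original execution time, which is exactly the distinction question 3 draws; a fraction measured while the enhancement is in use has to be converted before it can be passed in.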
