Unformatted text preview:

CS152 Computer Architecture and Engineering Assigned March 9 Complex Pipelines Problem Set 3 Due March 18 http inst eecs berkeley edu cs152 sp10 The problem sets are intended to help you learn the material and we encourage you to collaborate with other students and to ask questions in discussion sections and office hours to understand the problems However each student must turn in his own solution to the problems The problem sets also provide essential background material for the quizzes The problem sets will be graded primarily on an effort basis but if you do not work through the problem sets you are unlikely to succeed at the quizzes We will distribute solutions to the problem sets on the day the problem sets are due to give you feedback Homework assignments are due at the beginning of class on the due date Late homework will not be accepted 1 Problem 3 1 Superscalar Processor Consider the out of order superscalar CPU shown in the diagram It has the following features o Four fully pipelined functional units ALU MEM FADD FMUL o Instruction Fetch and Decode Unit that renames and sends 2 instructions per cycle to the ROB assume perfect branch prediction and no cache misses o An unbounded length Reorder Buffer that can perform the following operations on every cycle o Accept two instructions from the Instruction Fetch and Decode Unit o Dispatch an instruction to each functional unit including Data Memory o Let Writeback update an unlimited number of entries o Commit up to 2 instructions in order o There is no bypassing or short circuiting For example data entering the ROB cannot be passed on to the functional units or committed in the same cycle ALU Data Mem Instruction Queue 2 Instr per cycle ROB Issue as many as possible Writeback as many as possible infinite FADD Commit at most 2 instr per cycle FMUL Regfile 2 Now consider the execution of the following program on this machine using I1 I2 I3 I4 I5 I6 I7 I8 I9 I10 I11 I12 I13 I14 loop LD F2 0 R2 LD F3 0 R3 FMUL F4 F2 F3 LD F2 4 R2 LD F3 4 R3 FMUL F5 F2 F3 FMUL F6 F4 F5 FADD F4 F4 F5 FMUL F6 F4 F5 FADD F1 F1 F6 ADD R2 R2 8 ADD R3 R3 8 ADD R4 R4 1 BNEZ R4 loop Problem 3 1 A Fill in the renaming tags in the following two tables for the execution of instructions I1 to I10 Tags should not be reused Instr I1 I2 I3 I4 I5 I6 I7 I8 I9 I10 Instruction LD F2 0 R2 LD F3 0 R3 FMUL F4 F2 F3 LD F2 4 R2 LD F3 4 R3 FMUL F5 F2 F3 FMUL F6 F4 F5 FADD F4 F4 F5 FMUL F6 F4 F5 FADD F1 F1 F6 Dest T1 T2 Src1 R2 R3 Src2 0 0 R2 R3 4 4 F1 Renaming table I1 R2 R3 F1 F2 F3 F4 F5 F6 I2 I3 I4 I5 T1 T2 3 I6 I7 I8 I9 I10 Problem 3 1 B Consider the execution of one iteration of the loop I1 to I14 In the following diagram draw the data dependencies between the instructions after register renaming Problem 4 1 C The attached table is a data structure to record the times when some activity takes place in the ROB For example one column records the time when an instruction enters ROB while the last two columns record respectively the time when an instruction is dispatched to the FU s and the time when results are written back to the ROB This data structure has been designed to test your understanding of how a Superscalar machine functions Fill in the blanks in last two columns up to slot T13 You may use the source columns for book keeping no credit will be taken off for the wrong entries there 4 Slot Instruction Cycle instruction entered ROB T1 T2 LD F2 0 R2 LD F3 0 R3 1 1 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 FMUL F4 F2 F3 LD F2 4 R2 LD F3 4 R3 FMUL F5 F2 F3 FMUL F6 F4 F5 FADD F4 F4 F5 FMUL F6 F4 F5 FADD F1 F1 F6 ADD R2 R2 8 ADD R3 R3 8 ADD R4 R4 1 2 2 3 3 4 4 5 5 6 6 7 T14 T15 T16 T17 T18 T19 T20 T21 T22 T23 T24 T25 BNEZ R4 loop LD F2 0 R2 LD F3 0 R3 FMUL F4 F2 F3 LD F2 4 R2 LD F3 4 R3 FMUL F5 F2 F3 FMUL F6 F4 F5 FADD F4 F4 F5 FMUL F6 F4 F5 FADD F1 F1 F6 ADD R2 R2 8 7 8 8 9 9 10 10 11 11 12 12 13 Argument 1 src1 cycle available C C Argument 2 Src2 cycle available dst Cycle dst reg dispatched 1 1 R2 R3 1 1 F2 F3 7 C C 2 3 F3 R2 R3 R2 R3 R4 6 6 7 C C C 6 6 7 F4 F2 F3 F5 F6 F4 F6 F1 R2 R3 R4 C Loop C C 8 8 C C 9 10 C 5 13 F2 F3 F4 F2 F3 F5 F6 F4 F6 F1 R2 Cycle written back to ROB 2 3 6 7 10 11 14 15 T26 T27 T28 ADD R3 R3 8 ADD R4 R4 1 BNEZ R4 loop 13 14 14 C C C 6 13 14 Loop R3 R4 Problem 3 1 D Identify the instructions along the longest latency path in completing this iteration of the loop up to instruction 13 Suppose we consider an instruction to have executed when its result is available in the ROB How many cycles does this iteration take to execute Problem 3 1 E Do you expect the same behavior i e the same dependencies and the same number of cycles for the next iteration You may use the slots from T15 onwards in the attached diagram for bookkeeping to answer this question Please give a simple reason why the behavior may repeat or identify a resource bottleneck or dependency that may preclude the repetition of the behavior Problem 3 1 F Can you improve the performance by adding at most one additional memory port and a FP Multiplier Explain briefly Yes No Problem 3 1 G What is the minimum number of cycles needed to execute a typical iteration of this loop if we keep the same latencies for all the units but are allowed to use as many FUs and memory ports and are allowed to fetch and commit as many instructions as we want 7 Problem 3 2 Register Renaming and Static vs Dynamic Scheduling The following MIPS code calculates the floating point expression E A B C D where the addresses of A B C D and E are stored in R1 R2 R3 R4 and R5 respectively L S L S MUL S L S L S MUL S ADD S S S F0 F1 F0 F2 F3 F2 F0 F0 0 R1 …


View Full Document

Berkeley COMPSCI 152 - Complex Pipelines Problem Set 3

Documents in this Course
Quiz 5

Quiz 5

9 pages

Memory

Memory

29 pages

Quiz 5

Quiz 5

15 pages

Memory

Memory

29 pages

Memory

Memory

35 pages

Memory

Memory

15 pages

Quiz

Quiz

6 pages

Midterm 1

Midterm 1

20 pages

Quiz

Quiz

12 pages

Memory

Memory

33 pages

Quiz

Quiz

6 pages

Homework

Homework

19 pages

Quiz

Quiz

5 pages

Memory

Memory

15 pages

Load more
Loading Unlocking...
Login

Join to view Complex Pipelines Problem Set 3 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Complex Pipelines Problem Set 3 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?