DOC PREVIEW
U of I CS 433 - Homework 2

This preview shows page 1-2 out of 7 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS433G: Computer System Organization, Spring 2007 Homework 2 Due on 4:30pm, Friday, May. 4th On-line students: Please send your solutions to [email protected] by due time. PDF is preferred. In your email please indicate alias you would like to use. On-campus students: Please print this page and attach to your solutions. Write down your name, NetID and alias. Hand in your solutions to Molly Fresner at 4120. Everyone: You must read and obey course policies. Policies can be found at: http://www.cs.uiuc.edu/class/sp07/cs433g/policies.html Name: NetID: Alias: Problem Max Points 1a. 10 1b. 10 1c. 20 2. 40 3. 20 Total: 1001. Dynamic Branch Prediction Consider the following MIPS code fragment. MOV R0, #0 DADDI R1, R0, #2 LOOP1: DADDI R12, R0, #4 LOOP2: DSUBI R12, R12, #1 BNEZ R12, LOOP2 ; Branch 1 DSUBI R1, R1, #1 BNEZ R1, LOOP1 ; Branch 2 a) [10 points] Assume that 1 bit branch predictors are used. When the processor starts to execute the aforementioned code, both predictors contain value N (not taken). What is the number of correct predictions? Use the following tables to record the prediction and action of each branch. Branch1 Step Prediction Action1 N T 2 3 4 5 6 7 8 # Correct Predictions Branch2 Step Prediction Action 1 N T 2 # Correct Predictionsb) [10 points] Now assume 4-bit saturation counters are used. When the processor starts to execute the above code, both counters contain the value 7. Use the following tables to record the prediction and action of each branch. Branch1 Step Counter Prediction Action 1 0111 N T 2 3 4 5 6 7 8 # Correct Predictions Branch2 Step Counter Prediction Action 1 0111 N T 2 # Correct Predictionsc) [20 points] Now assume that 2 level correlating predictors of the form (2,1) are used. When the processor starts to execute the above code, the outcome of the previous two branches is not taken (N). Also, assume the initial state of predictors of all branches is not taken (N). Use the following tables to record your steps. Record the “New State” of predictors in the form W/X/Y/Z where • W - state corresponds to the case where the previous two branches are both taken (T) • X - state corresponds to the case where the last branch is taken (T) and the branch before the last is not taken (N) • Y - state corresponds to the case where the last branch is not taken (N) and the branch before the last is taken (T) • Z - state corresponds to the case where the previous two branches are both not taken (N) In the examples above ‘new state’ is considered the global history state, where only one position can be as T (True). It is possible (and it makes more sense) to interpret the ‘new state’ as a branch predictor state. In this case positions decode the predicted outcome (Taken) for a specific branching history. If you prefer such approach you can fill in the following tables: Branch1 Step Prediction Action New State 1 N T N/T/N/N 2 3 4 5 6 7 8 # Correct Predictions Branch2 Step Prediction Action New State 1 N T N/T/N/N 2 # Correct Predictions Branch1 Step Prediction Action New State 1 N T N/N/N/T 2 3 4 5 6 7 8 # Correct Predictions Branch2 Step Prediction Action New State 1 N T N/N/T/N 2 # Correct Predictions2. [40 points] Tomasulo’s Algorithm This exercise examines Tomasulo’s algorithm on a simple loop operation. Consider the following code fragment: LOOP: L.D F2, 0(R1) DIV.D F4, F2, F0 L.D F6, 0(R2) ADD.D F2, F2, F6 DIV.D F6, F4, F2 S.D F6, 16(R1) DADDI R2, R2, #8 DADDI R1, R1, #32 BNEZ R1, LOOP Running on a system with the following specifications: The pipeline functional units are described by Table 1. Table 1 Functional Unit Cycles in EX # Functional Units #Reservation Stations Integer 1 1 5 FP add/substract 6 1 4 FP mul/div 12 2 4 • Functional units are not pipelined. • All stages except EX take one cycle to complete. • There is no forwarding between functional units. Both integer and floating point results are communicated through the CDB. • Memory accesses use the integer functional unit (and therefore integer reservation station) to perform effective address calculation. All loads and stores access memory during the EX stage. • There are unlimited load/store buffers and an infinite instruction queue. • Loads and stores take one cycle to execute. • If an instruction is in the WR stage in cycle x, then an instruction that is waiting on the same functional unit (due to a structural hazard) can begin execution in cycle x + 1. • Only one instruction can write to the CDB in a clock cycle. • Branches and stores do not need the CDB. • Whenever there is a conflict for a functional unit or the CDB, assume program order. • When an instruction is done executing in its functional unit and is waiting for the CDB, it is still occupying the functional unit and its reservation station (meaning no other instruction may enter). • The BNEZ instruction takes 0 cycles. • Initially, R1 < -32. Continued on the next page…Fill in the execution profile for the first two iterations of the above code fragment in Table 2, including: • The reservation station used by each instruction. This should include both the functional unit type and the number of the reservation station. If multiple reservation stations of a particular type are available, associate early program order with lower cardinality. • The cycles that each instruction occupies in the IS, EX, and WR stages. • Comments to justify your answer such as type of hazards and the registers involved. Table 2 Instruction Reservation Station IS EX WR Comments (if appropriate) L.D F2, 0(R1) Integer 1 1 2 3 DIV.D F4, F2, F0 L.D F6, 0(R2) ADD.D F2, F2, F6 DIV.D F6, F4, F2 S.D F6, 16(R1) DADDI R2, R2, #8 DADDI R1, R1, #32 L.D F2, 0(R1) DIV.D F4, F2, F0 L.D F6, 0(R2) ADD.D F2, F2, F6 DIV.D F6, F4, F2 S.D F6, 16(R1) DADDI R2, R2, #8 DADDI R1, R1, #323. [20 points] Memory hierarchy. Assume memory hierarchy with 4 levels. Local miss rates for each level are: • Level1 – 1% • Level2 – 1% • Level3 – 1% Hit time for each level is the following: • Level1 – 1ns • Level2 – 2ns •


View Full Document

U of I CS 433 - Homework 2

Download Homework 2
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Homework 2 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Homework 2 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?