UCLA COMSCI M151B - Lecture11

This preview shows page 1-2 out of 6 pages.



Week 7 - Wednesday

4_12

Stalls and Performance

THE BIG PICTURE
  Mechanisms to correct hazards through hardware
    Stalls
      Reduce performance
      Maintain correctness
  Compiler can arrange code to avoid hazards and stalls
    Hurts portability
    Must know the hardware infrastructure beyond the ISA
      Understand the pipeline structure
  Will talk about the advantages of using hardware stalls or compilers
    Static vs. dynamic optimization
      Static - know what to do when there are hazards
      Dynamic - works according to the situation

Control Hazards
  Arise because instructions change the control flow
    Branches
    Jumps
    Procedure calls
    These all change the PC, which changes the flow of control of the program
  Fetching the next instruction depends on the branch outcome
    Pipeline can't always fetch the correct instruction
      Still working on the ID stage of the branch
  In MIPS pipeline
    Need to compare registers and compute the target early in the pipeline
    Add hardware to do it in the ID stage

Dealing With Branch Hazards
  Both hardware solutions and software solutions
  Hardware
    Just wait
      Stall until you know which direction the branch goes
      Exposes the depth of the pipeline, reducing performance by increasing CPI
    Allow pipeline to continue
      Make a prediction
        Guess which direction
        Then start executing the chosen path
        If it's right, you break dependencies (be prepared to undo any mistakes!)
      Penalty has two parts
        How often you are wrong
        How big the penalty is when you find out you are wrong
      Penalty for stalling is fixed
      Penalty for making predictions can be none but can also be larger than stalling
      Predictions are useful when you have reasonable accuracy
      Two types of prediction
        Static: guess based on instruction type
        Dynamic: guess based on execution history
      Prediction is more powerful
    Reduce the branch delay
      Difficult to reduce in the processor once we settle on a pipe depth

Stall on Branch
  Wait until the branch outcome is determined before fetching the next instruction
  Once the branch is resolved, instruction fetch can continue
  Problem: with every branch, you have to stall
    Large penalty
  Gets complicated with a more complex pipeline with multiple stages of fetch and multiple stages of decode

Branch Prediction
  More realistic and more dynamically-scheduled
  Longer pipelines can't readily determine the branch outcome early
    Stall penalty becomes unacceptable
  Predict the outcome of the branch
    Only stall if the prediction is wrong
      This is so that we can correct our mistakes
  How it works
    Guess which way the branch is going to go
    Continue to fetch down the predicted path and assume you're correct
    If wrong, fix things up
  In MIPS pipeline
    Start with static prediction
    Predict branches not taken
      Use PC+4 as the next address
      Fetch the instruction after the branch, no delay

MIPS with Predict Not Taken
  Prediction correct:
    add $4, $5, $6
    beq $1, $2, 40
    lw  $3, 300($0)
  Prediction incorrect:
    add $4, $5, $6
    beq $1, $2, 40
    STALL
    or  $7, $8, $9

More-Realistic Branch Prediction
  While Predict Not Taken is a good mechanism, if we assume branches are taken 50% of the time and not taken 50% of the time, we'll be wrong 50% of the time
    Need another strategy
  Static branch prediction
    Based on more sophisticated information, such as whether the branch is a forward branch or a backward branch
      If statement - forward branch
      Loop statements - backward branch
    Predict backward branches taken
    Predict forward branches not taken
  Dynamic branch prediction
    Maintain extra hardware structures
      Usually tables
      Indexed with the PC of the branch
    Figure out whether the branch will be taken or not from the actual branch behavior

Branch Hazards
  If the branch outcome is determined in MEM
    EX stage would compute whether the branch was taken or not
    MEM stage would write into the PC
    Why can't we write the PC during EX?
  Assume our instructions are like this:
    beq
    and
    or
    add
    lw
  Assume these instructions are loaded in the pipeline
    After cycle 4, beq will determine whether the branch was taken
    At the beginning of cycle 5, lw enters the pipeline
    Therefore, if we assumed incorrectly that the branch was not taken, we have to flush the next set of instructions until lw
  Flushing means we convert the instructions to nops
    Set the control values to 0; the instructions have only reached the execution stage
    They haven't affected the register file or memory because they haven't reached the MEM/WB stage

Reducing Branch Delay
  Move the hardware that determines the outcome to the ID stage
    Target address adder
    Register comparator
  If we could avoid the overhead, it would be much more effective
    Reduce the number of stages to flush to 2 instead of 3

4_13

Data Hazards for Branches
  If the branch depends on instructions before it and hasn't gotten the correct register values, it does not benefit from the existing forwarding hardware
  For example:
    add $1, $2, $3
    add $4, $5, $6
    ...
    beq $1, $4, target
  No existing forwarding logic feeds the ID stage; the existing forwarding paths are built around the ALU (EX) stage
  But if we waited until both instructions had executed and written back to the register file, it would be too late because the branch would already have been evaluated
  Instead of forwarding to the execution unit, we have to forward to the ID stage
    Forward the values from the MEM stage of the add that writes $1 and from the EX stage of the add that writes $4 to beq's ID stage
  If a comparison register is the destination of the preceding ALU instruction or of the 2nd preceding load instruction
    We need to stall for 1 cycle
    So that it can forward from MEM/WB of lw and EX/MEM of add
  For example:
    lw  $1, addr
    add $4, $5, $6
    beq $1, $4, target // We have to stall this instruction because its registers are
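The two parts of the prediction penalty discussed in the notes (how often the guess is wrong, and how much each wrong guess costs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lecture's own model: the names Branch, make_trace, count_mispredictions, and effective_cpi are hypothetical, the trace is a made-up loop with one forward and one backward branch, the dynamic predictor is assumed to be a simple last-outcome table indexed by PC, and the 20% branch fraction and base CPI of 1 are placeholder values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    pc: int       # address of the branch instruction
    target: int   # address the branch jumps to when taken
    taken: bool   # actual outcome of this dynamic instance


def predict_not_taken(branch, history):
    # Static: always guess "not taken", i.e. keep fetching from PC+4.
    return False


def predict_backward_taken(branch, history):
    # Static: backward branches (loops) taken, forward branches (ifs) not taken.
    return branch.target < branch.pc


def predict_last_outcome(branch, history):
    # Dynamic (assumed 1-bit scheme): a table indexed by the branch PC that
    # remembers the last outcome; unseen branches default to not taken.
    return history.get(branch.pc, False)


def count_mispredictions(trace, predictor):
    history = {}
    wrong = 0
    for branch in trace:
        if predictor(branch, history) != branch.taken:
            wrong += 1
        history[branch.pc] = branch.taken  # record the actual behavior
    return wrong


def effective_cpi(branch_fraction, mispredict_rate, penalty_cycles, base_cpi=1.0):
    # Penalty = (how often you are wrong) * (how much each mistake costs).
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def make_trace(iterations=10):
    # Hypothetical loop: a forward "if" branch taken every 4th iteration,
    # then a backward loop branch taken on every iteration except the last.
    trace = []
    for i in range(iterations):
        trace.append(Branch(pc=0x10, target=0x20, taken=(i % 4 == 0)))
        trace.append(Branch(pc=0x30, target=0x00, taken=(i < iterations - 1)))
    return trace


if __name__ == "__main__":
    trace = make_trace()
    predictors = {
        "predict not taken": predict_not_taken,
        "backward taken / forward not taken": predict_backward_taken,
        "last-outcome table": predict_last_outcome,
    }
    penalty = 3         # three instructions flushed when beq resolves in MEM
    branch_frac = 0.2   # assumed share of instructions that are branches
    print("stall on every branch:", effective_cpi(branch_frac, 1.0, penalty))
    for name, predictor in predictors.items():
        rate = count_mispredictions(trace, predictor) / len(trace)
        print(name, "->", effective_cpi(branch_frac, rate, penalty))
```

Stalling is modeled here as paying the full penalty on every branch, which is a simplification. Passing a smaller penalty_cycles value shows the effect of resolving the branch earlier in the pipeline, as in the Reducing Branch Delay section.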

