Berkeley COMPSCI 252 - Lec 4 – Issues in Basic Pipelines (stalls, exceptions, branch prediction)

This preview shows page 1-2-3-4-25-26-27-52-53-54-55 out of 55 pages.



EECS 252 Graduate Computer Architecture
Lec 4: Issues in Basic Pipelines (stalls, exceptions, branch prediction)
David Culler
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www.eecs.berkeley.edu/~culler
http://www-inst.eecs.berkeley.edu/~cs252
1/27/2005, CS252S05 L4 Pipe Issues

Debate Review
- ISA: a critical interface
- Extremely well-defined abstraction
  [Diagram: software (Prog. Lang., Compiler, Operating Systems) / instruction set / hardware]
- Huge quantitative base of implementation costs and performance
- Convergence trend: enable optimizations, support HLL, OS support, contain complexity
- Properties of a good abstraction:
  - Lasts through many generations (portability)
  - Used in many different ways (generality)
  - Provides convenient functionality to higher levels
  - Permits an efficient implementation at lower levels
- Huge quantitative base of usage data for real applications, filtered through SOA compiler technology
- Lots of marketing: ignores, misuses, or selective use of established data
- Worse when we get beyond scalar operations
- Translation/interpretation boundary becoming less sharp

Discussion Exercise
- In terms of the iron triangle, what are the performance implications of condition codes?

Ordering Properties of basic inst. pipeline
[Diagram: instructions in program order against time in clock cycles (Cycle 1 to Cycle 7), each passing through Ifetch, Reg, ALU, DMem, Reg; regions labeled Issue, Execution window, Complete]
- Instructions issued in order
- Operand fetch is stage 2 => operands fetched in order
- Write back in stage 5 => no WAW, no WAR hazards
- Common pipeline flow => operands complete in order
- Stage changes only at "end of instruction"

What does forwarding do?
[Diagram: the following instructions, in program order, each passing through Ifetch, Reg, ALU, DMem, Reg against time in clock cycles]
    add r1, r2, r3
    sub r4, r1, r3
    and r6, r1, r7
    or  r8, r1, r9
    xor r10, r1, r4
- Destination register is a name for the instr's result
- Source registers are names for sources
- Forwarding logic builds the data dependence (flow) graph for instructions in the execution window

Control Pipeline
[Diagram: pipelined control and datapath; labels include Next PC, nPC, PC, PC1, PC2, PC3, I-fetch, IR, dcd, Registers, Ra, Rb, Rr, imed, Op A mux, Op B mux, byp, fwd ctrl, ALU, aop, mop, wop, D-Mem, MEM-res, op A-res, kill, brch]

Historical Perspective: Microprogramming
[Diagram: Main Memory holding the user program plus data (ADD, SUB, AND, ..., DATA); CPU containing control memory and execution unit; micro-sequencer control and datapath control. Each macro-instruction ("one of these") is mapped into a sequence of micro-instructions ("these").]
- Writable control store
- Supported complex instructions: a sequence of simple micro-inst (RTs)
- Pipelined micro-instruction processing, but very limited view
- Could not reorganize macroinstructions to enable pipelining

Multicycle stages
[Diagram: Datapath; Stage and Nxt Stage; Pipeline Control Reg; Stall]
- Stage microsequencer spits micro-ops into the pipe

Typical "simple" Pipeline
- Example: MIPS R4000
[Diagram: IF, ID, then parallel units, then MEM, WB:
  integer unit (ex);
  FP/int multiply (m1 m2 m3 m4 m5 m6 m7);
  FP adder (a1 a2 a3 a4);
  FP/int divider (Div: lat = 25, Init inv = 25)]

Branch prediction
- Datapath parallelism only useful if you can keep it fed.
- Easy to fetch multiple (consecutive) instructions per cycle, essentially speculating on sequential flow
- Jump: unconditional change of control flow
  - Always taken
- Branch: conditional change of control flow
  - Taken about 50% of the time
  - Backward: 30% x 80% taken
  - Forward: 70% x 40% taken
  - (0.30 x 0.80 + 0.70 x 0.40 = 0.52, consistent with "about 50%")

A Big Idea for Today
- Reactive: past actions cause system to adapt use
  - Do what you did before, better
  - Ex: caches, TCP windows, URL completion
- Proactive: uses past actions to predict future actions
  - Optimize speculatively, anticipate what you are about to do
  - Branch prediction, long cache blocks

Case for Branch Prediction when Issue N instructions per clock cycle
1. Branches will arrive up to n times faster in an n-issue processor
2. Amdahl's Law => relative impact of the control stalls will be larger with the lower potential CPI in an n-issue processor
Conversely, need branch prediction to "see" potential parallelism

Branch Prediction Schemes
- Static Branch Prediction
- 1-bit Branch-Prediction Buffer
- 2-bit Branch-Prediction Buffer
- Correlating Branch Prediction Buffer
- Tournament Branch Predictor
- Branch Target Buffer
- Integrated Instruction Fetch Units
- Return Address Predictors

Dynamic Branch Prediction
- Performance = f(accuracy, cost of misprediction)
- Branch History Table: lower bits of PC address index a table of 1-bit values
  - Says whether or not the branch was taken last time
  - No address check (saves HW, but may not be the right branch)
  - If inst == BR, update table with outcome
- Problem: in a loop, a 1-bit BHT will cause 2 mispredictions (avg is 9 iterations before exit):
  - End of loop case, when it exits instead of looping as before
  - First time through the loop on the next time through the code, when it predicts exit instead of looping
  - Only 80% accuracy even if the loop branch is taken 90% of the time
- Local history
  - This particular branch inst
  - Or one that maps into same lost PC

2-bit Dynamic Branch Prediction (J. Smith, 1981)
- 2-bit scheme where the prediction changes only if it mispredicts twice
[State diagram: two Predict Taken states and two Predict Not Taken states, with T (taken) and NT (not taken) transitions between them. Red: stop, not taken. Green: go, taken.]
- Adds hysteresis to the decision-making process
- Generalize to n-bit saturating counter

Consider 3 Scenarios
- Branch for loop test
- Check for error or exception
- Alternating taken / not-taken: example?
- Your worst-case prediction scenario
- How could HW predict "this loop will execute 3 times" using a simple mechanism?
[Slide annotations: predictors, Global history, taken]

Correlating Branches
- Idea: taken/not taken of recently executed branches is related to the behavior of the next branch (as well as the history of that branch's behavior)
  - Then the behavior of recent branches selects between, say, 4 predictions of the next branch, updating just that prediction
- (2,2) predictor: 2-bit global, 2-bit local
[Diagram: Branch address (4 bits) indexes 2-bits-per-branch local predictors; a 2-bit recent global branch history (01 = not taken then taken) selects the Prediction]

Accuracy of Different Schemes (Figure 3.15, p. 206)
[Figure not included in this preview]
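
The 1-bit versus 2-bit comparison in the slides above can be checked with a small simulation. This is a minimal sketch, not code from the lecture: it assumes a single branch with its own counter (no aliasing in the table), counters that start in the strongest "taken" state, and a loop branch that is taken 9 times and then falls through on exit. The names `simulate` and `loop_branch` are illustrative.

```python
def simulate(outcomes, bits):
    """Accuracy of one n-bit saturating counter on a sequence of outcomes.

    outcomes: list of bool, True meaning the branch was taken.
    bits:     counter width (1 = last-outcome predictor, 2 = Smith-style counter).
    """
    max_count = (1 << bits) - 1      # saturation ceiling
    threshold = 1 << (bits - 1)      # counter >= threshold predicts taken
    counter = max_count              # assumption: start strongly taken
    correct = 0
    for taken in outcomes:
        predicted_taken = counter >= threshold
        if predicted_taken == taken:
            correct += 1
        # Update after predicting: count up on taken, down on not taken.
        if taken:
            counter = min(counter + 1, max_count)
        else:
            counter = max(counter - 1, 0)
    return correct / len(outcomes)


def loop_branch(iterations_taken, visits):
    """Outcomes of a loop-closing branch: taken N times, then one exit, per visit."""
    return ([True] * iterations_taken + [False]) * visits


outcomes = loop_branch(9, 100)       # branch is taken 90% of the time
acc_1bit = simulate(outcomes, 1)
acc_2bit = simulate(outcomes, 2)
print(f"1-bit: {acc_1bit:.3f}  2-bit: {acc_2bit:.3f}")
```

Under these assumptions the 1-bit counter mispredicts both the loop exit and the first iteration of the next visit, which gives roughly the 80% figure quoted in the slides. The 2-bit counter mispredicts only the exit. Feeding in an alternating taken/not-taken sequence instead is a quick way to explore the third scenario from the "Consider 3 Scenarios" slide.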

