CSUN COMP 546 - INSTRUCTION PIPELINING (I) - D208435

Home> Schools> California State University, Northridge> Computer Science (COMP) > COMP 546> INSTRUCTION PIPELINING (I)

DOC PREVIEW

CSUN COMP 546 - INSTRUCTION PIPELINING (I)

School name California State University, Northridge

Course Comp 546-

Pages 4

This preview shows page 1 out of 4 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 4 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 4 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Datorarkitektur Fö 3 - 1Petru Eles, IDA, LiTHINSTRUCTION PIPELINING (I)1. The Instruction Cycle2. Instruction Pipelining3. Pipeline Hazards4. Structural Hazards5. Data Hazards6. Control HazardsDatorarkitektur Fö 3 - 2Petru Eles, IDA, LiTHThe Instruction CycleFetchinstructionDecodeFetchoperandExecuteinstructionFIDI- Calculate operand address (CO)- Fetch operand (FO)- Execute instruction (EI)- Write back operand (WO)Datorarkitektur Fö 3 - 3Petru Eles, IDA, LiTHInstruction Pipelining• Instruction execution is extremely complex andinvolves several operations which are executedsuccessively (see slide 2). This implies a largeamount of hardware,but only one part of thishardware works at a given moment.• Pipelining is an implementation technique wherebymultiple instructions are overlapped in execution.This is solved without additional hardware but onlyby letting different parts of the hardware work fordifferent instructions at the same time.• The pipeline organization of a CPU is similar to anassembly line: the work to be done in an instructionis broken into smaller steps (pieces), each of whichtakes a fraction of the time needed to complete theentire instruction. Each of these steps is apipestage (or apipe segment).• Pipe stages are connected to form a pipe:• The time required for moving an instruction fromone stage to the next: amachine cycle(often this isone clock cycle). The execution of one instructiontakes several machine cycles as it passes throughthe pipeline.Stage 1 Stage 2 Stage nDatorarkitektur Fö 3 - 4Petru Eles, IDA, LiTHAcceleration by PipeliningTwo stage pipeline: FI: fetch instructionEI: execute instructionWe consider that each instruction takes execution timeTex.Execution time for the 7 instructions, with pipelining:(Tex/2)*8= 4*TexFIEIFIEIFIEIFIEIFIEIFIEIFIEI12 834567Clock cycle →Instr. iInstr. i+1Instr. i+2Instr. i+3Instr. i+4Instr. i+5Instr. i+6Datorarkitektur Fö 3 - 5Petru Eles, IDA, LiTHAcceleration by Pipelining (cont’d)Six stage pipeline (see also slide 2):FI: fetch instruction FO: fetch operandDI: decode instruction EI: execute instructionCO: calculate operand address WO:write operandExecution time for the 7 instructions, with pipelining:(Tex/6)*12= 2*Tex• After a certain time (N-1 cycles) all the N stages ofthe pipeline are working: the pipeline is filled. Now,theoretically, the pipeline works providing maximalparallelism (N instructions are active simultaneously).FI DI12 834567Clock cycle →Instr. iInstr. i+1Instr. i+2Instr. i+3Instr. i+4Instr. i+5Instr. i+6COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WO9 101112Datorarkitektur Fö 3 - 6Petru Eles, IDA, LiTHAcceleration by Pipelining (cont’d)• Apparently a greater number of stages alwaysprovides better performance. However:- a greater number of stages increases the over-head in moving information between stagesand synchronization between stages.- with the number of stages the complexity of theCPU grows.- it is difficult to keep a large pipeline at maximumrate because ofpipeline hazards.80486 and Pentium: five-stage pipeline for integer instr.eight-stage pipeline for FP instr.PowerPC: four-stage pipeline for integer instr.six-stage pipeline for FP instr.Datorarkitektur Fö 3 - 7Petru Eles, IDA, LiTHPipeline Hazards• Pipeline hazards are situations that prevent thenext instruction in the instruction stream fromexecuting during its designated clock cycle. Theinstruction is said to bestalled. When an instructionis stalled, all instructions later in the pipeline thanthe stalled instruction are also stalled. Instructionsearlier than the stalled one can continue. No newinstructions are fetched during the stall.•Types of hazards:1. Structural hazards2. Data hazards3. Control hazardsDatorarkitektur Fö 3 - 8Petru Eles, IDA, LiTHStructural Hazards• Structural hazards occur when a certain resource(memory, functional unit) is requested by more thanone instruction at the same time.Instruction ADD R4,X fetches in the FO stage operand Xfrom memory. The memory doesn’t accept anotheraccess during that cycle.Penalty: 1 cycle• Certain resources are duplicated in order to avoidstructural hazards. Functional units (ALU, FP unit)can be pipelined themselves in order to supportseveral instructions at a time. A classical way toavoid hazards at memory access is by providingseparate data and instruction caches.FI DI12 834567Clock cycle →ADD R4,XInstr. i+1Instr. i+2Instr. i+3Instr. i+4COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WOFI DI COFO EI WO9 101112stallDatorarkitektur Fö 3 - 9Petru Eles, IDA, LiTHData Hazards• We have two instructions, I1 and I2. In a pipelinethe execution of I2 can start before I1 hasterminated. If in a certain stage of the pipeline, I2needs the result produced by I1, but this result hasnot yet been generated, we have a data hazard.I1: MUL R2,R3 R2 ← R2 * R3I2: ADD R1,R2 R1 ← R1 + R2Before executing its FO stage, the ADD instruction isstalled until the MUL instruction has written the resultinto R2.Penalty: 2 cyclesFI DI12 834567Clock cycle →MUL R2,R3ADD R1,R2Instr. i+2COFO EI WOFI DI COFO EI WOFI DI COFO EI WO9 101112stallstallDatorarkitektur Fö 3 - 10Petru Eles, IDA, LiTHData Hazards (cont’d)• Some of the penalty produced by data hazards canbe avoided using a technique calledforwarding(bypassing).• The ALU result is always fed back to the ALU input.If the hardware detects that the value needed for thecurrent operation is the one produced by the previousoperation (but which has not yet been written back)it selects the forwarded result as the ALU input, in-stead of the value read from register or memory.After the EI stage of the MUL instruction the result is avail-able by forwarding. The penalty is reduced to one cycle.MUX MUXALUto register or memoryfrom reg-ister ormemoryfrom reg-ister ormemorybypasspathbypasspathFI DI12 834567Clock cycle →MUL R2,R3ADD R1,R2COFO EI WOFI DI CO9 101112stallFO EI WODatorarkitektur Fö 3 - 11Petru Eles, IDA, LiTHControl Hazards• Control hazards are produced by branchinstructions.Unconditional branch- - - - - - - - - - - - - -BR TARGET- - - - - - - - - - - - - -TARGET - - - - - - - - - - - - - -Penalty: 3 cyclesFI DI12 834567Clock cycle →BR TARGETtargettarget+1COFO EI WOFIFI DI COFO EI WO9 101112stall stallFI DI COFO EI WOThe instruction following thebranch is fetched; beforethe DI is finished it is notknown that a branch is exe-cuted. Later the fetched

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1 out of 4 pages.

CSUN COMP 546 - INSTRUCTION PIPELINING (I)

Sign up for free to view:

Please select your school