Pipelining
September 18, 2002
CS 740 F ’02

Topics - 1
• Objective
• Instruction formats
• Instruction processing
• Principles of pipelining
• Inserting pipe registers

Topics - 2
• Data Hazards
  – Stalling and Forwarding
  – Systematic testing of hazard-handling logic
• Control Hazards
  – Stalling, Predict not taken
• Exceptions
• Multicycle Instructions

Objective
Design Processor for Alpha Subset
• Interesting but not overwhelming quantity
• High level functional blocks
Initial Design
• One instruction at a time
• Single cycle per instruction
  – Follows H&P Ch. A.1
Refined Design
• 5-stage pipeline
  – Similar to early RISC processors
  – Follows H&P Ch. A.2-3
• Goal: approach 1 cycle per instruction but with shorter cycle time
What Makes it Hard
• Hazards, exceptions (A.4-6)

Alpha Arithmetic Instructions
RR-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct rb

  Bits:   31-26  25-21  20-16  15-13  12  11-5   4-0
  Field:  Op     ra     rb     000    0   funct  rc

RI-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct ib

  Bits:   31-26  25-21  20-13  12  11-5   4-0
  Field:  Op     ra     ib     1   funct  rc

Encoding
• ib is 8-bit unsigned literal

  Operation  Op field  funct field
  addq       0x10      0x20
  subq       0x10      0x29
  bis        0x11      0x20
  xor        0x11      0x40
  cmoveq     0x11      0x24
  cmplt      0x11      0x4D

Alpha Load/Store Instructions
Load: Ra <-- Mem[Rb + offset]
Store: Mem[Rb + offset] <-- Ra

  Bits:   31-26  25-21  20-16  15-0
  Field:  Op     ra     rb     offset

Encoding
• offset is 16-bit signed offset

  Operation  Op field
  ldq        0x29
  stq        0x2D

Branch Instructions
Cond. Branch: PC <-- Cond(Ra) ? PC + 4 + disp*4 : PC + 4

  Bits:   31-26  25-21  20-0
  Field:  Op     ra     disp

Encoding
• disp is 21-bit signed displacement

  Operation  Op field  Cond
  beq        0x39      Ra == 0
  bne        0x3D      Ra != 0

Branch [Subroutine] (br, bsr): Ra <-- PC + 4; PC <-- PC + 4 + disp*4

  Bits:   31-26  25-21  20-0
  Field:  Op     ra     disp

  Operation  Op field
  br         0x30
  bsr        0x34

Transfers of Control
jmp, jsr, ret: Ra <-- PC+4; PC <-- Rb

  Bits:   31-26  25-21  20-16  15-0
  Field:  0x1A   ra     rb     Hint

Encoding
• High order 2 bits of Hint encode jump type
• Remaining bits give information about predicted destination
• Hint does not affect functionality

  Jump Type  Hint 15:14
  jmp        00
  jsr        01
  ret        10

call_pal

  Bits:   31-26  25-0
  Field:  0x00   Number

• Use as halt instruction

Instruction Encoding
Object Code
• Instructions encoded in 32-bit words
• Program behavior determined by bit encodings
• Disassembler simply converts these words to readable instructions

  0x0:  40220403  addq r1, r2, r3
  0x4:  4487f805  xor r4, 0x3f, r5
  0x8:  a4c70abc  ldq r6, 2748(r7)
  0xc:  b5090123  stq r8, 291(r9)
  0x10: e47ffffb  beq r3, 0
  0x14: d35ffffa  bsr r26, 0(r31)
  0x18: 6bfa8001  ret r31, (r26), 1
  0x1c: 000abcde  call_pal 0xabcde

Decoding Examples
0x0: 40220403  addq r1, r2, r3

  Hex:     4    0    2    2    0    4    0    3
  Binary:  0100 0000 0010 0010 0000 0100 0000 0011
  Fields:  Op = 10, ra = 01, rb = 02, funct = 20, rc = 03

0x8: a4c70abc  ldq r6, 2748(r7)

  Hex:     a    4    c    7    0    a    b    c
  Binary:  1010 0100 1100 0111 0000 1010 1011 1100
  Fields:  Op = 29, ra = 06, rb = 07, offset = 0abc = 2748 (decimal)

0x10: e47ffffb  beq r3, 0

  Hex:     e    4    7    f    f    f    f    b
  Binary:  1110 0100 0111 1111 1111 1111 1111 1011
  Fields:  Op = 39, ra = 03, disp = 1ffffb = -5 (decimal)

  Target = 16        # Current PC
         + 4         # Increment
         + 4 * -5    # Disp
         = 0

0x18: 6bfa8001  ret r31, (r26), 1

  Hex:     6    b    f    a    8    0    0    1
  Binary:  0110 1011 1111 1010 1000 0000 0000 0001
  Fields:  Op = 1a, ra = 1f = 31 (decimal), rb = 1a = 26 (decimal), Hint 15:14 = 2 (ret)

Datapath
• IF: instruction fetch
• ID: instruction decode/register fetch
• EX: execute/address calculation
• MEM: memory access
• WB: write back
[Figure: five-stage datapath. Labels: PC, Instr. Mem., Reg. Array, ALU, +4, Data Mem., Xtnd, Xtnd << 2, Zero Test, IncrPC, Instr.]
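The field layouts and decoding examples above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the helper names `bits`, `sign_extend`, `decode`, and `branch_target` are hypothetical, and the opcode values are taken from the encoding tables in this deck.

```python
def bits(word, hi, lo):
    """Return the inclusive bit range word[hi:lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value, width):
    """Interpret a width-bit field as a two's-complement signed integer."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def decode(word):
    """Split a 32-bit instruction word into the fields used in this deck."""
    op = bits(word, 31, 26)
    if op in (0x10, 0x11):  # RR-type / RI-type arithmetic
        fields = {"op": op, "ra": bits(word, 25, 21),
                  "funct": bits(word, 11, 5), "rc": bits(word, 4, 0)}
        if bits(word, 12, 12):  # bit 12 set: RI-type, 8-bit unsigned literal
            fields["ib"] = bits(word, 20, 13)
        else:                   # bit 12 clear: RR-type, second register
            fields["rb"] = bits(word, 20, 16)
        return fields
    if op in (0x29, 0x2D):  # ldq / stq: 16-bit signed offset
        return {"op": op, "ra": bits(word, 25, 21), "rb": bits(word, 20, 16),
                "offset": sign_extend(bits(word, 15, 0), 16)}
    if op in (0x39, 0x3D, 0x30, 0x34):  # beq / bne / br / bsr: 21-bit signed disp
        return {"op": op, "ra": bits(word, 25, 21),
                "disp": sign_extend(bits(word, 20, 0), 21)}
    if op == 0x1A:  # jmp / jsr / ret
        return {"op": op, "ra": bits(word, 25, 21), "rb": bits(word, 20, 16),
                "hint": bits(word, 15, 0)}
    if op == 0x00:  # call_pal
        return {"op": op, "number": bits(word, 25, 0)}
    raise ValueError(f"opcode {op:#x} is outside the subset covered here")


def branch_target(pc, disp):
    """Taken-branch target: PC + 4 + disp*4."""
    return pc + 4 + 4 * disp
```

For example, `decode(0xe47ffffb)` yields a displacement of -5, and `branch_target(0x10, -5)` reproduces the target computed in the beq example.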
Hardware Units
Storage
• Instruction Memory
  – Fetch 32-bit instructions
• Data Memory
  – Load / store 64-bit data
• Register Array
  – Storage for 32 integer registers
  – Two read ports: can read two registers at once
  – Single write port
Functional Units
• +4: PC incrementer
• Xtnd: Sign extender
• ALU: Arithmetic and logical instructions
• Zero Test: Detect whether operand == 0

RR-type instructions
RR-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct rb

  Bits:   31-26  25-21  20-16  15-13  12  11-5   4-0
  Field:  Op     ra     rb     000    0   funct  rc

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• A <-- Register[IR[25:21]]
• B <-- Register[IR[20:16]]
EX: Execute
• ALUOutput <-- A op B
MEM: Memory
• nop
WB: Write back
• Register[IR[4:0]] <-- ALUOutput

Active Datapath for RR & RI
ALU Operation
• Input B selected according to instruction type
  – datB for RR, IR[20:13] for RI
• ALU function set according to operation type
Write Back
• To Rc
[Figure: active datapath for RR and RI instructions. Labels: PC, Instr. Mem., Reg. Array, ALU, +4, Data Mem., IncrPC, Instr; instruction fields 25:21, 20:16, 20:13, 4:0 (Wdest).]
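The five steps for RR and RI instructions can be traced in software. The sketch below is an illustration only, assuming a register file held in a Python list and an instruction memory held in a dict keyed by byte address; the names `ALU_OPS`, `MASK64`, and `step_arith` are hypothetical. It covers only addq, subq, bis, and xor, with codes taken from the encoding table.

```python
MASK64 = (1 << 64) - 1  # registers hold 64-bit values

# (Op field, funct field) -> ALU function, for a few operations from the table
ALU_OPS = {
    (0x10, 0x20): lambda a, b: (a + b) & MASK64,  # addq
    (0x10, 0x29): lambda a, b: (a - b) & MASK64,  # subq
    (0x11, 0x20): lambda a, b: a | b,             # bis
    (0x11, 0x40): lambda a, b: a ^ b,             # xor
}


def step_arith(pc, regs, imem):
    """Run one RR- or RI-type instruction through IF, ID, EX, MEM, WB.

    regs is a list of 32 integers, imem maps byte address -> 32-bit word.
    Returns the next PC; regs is updated in place.
    """
    # IF: IR <-- IMemory[PC]; PC <-- PC + 4
    ir = imem[pc]
    next_pc = pc + 4

    # ID: A <-- Register[IR[25:21]]; B comes from a register or the literal
    a = regs[(ir >> 21) & 0x1F]
    if (ir >> 12) & 1:
        b = (ir >> 13) & 0xFF          # RI: B <-- IR[20:13]
    else:
        b = regs[(ir >> 16) & 0x1F]    # RR: B <-- Register[IR[20:16]]

    # EX: ALUOutput <-- A op B, with op selected by the Op and funct fields
    op = (ir >> 26) & 0x3F
    funct = (ir >> 5) & 0x7F
    alu_output = ALU_OPS[(op, funct)](a, b)

    # MEM: nop

    # WB: Register[IR[4:0]] <-- ALUOutput
    regs[ir & 0x1F] = alu_output
    return next_pc
```

Running it on the first two words of the object-code listing (`0x40220403`, then `0x4487f805`) exercises the RR path and the RI path in turn. Selecting B during decode mirrors the note that input B depends on the instruction type.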
RI-type instructions
RI-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct ib

  Bits:   31-26  25-21  20-13  12  11-5   4-0
  Field:  Op     ra     ib     1   funct  rc

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• A <-- Register[IR[25:21]]
• B <-- IR[20:13]
EX: Execute
• ALUOutput <-- A op B
MEM: Memory
• nop
WB: Write back
• Register[IR[4:0]] <-- ALUOutput

Load instruction
Load: Ra <-- Mem[Rb + offset]

  Bits:   31-26  25-21  20-16  15-0
  Field:  Op     ra     rb     offset

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• B <-- Register[IR[20:16]]
EX: Execute
• ALUOutput <-- B + SignExtend(IR[15:0])
MEM: Memory
• Mem-Data <-- DMemory[ALUOutput]
WB: Write back
• Register[IR[25:21]] <-- Mem-Data

Active Datapath for Load & Store
ALU Operation
• Used to compute address
  – A input set to extended IR[15:0]
• ALU function set to add
Memory Operation
• Read for load, write for store
Write Back
• To Ra for load
• None for store
[Figure: active datapath for load and store instructions, with separate Load and Store paths marked. Labels: PC, Instr. Mem., Reg. Array, ALU, +4, Data Mem., Xtnd, IncrPC, Instr; instruction fields 25:21, 20:16, 15:0.]

Store instruction
IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• A
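The load steps and the Load & Store datapath notes can be sketched the same way. This is a minimal illustration under stated assumptions: data memory is a dict keyed by byte address, unwritten locations read as 0, and the store path follows the summary "Mem[Rb + offset] <-- Ra" with no write back. The names `sign_extend16` and `step_mem` are hypothetical.

```python
MASK64 = (1 << 64) - 1  # addresses and data are treated as 64-bit values

LDQ, STQ = 0x29, 0x2D   # Op field values from the load/store encoding table


def sign_extend16(value):
    """Interpret a 16-bit field as a two's-complement signed integer."""
    return (value ^ 0x8000) - 0x8000


def step_mem(pc, regs, imem, dmem):
    """Run one ldq or stq through IF, ID, EX, MEM, WB and return the next PC.

    regs is a list of 32 integers; imem and dmem map byte address -> value.
    """
    # IF: IR <-- IMemory[PC]; PC <-- PC + 4
    ir = imem[pc]
    next_pc = pc + 4

    # ID: B <-- Register[IR[20:16]]
    op = (ir >> 26) & 0x3F
    ra = (ir >> 21) & 0x1F
    b = regs[(ir >> 16) & 0x1F]

    # EX: ALUOutput <-- B + SignExtend(IR[15:0])
    alu_output = (b + sign_extend16(ir & 0xFFFF)) & MASK64

    if op == LDQ:
        # MEM: Mem-Data <-- DMemory[ALUOutput] (assumption: default 0)
        mem_data = dmem.get(alu_output, 0)
        # WB: Register[IR[25:21]] <-- Mem-Data
        regs[ra] = mem_data
    elif op == STQ:
        # MEM: DMemory[ALUOutput] <-- Ra; no write back for a store
        dmem[alu_output] = regs[ra]
    else:
        raise ValueError(f"opcode {op:#x} is not ldq or stq")
    return next_pc
```

With r7 = 100, the listing's `ldq r6, 2748(r7)` word (`0xa4c70abc`) reads address 2848; with r9 = 9, `stq r8, 291(r9)` (`0xb5090123`) writes address 300.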