Basic Pipelining
September 20, 2000

  Objective
  Alpha Arithmetic Instructions
  Alpha Load/Store Instructions
  Branch Instructions
  Transfers of Control
  Instruction Encoding
  Decoding Examples
  Datapath
  Hardware Units
  RR-type instructions
  Active Datapath for RR & RI
  RI-type instructions
  Load instruction
  Active Datapath for Load & Store
  Store instruction
  Branch on equal
  Active Datapath for Branch and BSR
  Branch to Subroutine
  Jump
  Active Datapath for Jumps
  Complete Datapath
  Pipelining Basics
  3 Stage Pipelining
  Limitation: Nonuniform Pipelining
  Limitation: Deep Pipelines
  Limitation: Sequential Dependencies
  Pipelined datapath
  Pipeline Structure
  Pipe Register
  Pipeline Stage
  Alpha Simulator
  Simulator ALU Example
  Simulator Store/Load Examples
  Simulator Branch Examples
  Data Hazards in Alpha Pipeline
  Simulator Data Hazard Example
  Control Hazards in Alpha Pipeline
  Branch Example
  Conclusions

Topics
• Objective
• Instruction formats
• Instruction processing
• Principles of pipelining
• Inserting pipe registers

CS 740 F ’00

Objective

Design Processor for Alpha Subset
• Interesting but not overwhelming quantity
• High level functional blocks

Initial Design
• One instruction at a time
• Single cycle per instruction
  – Follows H&P Ch. 3.1 (Chs. 5.1--5.3 in undergrad version of text)

Refined Design
• 5-stage pipeline
  – Similar to early RISC processors
  – Follows H&P Ch. 3.2 (Chs. 6.1--6.7 in undergrad version of text)
• Goal: approach 1 cycle per instruction but with shorter cycle time

Alpha Arithmetic Instructions

RR-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct rb

  Bits:   31-26  25-21  20-16  15-13  12  11-5   4-0
  Field:  Op     ra     rb     000    0   funct  rc

RI-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct ib

  Bits:   31-26  25-21  20-13  12  11-5   4-0
  Field:  Op     ra     ib     1   funct  rc

Encoding
• ib is 8-bit unsigned literal

  Operation  Op field  funct field
  addq       0x10      0x20
  subq       0x10      0x29
  bis        0x11      0x20
  xor        0x11      0x40
  cmoveq     0x11      0x24
  cmplt      0x11      0x4D

Alpha Load/Store Instructions

Load: Ra <-- Mem[Rb + offset]
Store: Mem[Rb + offset] <-- Ra

  Bits:   31-26  25-21  20-16  15-0
  Field:  Op     ra     rb     offset

Encoding
• offset is 16-bit signed offset

  Operation  Op field
  ldq        0x29
  stq        0x2D

Branch Instructions

Cond. Branch: PC <-- Cond(Ra) ? PC + 4 + disp*4 : PC + 4

  Bits:   31-26  25-21  20-0
  Field:  Op     ra     disp

Encoding
• disp is 21-bit signed displacement

  Operation  Op field  Cond
  beq        0x39      Ra == 0
  bne        0x3D      Ra != 0

Branch [Subroutine] (br, bsr): Ra <-- PC + 4; PC <-- PC + 4 + disp*4

  Bits:   31-26  25-21  20-0
  Field:  Op     ra     disp

  Operation  Op field
  br         0x30
  bsr        0x34

Transfers of Control

jmp, jsr, ret: Ra <-- PC+4; PC <-- Rb

  Bits:   31-26  25-21  20-16  15-0
  Field:  0x1A   ra     rb     Hint

Encoding
• High order 2 bits of Hint encode jump type
• Remaining bits give information about predicted destination
• Hint does not affect functionality

  Jump Type  Hint 15:14
  jmp        00
  jsr        01
  ret        10

call_pal

  Bits:   31-26  25-0
  Field:  0x00   Number

• Use as halt instruction

Instruction Encoding

Object Code
• Instructions encoded in 32-bit words
• Program behavior determined by bit encodings
• Disassembler simply converts these words to readable instructions

  0x0:  40220403  addq r1, r2, r3
  0x4:  4487f805  xor r4, 0x3f, r5
  0x8:  a4c70abc  ldq r6, 2748(r7)
  0xc:  b5090123  stq r8, 291(r9)
  0x10: e47ffffb  beq r3, 0
  0x14: d35ffffa  bsr r26, 0(r31)
  0x18: 6bfa8001  ret r31, (r26), 1
  0x1c: 000abcde  call_pal 0xabcde

Decoding Examples

  0x0: 40220403  addq r1, r2, r3

  Hex:     4     0     2     2     0     4     0     3
  Binary:  0100  0000  0010  0010  0000  0100  0000  0011
  Fields:  Op = 10, ra = 01, rb = 02, funct = 20, rc = 03

  0x8: a4c70abc  ldq r6, 2748(r7)

  Hex:     a     4     c     7     0     a     b     c
  Binary:  1010  0100  1100  0111  0000  1010  1011  1100
  Fields:  Op = 29, ra = 06, rb = 07, offset = 0abc = 2748 (decimal)

  0x10: e47ffffb  beq r3, 0

  Hex:     e     4     7     f     f     f     f     b
  Binary:  1110  0100  0111  1111  1111  1111  1111  1011
  Fields:  Op = 39, ra = 03, disp = 1ffffb = -5 (decimal)

  Target =  16       # Current PC
          +  4       # Increment
          +  4 * -5  # Disp
          =  0

  0x18: 6bfa8001  ret r31, (r26), 1

  Hex:     6     b     f     a     8     0     0     1
  Binary:  0110  1011  1111  1010  1000  0000  0000  0001
  Fields:  Op = 1a, ra = 1f = 31 (decimal), rb = 1a = 26 (decimal), Hint 15:14 = 2

Datapath

  IF   instruction fetch
  ID   instruction decode/register fetch
  EX   execute/address calculation
  MEM  memory access
  WB   write back

[Figure: datapath diagram. Labeled blocks: PC, Instr. Mem., Reg. Array (regA, regB, regW, datW, datA, datB), ALU (aluA, aluB), +4 (IncrPC), Xtnd, Xtnd << 2, Zero Test, Data Mem. (datIn, addr, datOut). Labeled instruction fields: 25:21, 20:16, 20:13, 15:0, 20:0, 4:0. Write-back signals: Wdest, Wdata.]

Hardware Units

Storage
• Instruction Memory
  – Fetch 32-bit instructions
• Data Memory
  – Load / store 64-bit data
• Register Array
  – Storage for 32 integer registers
  – Two read ports: can read two registers at once
  – Single write port

Functional Units
• +4         PC incrementer
• Xtnd       Sign extender
• ALU        Arithmetic and logical instructions
• Zero Test  Detect whether operand == 0

RR-type instructions

RR-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct rb

  Bits:   31-26  25-21  20-16  15-13  12  11-5   4-0
  Field:  Op     ra     rb     000    0   funct  rc

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• A <-- Register[IR[25:21]]
• B <-- Register[IR[20:16]]
Ex: Execute
• ALUOutput <-- A op B
MEM: Memory
• nop
WB: Write back
• Register[IR[4:0]] <-- ALUOutput

Active Datapath for RR & RI

ALU Operation
• Input B selected according to instruction type
  – datB for RR, IR[20:13] for RI
• ALU function set according to operation type

Write Back
• To Rc

[Figure: datapath diagram with the RR/RI paths active. Labeled blocks: PC, Instr. Mem., Reg. Array, ALU, +4 (IncrPC), Data Mem. Labeled instruction fields: 25:21, 20:16, 20:13, 4:0. Write-back signals: Wdest, Wdata.]

RI-type instructions

RI-type instructions (addq, subq, xor, bis, cmplt): rc <-- ra funct ib

  Bits:   31-26  25-21  20-13  12  11-5   4-0
  Field:  Op     ra     ib     1   funct  rc

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• A <-- Register[IR[25:21]]
• B <-- IR[20:13]
Ex: Execute
• ALUOutput <-- A op B
MEM: Memory
• nop
WB: Write back
• Register[IR[4:0]] <-- ALUOutput

Load instruction

Load: Ra <-- Mem[Rb + offset]

  Bits:   31-26  25-21  20-16  15-0
  Field:  Op     ra     rb     offset

IF: Instruction fetch
• IR <-- IMemory[PC]
• PC <-- PC + 4
ID: Instruction decode/register fetch
• B <-- Register[IR[20:16]]
Ex: Execute
• ALUOutput <-- B + SignExtend(IR[15:0])
MEM: Memory
• Mem-Data <-- DMemory[ALUOutput]
WB: Write back
• Register[IR[25:21]] <-- Mem-Data

Active Datapath for Load & Store

ALU Operation
• Used to compute address
  – A input set to extended IR[15:0]
• ALU function set to add

Memory Operation
• Read for load, write for store

Write Back
• To Ra for load
• None for store

[Figure: datapath diagram with the Load and Store paths marked. Labeled blocks: PC, Instr. Mem., Reg. Array, ALU, +4 (IncrPC), Xtnd, Data Mem. Labeled instruction fields: 25:21, 20:16, 15:0. Write-back signals: Wdest, Wdata.]

Store instruction

IF: Instruction fetch
• IR
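
The field layouts and the worked examples on the Decoding Examples slide can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the function names (sign_extend, decode, branch_target) and the dictionary keys are assumptions chosen for illustration, and only the opcodes listed in the encoding tables are handled.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode(word):
    """Split a 32-bit instruction word into the fields named on the slides.

    The returned dict and its key names are illustrative, not from the slides.
    """
    op = (word >> 26) & 0x3F               # bits 31-26
    ra = (word >> 21) & 0x1F               # bits 25-21
    fields = {"op": op, "ra": ra}
    if op in (0x10, 0x11):                 # RR / RI arithmetic
        fields["funct"] = (word >> 5) & 0x7F
        fields["rc"] = word & 0x1F
        if (word >> 12) & 1:               # bit 12 set: RI-type, 8-bit literal
            fields["ib"] = (word >> 13) & 0xFF
        else:                              # bit 12 clear: RR-type
            fields["rb"] = (word >> 16) & 0x1F
    elif op in (0x29, 0x2D):               # ldq / stq
        fields["rb"] = (word >> 16) & 0x1F
        fields["offset"] = sign_extend(word & 0xFFFF, 16)
    elif op in (0x39, 0x3D, 0x30, 0x34):   # beq, bne, br, bsr
        fields["disp"] = sign_extend(word & 0x1FFFFF, 21)
    elif op == 0x1A:                       # jmp / jsr / ret
        fields["rb"] = (word >> 16) & 0x1F
        fields["hint"] = word & 0xFFFF
        fields["jump_type"] = (word >> 14) & 0x3   # Hint 15:14
    elif op == 0x00:                       # call_pal
        fields["number"] = word & 0x3FFFFFF
    return fields


def branch_target(pc, disp):
    """Taken-branch target: PC + 4 + disp*4."""
    return pc + 4 + 4 * disp
```

For example, decode(0xe47ffffb) yields op 0x39, ra 3 and disp -5, and branch_target(0x10, -5) reproduces the target of 0 computed on the slide.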
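
The per-stage register transfers listed for the RR-type, RI-type and load instructions can be traced one instruction at a time. The sketch below is an illustration under stated assumptions rather than the course's simulator: the function step, the ALU_OPS table, and the use of Python dicts keyed by byte address for instruction and data memory are all hypothetical, only addq, subq, bis, xor and ldq are covered, and special handling of r31 is not modeled.

```python
MASK64 = (1 << 64) - 1   # registers and data are 64 bits wide

# (Op field, funct field) -> ALU function, taken from the encoding table.
ALU_OPS = {
    (0x10, 0x20): lambda a, b: (a + b) & MASK64,  # addq
    (0x10, 0x29): lambda a, b: (a - b) & MASK64,  # subq
    (0x11, 0x20): lambda a, b: a | b,             # bis
    (0x11, 0x40): lambda a, b: a ^ b,             # xor
}


def sign_extend16(value):
    """Sign-extend a 16-bit offset to a Python int."""
    return (value & 0x7FFF) - (value & 0x8000)


def step(pc, regs, imem, dmem):
    """Run one instruction through IF, ID, EX, MEM, WB; return the next PC."""
    # IF: IR <-- IMemory[PC]; PC <-- PC + 4
    ir = imem[pc]
    pc += 4
    op = (ir >> 26) & 0x3F
    # ID: A <-- Register[IR[25:21]]; B <-- Register[IR[20:16]]
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    if op in (0x10, 0x11):
        if (ir >> 12) & 1:
            # RI-type: B <-- IR[20:13]
            b = (ir >> 13) & 0xFF
        # EX: ALUOutput <-- A op B
        alu_output = ALU_OPS[(op, (ir >> 5) & 0x7F)](a, b)
        # MEM: nop.  WB: Register[IR[4:0]] <-- ALUOutput
        regs[ir & 0x1F] = alu_output
    elif op == 0x29:
        # EX: ALUOutput <-- B + SignExtend(IR[15:0])
        alu_output = (b + sign_extend16(ir & 0xFFFF)) & MASK64
        # MEM: Mem-Data <-- DMemory[ALUOutput]
        mem_data = dmem[alu_output]
        # WB: Register[IR[25:21]] <-- Mem-Data
        regs[(ir >> 21) & 0x1F] = mem_data
    else:
        raise NotImplementedError(hex(op))
    return pc
```

A caller would hold the register array as a list of 32 integers and call step repeatedly, feeding each returned PC back in. Each call completes all five steps before the next instruction starts, which corresponds to the one-instruction-at-a-time initial design rather than the pipelined one.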