UH COSC 6385 - Pipelining

COSC 6385 Computer Architecture - Pipelining
Edgar Gabriel
Spring 2011

Some of the slides are based on a lecture by David Culler, University of California, Berkeley
http://www.eecs.berkeley.edu/~culler/courses/cs252-s05

Pipelining
• Pipelining is an implementation technique whereby multiple instructions are overlapped in execution
  – Split an “expensive” operation into several sub-operations
  – Execute the sub-operations in a staggered manner
• Real world analogy: assembly line in car manufacturing
  – Each station is doing something different
  – Each station is working on a separate car
• Pipelining increases the throughput, but does not reduce the latency of an operation

Classes of instructions
• ALU instructions
  – Take either 2 registers as operands or 1 register and one 16-bit immediate offset
  – Results are stored in a 3rd register
• Load and store instructions
• Branches and jumps

Typical implementation of an instruction (I)
1. Instruction fetch cycle (IF)
  • Send PC to memory
  • Fetch current instruction
  • Update PC to next sequential PC (+4 bytes)
2. Instruction decode/register fetch cycle (ID)
  • Decode instruction
  • Read registers corresponding to register source specifiers from register file
  • Sign extend offset fields if needed
  • Compute possible branch target address

Typical implementation of an instruction (II)
3. Execution/effective address cycle (EX)
  • ALU adds base register and offset to form effective address, or
  • ALU performs operations on the values read from register file, or
  • ALU performs operation on value read from register and sign-extended immediate
4. Memory access cycle (MEM)
  • If instruction is a load, read memory using the effective address computed in step 3
  • If instruction is a store, write the data from the second register read of the register file to the effective address
5. Write-back cycle (WB)
  • Write result into register file
    – From memory for a load instruction
    – From ALU for an ALU instruction

Typical implementation of an instruction (III)
[Figure: datapath divided into the stages Instruction Fetch, Instr. Decode / Reg. Fetch, Execute / Addr. Calc, Memory Access and Write Back. Labelled components: PC, adder (+4), Next SEQ PC, Next PC, instruction memory, register file (RS1, RS2, RD), sign extend (Imm), Zero? test, ALU, data memory, LMD, WB Data, and multiplexers. Slide based on a lecture by David Culler.]

Details (I)
Fetching instructions and incrementing the program counter (PC)
[Figure: the PC supplies the read address to the instruction memory, which outputs the instruction; an adder increments the PC by 4.]

Details (II)
ALU instructions, e.g. add R1, R2, R3
[Figure: register file with inputs Read register 1, Read register 2, Write register (register numbers, 5 bits each) and Write Data, outputs Read data 1 and Read data 2, and control signal RegWrite; ALU with a 4-bit ALU operation input and outputs ALU result and Zero.]
• Register number inputs are 5 bits wide if you have 32 (= 2^5) registers
• RegWrite is the write control signal
• ALU operation is a control signal (4 bits)

Details (III)
Load/Store instructions, e.g. LW R1,offset (R2)
[Figure: data memory with inputs Address and Write Data, output Read data, and control signals MemRead and MemWrite; sign-extend unit from 16 to 32 bits.]
Basic steps for a load/store operation
• Sign extend the offset from 16 to 32 bits
• Add the sign-extended offset to R2
• Load the content of the resulting address into R1, or
• Store the data from R1 into the resulting memory address

Details (IV)
Combining Load/Store and ALU instructions
[Figure: register file, sign extend (16 to 32), ALU (4-bit ALU operation) and data memory combined in one datapath; a multiplexer controlled by ALUsrc selects the second ALU input, and a multiplexer controlled by MemtoReg selects the value written back to the register file. Control signals: RegWrite, MemRead, MemWrite.]

Details (V)
Branches, e.g. beq R1,R2,offset
Basic steps for a branch equal instruction
• Compute branch target address
  – Sign extend the offset field
  – Shift the offset field by 2 bits in order to ensure a word offset
  – Add the shifted, sign-extended offset to PC
• Compare registers R1 and R2

Details (VI)
Implementation of branches, e.g. beq R1,R2,offset
[Figure: register file and ALU (4-bit ALU operation) compare the two registers, with the ALU result going to the branch control logic; the 16-bit offset is sign extended to 32 bits, shifted left by 2 and added to PC+4 from the instruction datapath to form the branch target.]

Visualizing pipelining
[Figure: instruction order (vertical axis) against time in clock cycles (horizontal axis, Cycle 1 to Cycle 7). Four instructions each pass through IF, ID, ALU, Mem and WB, with each instruction starting one cycle after the previous one. Slide based on a lecture by David Culler.]

Effects of pipelining
• A pipeline of depth n requires n times the memory bandwidth of a non-pipelined processor for the same clock rate
• Separate data and instruction caches eliminate some memory conflicts
• Register file is used in stage ID and in WB
  – Usually not a conflict, since writes are executed in the first half of the clock cycle and reads in the second half
• Instructions in the pipeline should not attempt to use the same hardware resources at the same time
  – Introducing pipeline registers between successive stages of the pipeline
  – Registers named after the stages they connect (e.g. IF/ID, ID/ALU, etc.)

[Figure: pipelined datapath with the pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB separating the stages Instruction Fetch, Instr. Decode / Reg. Fetch, Execute / Addr. Calc, Memory Access and Write Back; Next SEQ PC and RD are carried forward through the pipeline registers. Slide based on a lecture by David Culler.]

Pipeline Hazards
• Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle
  – Structural hazards: HW cannot support this combination of instructions
  – Data hazards: instruction depends on the result of a prior instruction still in the pipeline
  – Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
(Slide based on a lecture by David Culler.)
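
Worked example: branch target computation
The branch steps listed under Details (V) can be traced with a few lines of Python. This is only a sketch and not part of the original slides: the function names are made up for illustration, addresses are assumed to be 32 bits wide, and the offset is assumed to be added to PC+4, as the label "PC+4 from instruction datapath" in the Details (VI) figure suggests.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """Sign extend the offset, shift it left by 2, and add it to PC+4.

    Assumption: the offset counts words relative to PC+4.
    """
    word_offset = sign_extend16(offset16)
    byte_offset = word_offset << 2             # words -> bytes
    return (pc + 4 + byte_offset) & 0xFFFFFFFF  # wrap to 32 bits


def beq_next_pc(pc: int, r1: int, r2: int, offset16: int) -> int:
    """Next PC for beq: the branch target if R1 == R2, otherwise PC+4."""
    return branch_target(pc, offset16) if r1 == r2 else pc + 4


if __name__ == "__main__":
    # Taken branch, 3 words forward from the instruction after the branch.
    print(hex(beq_next_pc(0x1000, 5, 5, 3)))
    # Not taken: fall through to the next sequential instruction.
    print(hex(beq_next_pc(0x1000, 5, 6, 3)))
```

An offset field of 0xFFFF sign-extends to -1, so under these assumptions the branch would jump back to the branch instruction itself.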

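Worked example: throughput versus latency
The claim that pipelining raises throughput without reducing latency, and the staggered layout from the "Visualizing pipelining" slide, can be reproduced with a small sketch. It assumes an idealized five-stage pipeline in which every stage takes exactly one clock cycle and no hazards or stalls occur; the function names are hypothetical and chosen only for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # the five stages named in the slides


def unpipelined_cycles(n_instructions: int, depth: int = len(STAGES)) -> int:
    """Each instruction runs through all stages before the next one starts."""
    return n_instructions * depth


def pipelined_cycles(n_instructions: int, depth: int = len(STAGES)) -> int:
    """The first instruction fills the pipeline; each later one adds one cycle."""
    if n_instructions == 0:
        return 0
    return depth + (n_instructions - 1)


def timing_diagram(n_instructions: int) -> list:
    """One row per instruction, one column per clock cycle ('.' = not in the pipeline)."""
    total = pipelined_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        cells = ["."] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name  # instruction i occupies stage s in cycle i + s + 1
        rows.append(" ".join(f"{c:<3}" for c in cells).rstrip())
    return rows


if __name__ == "__main__":
    print("\n".join(timing_diagram(4)))
    for n in (1, 4, 1000):
        slow, fast = unpipelined_cycles(n), pipelined_cycles(n)
        print(f"{n} instructions: {slow} vs {fast} cycles, speedup {slow / fast:.2f}")
```

Under these assumptions a single instruction still needs five cycles in both cases (latency is unchanged), while for long instruction sequences the speedup approaches the pipeline depth of 5.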

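Worked example: spotting data hazards
The data-hazard case from the "Pipeline Hazards" slide can be made concrete with a short sketch that scans an instruction sequence for read-after-write dependences. Everything here is an assumption made for illustration: the tuple encoding of instructions, the name `find_raw_hazards`, and the default `window` of 2, which corresponds to a five-stage pipeline without forwarding whose register file is written in the first half of a cycle and read in the second half, as described under "Effects of pipelining".

```python
from typing import List, Optional, Tuple

# Hypothetical encoding for illustration: (opcode, destination, sources).
Instr = Tuple[str, Optional[str], Tuple[str, ...]]


def find_raw_hazards(program: List[Instr], window: int = 2) -> List[Tuple[int, int, str]]:
    """Return (producer, consumer, register) for every read-after-write
    dependence where the consumer follows within `window` instructions."""
    hazards = []
    for i, (_, dest, _) in enumerate(program):
        if dest is None:
            continue  # e.g. stores and branches write no register
        for j in range(i + 1, min(i + 1 + window, len(program))):
            _, later_dest, sources = program[j]
            if dest in sources:
                hazards.append((i, j, dest))
            if later_dest == dest:
                break  # register overwritten; later readers depend on j instead
    return hazards


program = [
    ("add", "R1", ("R2", "R3")),  # R1 <- R2 + R3
    ("sub", "R4", ("R1", "R5")),  # reads R1 one instruction later
    ("and", "R6", ("R1", "R7")),  # reads R1 two instructions later
    ("or",  "R8", ("R1", "R9")),  # three later: R1 is already written
]

for producer, consumer, reg in find_raw_hazards(program):
    print(f"instruction {consumer} needs {reg} from instruction {producer}")
```

With the default window, only the `sub` and `and` instructions are reported; a real pipeline would resolve such cases by stalling or forwarding, which this sketch does not model.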