UT CS 429H - Lecture Notes

This preview shows page 1-2-21-22 out of 22 pages.

Pipelining III
Systems I

Topics
  - Hazard mitigation through pipeline forwarding
  - Hardware support for forwarding
  - Forwarding to mitigate control (branch) hazards

How do we fix the Pipeline?

Pad the program with NOPs
  - Yuck!
Stall the pipeline
  - Data hazards
      - Wait for producing instruction to complete
      - Then proceed with consuming instruction
  - Control hazards
      - Wait until new PC has been determined
      - Then begin fetching
  - How is this better than putting NOPs into the program?
Forward data within the pipeline
  - Grab the result from somewhere in the pipe
      - After it has been computed
      - But before it has been written back
  - This gives an opportunity to avoid performance degradation due to hazards!

Data Forwarding

Naïve Pipeline
  - Register isn't written until completion of write-back stage
  - Source operands read from register file in decode stage
      - Needs to be in register file at start of stage
Observation
  - Value generated in execute or memory stage
Trick
  - Pass value directly from generating instruction to decode stage
  - Needs to be available at end of decode stage

Data Forwarding Example

  - irmovl in write-back stage
  - Destination value in W pipeline register
  - Forward as valB for decode stage

# demo-h2.ys            1  2  3  4  5  6  7  8  9  10
0x000: irmovl $10,%edx  F  D  E  M  W
0x006: irmovl $3,%eax      F  D  E  M  W
0x00c: nop                    F  D  E  M  W
0x00d: nop                       F  D  E  M  W
0x00e: addl %edx,%eax               F  D  E  M  W
0x010: halt                            F  D  E  M  W

Cycle 6
  W: W_dstE = %eax, W_valE = 3, R[%eax] ← 3
  D: srcA = %edx, srcB = %eax
     valA ← R[%edx] = 10
     valB ← W_valE = 3

Bypass Paths

Decode Stage
  - Forwarding logic selects valA and valB
  - Normally from register file
  - Forwarding: get valA or valB from later pipeline stage
Forwarding Sources
  - Execute: valE
  - Memory: valE, valM
  - Write back: valE, valM

[Figure: pipeline block diagram (Fetch, Decode, Execute, Memory, Write back) with bypass paths carrying e_valE, M_valE, m_valM, W_valE and W_valM back to the forwarding logic in the decode stage]
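
The relationship between instruction spacing, stall cost, and forwarding source can be worked out mechanically. The sketch below is only an illustration, not part of the lecture: it assumes a five-stage pipeline that issues one instruction per cycle with no other stalls, and the function names `producer_stage` and `stall_cycles_without_forwarding` are hypothetical.

```python
STAGES = ["F", "D", "E", "M", "W"]


def producer_stage(distance):
    """Stage occupied by the producing instruction while the consumer is in decode.

    distance is the number of instruction slots separating producer and
    consumer (1 means back-to-back). Returns None once the producer has
    already left the pipeline, i.e. its result is in the register file.
    """
    if distance < 1:
        raise ValueError("the producer must come before the consumer")
    index = 1 + distance  # decode is index 1; the producer is `distance` stages ahead
    return STAGES[index] if index < len(STAGES) else None


def stall_cycles_without_forwarding(distance):
    """Cycles the consumer must wait in decode if the pipeline only stalls.

    Assumes the register file is written at the end of write-back, so the
    consumer can only read the value in a later cycle.
    """
    if producer_stage(distance) is None:
        return 0
    return len(STAGES) - (1 + distance)


# demo-h2.ys: irmovl $3,%eax and addl %edx,%eax are three slots apart,
# irmovl $10,%edx and addl are four slots apart.
eax_source = producer_stage(3)   # "W": forward W_valE as valB
edx_source = producer_stage(4)   # None: read R[%edx] from the register file
```

Under these assumptions a back-to-back dependency would cost three stall cycles without forwarding, while the two `nop` instructions in demo-h2.ys still leave one cycle to cover, which is the case the write-back bypass handles.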
Data Forwarding Example #2

Register %edx
  - Generated by ALU during previous cycle
  - Forward from memory as valA
Register %eax
  - Value just generated by ALU
  - Forward from execute as valB

# demo-h0.ys            1  2  3  4  5  6  7  8
0x000: irmovl $10,%edx  F  D  E  M  W
0x006: irmovl $3,%eax      F  D  E  M  W
0x00c: addl %edx,%eax         F  D  E  M  W
0x00e: halt                      F  D  E  M  W

Cycle 4
  M: M_dstE = %edx, M_valE = 10
  E: E_dstE = %eax, e_valE ← 0 + 3 = 3
  D: srcA = %edx, srcB = %eax
     valA ← M_valE = 10
     valB ← e_valE = 3

Implementing Forwarding

  - Add additional feedback paths from E, M, and W pipeline registers into decode stage
  - Create logic blocks to select from multiple sources for valA and valB in decode stage

[Figure: PIPE block diagram with "Sel+Fwd A" and "Fwd B" blocks in the decode stage, fed by e_valE, M_valE, m_valM, W_valE and W_valM]

    ## What should be the A value?
    int new_E_valA = [
        # Use incremented PC
        D_icode in { ICALL, IJXX } : D_valP;
        # Forward valE from execute
        d_srcA == E_dstE : e_valE;
        # Forward valM from memory
        d_srcA == M_dstM : m_valM;
        # Forward valE from memory
        d_srcA == M_dstE : M_valE;
        # Forward valM from write back
        d_srcA == W_dstM : W_valM;
        # Forward valE from write back
        d_srcA == W_dstE : W_valE;
        # Use value read from register file
        1 : d_rvalA;
    ];

Limitation of Forwarding

Load-use dependency
  - Value needed by end of decode stage in cycle 7
  - Value read from memory in memory stage of cycle 8

# demo-luh.ys                             1  2  3  4  5  6  7  8  9  10 11
0x000: irmovl $128,%edx                   F  D  E  M  W
0x006: irmovl $3,%ecx                        F  D  E  M  W
0x00c: rmmovl %ecx, 0(%edx)                     F  D  E  M  W
0x012: irmovl $10,%ebx                             F  D  E  M  W
0x018: mrmovl 0(%edx),%eax  # Load %eax               F  D  E  M  W
0x01e: addl %ebx,%eax  # Use %eax                        F  D  E  M  W
0x020: halt                                                 F  D  E  M  W

Cycle 7
  M: M_dstE = %ebx, M_valE = 10
  D: valA ← M_valE = 10
     valB ← R[%eax] = 0   (Error)
Cycle 8
  M: M_dstM = %eax, m_valM ← M[128] = 3

Avoiding Load/Use Hazard

  - Stall using instruction for one cycle
  - Can then pick up loaded value by forwarding from memory stage

# demo-luh.ys                             1  2  3  4  5  6  7  8  9  10 11 12
0x000: irmovl $128,%edx                   F  D  E  M  W
0x006: irmovl $3,%ecx                        F  D  E  M  W
0x00c: rmmovl %ecx, 0(%edx)                     F  D  E  M  W
0x012: irmovl $10,%ebx                             F  D  E  M  W
0x018: mrmovl 0(%edx),%eax  # Load %eax               F  D  E  M  W
bubble                                                         E  M  W
0x01e: addl %ebx,%eax  # Use %eax                        F  D  D  E  M  W
0x020: halt                                                 F  F  D  E  M  W

Cycle 8
  W: W_dstE = %ebx, W_valE = 10
  M: M_dstM = %eax, m_valM ← M[128] = 3
  D: valA ← W_valE = 10
     valB ← m_valM = 3
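
The priority order in the HCL selection for valA can be mimicked in a few lines of Python to check the cycle-by-cycle examples by hand. This is a rough sketch rather than the simulator's implementation: the dictionary `pipe`, the helper names `forward` and `select_valA`, and the use of `None` for "no register" are all assumptions made for illustration.

```python
def forward(src, reg_value, pipe):
    """Return the newest pending value for register `src`.

    Candidates are checked youngest producer first, in the same order as
    the HCL case list: execute, memory (valM, valE), write back (valM, valE).
    Falls back to the value read from the register file.
    """
    candidates = [
        (pipe.get("E_dstE"), pipe.get("e_valE")),
        (pipe.get("M_dstM"), pipe.get("m_valM")),
        (pipe.get("M_dstE"), pipe.get("M_valE")),
        (pipe.get("W_dstM"), pipe.get("W_valM")),
        (pipe.get("W_dstE"), pipe.get("W_valE")),
    ]
    for dst, value in candidates:
        # Assumption: None stands for "no register" and never matches.
        if src is not None and dst == src:
            return value
    return reg_value


def select_valA(D_icode, D_valP, d_srcA, d_rvalA, pipe):
    """Value latched into E_valA at the end of decode."""
    if D_icode in ("ICALL", "IJXX"):
        return D_valP  # call and jump carry the incremented PC instead
    return forward(d_srcA, d_rvalA, pipe)


# Cycle 4 of demo-h0.ys: addl %edx,%eax is in decode.
cycle4 = {"E_dstE": "%eax", "e_valE": 3, "M_dstE": "%edx", "M_valE": 10}
valA = select_valA("IOPL", 0x00E, "%edx", 0, cycle4)  # taken from M_valE
valB = forward("%eax", 0, cycle4)                     # taken from e_valE
```

Checking the execute stage first matters when two in-flight instructions write the same register: the younger one is closer to decode in the pipeline and holds the value the program expects.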
Detecting Load/Use Hazard

[Figure: PIPE block diagram, highlighting the execute-stage and decode-stage signals used to detect a load/use hazard]
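
The text here does not spell out the detection condition, so the following is a hedged sketch of one common formulation rather than the lecture's own logic: treat it as a hazard when the instruction in execute loads a register from memory and that register is a source of the instruction in decode. The set `LOAD_ICODES`, the function names, and the string encoding of the control actions are assumptions; the stall and bubble placement follows the pipeline chart for demo-luh.ys.

```python
# Assumption: instruction codes that write a register from memory.
LOAD_ICODES = {"IMRMOVL", "IPOPL"}


def load_use_hazard(E_icode, E_dstM, d_srcA, d_srcB):
    """True when a load in execute targets a register needed in decode."""
    if E_icode not in LOAD_ICODES or E_dstM is None:
        return False
    return E_dstM in (d_srcA, d_srcB)


def control_actions(hazard):
    """Per-stage action for the next cycle (hypothetical string encoding)."""
    if hazard:
        # Hold the using instruction in decode and the one behind it in
        # fetch, and inject a bubble into execute.
        return {"F": "stall", "D": "stall", "E": "bubble"}
    return {"F": "normal", "D": "normal", "E": "normal"}


# Cycle 7 of demo-luh.ys: mrmovl 0(%edx),%eax is in execute and
# addl %ebx,%eax is in decode.
hazard = load_use_hazard("IMRMOVL", "%eax", "%ebx", "%eax")
actions = control_actions(hazard)
```

After the one-cycle stall the load has reached the memory stage, so the ordinary forwarding path from m_valM supplies the value.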

