GSU CSC 2010 - Exam2 Review



Slide titles: Exam2 Review; Outline; Parallel processing; Parallel processing classification; 9.2 Pipelining; Speedup; Example; Example Answer; Instruction separation; 5-Stage Pipelining; Pipeline Hazards; Data hazard; Branch hazards; Branch Untaken (Freeze approach); Branch Taken (Freeze approach); Branch Untaken (Predicted-untaken); Branch Taken (Predicted-untaken); Branch Untaken (Predicted-taken); Branch Taken (Predicted-taken); Delayed Branch; Memory Hierarchy; RAM; ROM; Memory Address Map; Cache memory; Associative mapping; Direct Mapping; Set-Associative Mapping; Average memory access time; Page Fault; Performance of Demand Paging; 9.4 Page Replacement

Exam2 Review
Dr. Bernard Chen, Ph.D.
University of Central Arkansas
Spring 2010

Outline
• Pipeline
• Memory Hierarchy

Parallel processing
• A parallel processing system is able to perform concurrent data processing to achieve faster execution time.
• The system may have two or more ALUs and be able to execute two or more instructions at the same time.
• The goal is to increase throughput: the amount of processing that can be accomplished during a given interval of time.

Parallel processing classification
• Single instruction stream, single data stream (SISD)
• Single instruction stream, multiple data stream (SIMD)
• Multiple instruction stream, single data stream (MISD)
• Multiple instruction stream, multiple data stream (MIMD)

9.2 Pipelining
• Instruction execution is divided into k segments or stages.
• An instruction exits pipe stage k-1 and proceeds into pipe stage k.
• All pipe stages take the same amount of time, called one processor cycle.
• The length of the processor cycle is determined by the slowest pipe stage.

Speedup
• If we execute the same n tasks sequentially in a single processing unit, it takes (k * n) clock cycles.
• A k-segment pipeline completes the same n tasks in k + (n - 1) clock cycles: k cycles for the first task, then one more cycle for each remaining task.
• The speedup gained by using the pipeline is:

$$\text{Speedup} = \frac{T_1}{T_k} = \frac{k \, n}{k + (n - 1)}$$

Example
A non-pipelined system takes 100 ns to process a task; the same task can be processed in a five-segment pipeline with 20 ns per segment.
Speedup ratio for 1000 tasks: (100 * 1000) / ((5 + 1000 - 1) * 20) = 4.98
However, if the task cannot be evenly divided…

Example
A non-pipelined system takes 100 ns to process a task; the same task can be processed in a six-segment pipeline whose segment delays are 20 ns, 25 ns, 30 ns, 10 ns, 15 ns, and 30 ns. Determine the speedup ratio of the pipeline for 10, 100, and 1000 tasks. What is the maximum speedup that can be achieved?

Example Answer
The clock cycle is set by the slowest segment, 30 ns.
• Speedup ratio for 10 tasks: (100 * 10) / ((6 + 10 - 1) * 30) ≈ 2.22
• Speedup ratio for 100 tasks: (100 * 100) / ((6 + 100 - 1) * 30) ≈ 3.17
• Speedup ratio for 1000 tasks: (100 * 1000) / ((6 + 1000 - 1) * 30) ≈ 3.32
• Maximum speedup: as N grows, (100 * N) / ((6 + N - 1) * 30) approaches 100 / 30 = 10/3 ≈ 3.33

Instruction separation
1. Fetch the instruction
2. Decode the instruction
3. Fetch the operands from memory
4. Execute the instruction
5. Store the results in the proper place

5-Stage Pipelining
• S1: Fetch Instruction (FI)
• S2: Decode Instruction (DI)
• S3: Fetch Operand (FO)
• S4: Execute Instruction (EI)
• S5: Write Operand (WO)
[Figure: space-time diagram of segments S1 to S5 over clock cycles 1 to 9]

Pipeline Hazards
There are situations, called hazards, that prevent the next instruction in the instruction stream from executing during its designated cycle. There are three classes of hazards:
• Structural hazard
• Data hazard
• Branch hazard

Data hazard
Example:
ADD R1 ← R2 + R3
SUB R4 ← R1 - R5
AND R6 ← R1 AND R7
OR  R8 ← R1 OR R9
XOR R10 ← R1 XOR R11

Data hazard
• FO: fetch data value
• WO: store the executed value
[Figure: timing of the example through stages FI, DI, FO, EI, WO]

Data hazard
The delayed-load approach inserts no-operation instructions to avoid the data conflict:
ADD R1 ← R2 + R3
No-op
No-op
SUB R4 ← R1 - R5
AND R6 ← R1 AND R7
OR  R8 ← R1 OR R9
XOR R10 ← R1 XOR R11

Data hazard
The conflict can also be resolved by a simple hardware technique called forwarding (also called bypassing or short-circuiting). The insight in forwarding is that the result is not really needed by SUB until the ADD has finished executing. If the forwarding hardware detects that the previous ALU operation has written the register corresponding to a source for the current ALU operation, control logic selects the result from the ALU instead of from memory.

Branch hazards
• Branch hazards can cause a greater performance loss for pipelines than data hazards.
• When a branch instruction is executed, it may or may not change the PC.
• If a branch changes the PC to its target address, it is a taken branch. Otherwise, it is untaken.

Branch hazards
There are four schemes to handle branch hazards:
• Freeze scheme
• Predict-untaken scheme
• Predict-taken scheme
• Delayed branch

Freeze approach
The simplest method of dealing with branches is to redo the fetch of the instruction following a branch.
[Figure: Branch Untaken (Freeze approach), stages FI, DI, FO, EI, WO]
[Figure: Branch Taken (Freeze approach), stages FI, DI, FO, EI, WO]

Predicted-untaken
[Figure: Branch Untaken (Predicted-untaken), stages FI, DI, FO, EI, WO over time]
[Figure: Branch Taken (Predicted-untaken), stages FI, DI, FO, EI, WO]

Predicted-taken
[Figure: Branch Untaken (Predicted-taken), stages FI, DI, FO, EI, WO]
[Figure: Branch Taken (Predicted-taken), stages FI, DI, FO, EI, WO]

Delayed Branch
A fourth scheme in use in some processors is called delayed branch. It is done at compile time, and it modifies the code. The general format is:
branch instruction
delay slot
branch target if taken

Delayed Branch
Optimal

Memory Hierarchy
• The main memory occupies a central position by being able to communicate directly with the CPU and with auxiliary memory devices through an I/O processor.
• A special very-high-speed memory called cache is used to increase the speed of processing by making current programs and data available to the CPU at a rapid rate.

RAM

ROM

Memory Address Map

Cache memory
When the CPU refers to memory and finds the word in cache, it is said to produce a hit.
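
The speedup examples from the pipelining slides can be checked numerically. The sketch below is only an illustration: the function names `pipeline_speedup` and `max_speedup` are hypothetical, and it assumes, as the slides do, that the pipeline clock equals the delay of the slowest segment.

```python
def pipeline_speedup(n_tasks, k_segments, t_nonpipelined, t_cycle):
    """Speedup of a k-segment pipeline over a non-pipelined unit for n tasks.

    t_nonpipelined: time per task without pipelining (ns)
    t_cycle: pipeline clock period (ns), i.e. the slowest segment delay
    """
    sequential_time = n_tasks * t_nonpipelined
    # The first task needs k cycles; each later task finishes one cycle after it.
    pipelined_time = (k_segments + n_tasks - 1) * t_cycle
    return sequential_time / pipelined_time


def max_speedup(t_nonpipelined, t_cycle):
    """Limit of the speedup as the number of tasks grows without bound."""
    return t_nonpipelined / t_cycle


# Six-segment example from the slides: the slowest segment sets the clock.
segment_delays = [20, 25, 30, 10, 15, 30]
t_cycle = max(segment_delays)

for n in (10, 100, 1000):
    ratio = pipeline_speedup(n, len(segment_delays), 100, t_cycle)
    print(f"{n} tasks: {ratio:.2f}")
print(f"maximum: {max_speedup(100, t_cycle):.2f}")
```

With the five-segment example (20 ns per segment, 1000 tasks), `pipeline_speedup(1000, 5, 100, 20)` gives about 4.98, matching the slide.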

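
The space-time diagram for the 5-stage pipeline can be regenerated with a few lines of code. This is a minimal sketch under the assumption of an ideal pipeline with no hazards, where task i enters segment S1 at cycle i and advances one segment per cycle; `space_time_diagram` is a hypothetical helper name.

```python
def space_time_diagram(k_segments, n_tasks):
    """Return one row per segment; each cell names the task in that segment
    during that clock cycle, or "--" when the segment is idle."""
    total_cycles = k_segments + n_tasks - 1
    rows = []
    for segment in range(k_segments):
        row = []
        for cycle in range(total_cycles):
            # Task t (0-based) occupies segment s during cycle t + s.
            task = cycle - segment
            row.append(f"T{task + 1}" if 0 <= task < n_tasks else "--")
        rows.append(row)
    return rows


# Five tasks through five segments take 5 + 5 - 1 = 9 clock cycles.
diagram = space_time_diagram(5, 5)
for number, row in enumerate(diagram, start=1):
    print(f"S{number}: " + " ".join(f"{cell:>2}" for cell in row))
```

Each row is shifted one cycle to the right of the row above it, which is why n tasks finish in k + (n - 1) cycles rather than k * n.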

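
The delayed-load approach from the data hazard slides can be sketched as a small scheduling pass. The model here is an assumption, not something the slides spell out: operands are read in stage 3 (FO), results are written in stage 5 (WO), and a register cannot be read in the same cycle it is written, so a consumer must issue at least three slots after its producer. The names `Instr`, `NOP`, and `insert_noops` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


NOP = Instr("NOP", None, ())

FO_STAGE = 3  # operand fetch stage (assumed from the 5-stage layout)
WO_STAGE = 5  # write-back stage
# The consumer's FO cycle must come strictly after the producer's WO cycle.
MIN_DISTANCE = WO_STAGE - FO_STAGE + 1


def insert_noops(program: List[Instr]) -> List[Instr]:
    """Insert no-ops so every read happens after the producing write."""
    scheduled: List[Instr] = []
    last_write = {}  # register name -> issue slot of its most recent writer
    for instr in program:
        slot = len(scheduled)
        needed = slot
        for reg in instr.srcs:
            if reg in last_write:
                needed = max(needed, last_write[reg] + MIN_DISTANCE)
        scheduled.extend([NOP] * (needed - slot))
        if instr.dest is not None:
            last_write[instr.dest] = len(scheduled)
        scheduled.append(instr)
    return scheduled


# The example sequence from the slides.
program = [
    Instr("ADD", "R1", ("R2", "R3")),
    Instr("SUB", "R4", ("R1", "R5")),
    Instr("AND", "R6", ("R1", "R7")),
    Instr("OR", "R8", ("R1", "R9")),
    Instr("XOR", "R10", ("R1", "R11")),
]
print([instr.op for instr in insert_noops(program)])
```

Under these assumptions the pass places two no-ops between ADD and SUB, as on the slide; the later instructions already sit far enough from ADD to need none. Forwarding hardware would remove the need for the no-ops, which this sketch does not model.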