Instruction Level Parallelism (ILP)
Advanced Computer Architecture, CSE 8383
Spring 2004, 2/19/2004
Presented by: Sa’ad Al-Harbi, Saeed Abu Nimeh

Outline
- What’s ILP
- ILP vs Parallel Processing
- Sequential execution vs ILP execution
- Limitations of ILP
- ILP Architectures
  - Sequential Architecture
  - Dependence Architecture
  - Independence Architecture
- ILP Scheduling
- Open Problems
- References

What’s ILP
- Architectural technique that allows the overlap of individual machine operations (add, mul, load, store, ...)
- Multiple operations execute in parallel (simultaneously)
- Goal: speed up execution
Example:
    load R1 R2
    add R3 R3, “1”
    add R3 R3, “1”
    add R4 R3, R2
    add R4 R4, R2
    store [R4] R0

Example: Sequential vs ILP
Sequential execution (without ILP)
    Add r1, r2 r8    4 cycles
    Add r3, r4 r7    4 cycles
    Total of 8 cycles
ILP execution (overlapped execution)
    Add r1, r2 r8
    Add r3, r4 r7
    Total of 5 cycles

ILP vs Parallel Processing
ILP
- Overlaps individual machine operations (add, mul, load, ...) so that they execute in parallel
- Transparent to the user
- Goal: speed up execution
Parallel Processing
- Separate processors work on separate chunks of the program (the processors are programmed to do so)
- Not transparent to the user
- Goal: speed up and quality up

ILP Challenges
- To achieve parallelism, there must be no dependences among instructions that execute in parallel:
  - H/W terminology: data hazards (RAW, WAR, WAW)
  - S/W terminology: data dependences

Dependences and Hazards
- Dependences are a property of programs
- If two instructions are data dependent, they cannot execute simultaneously
- A dependence may result in a hazard, and a hazard may cause a stall
- Data dependences may occur through registers or memory

Types of Dependencies
- Name dependences
  - Output dependence
  - Anti-dependence
- Data (true) dependence
- Control dependence
- Resource dependence

Name dependences
Output dependence
- Occurs when instructions i and j write the same register or memory location. The ordering must be preserved to leave the correct value in the register.
    add r7,r4,r3
    div r7,r2,r8
Anti-dependence
- Occurs when instruction j writes a register or memory location that instruction i reads.
    i: add r6,r5,r4
    j: sub r5,r8,r11

Data Dependences
An instruction j is data dependent on instruction i if either of the following holds:
- instruction i produces a result that may be used by instruction j, or
- instruction j is data dependent on instruction k, and instruction k is data dependent on instruction i
    LOOP: LD F0, 0(R1)
          ADD F4, F0, F2
          SD F4, 0(R1)
          SUB R1, R1, -8
          BNE R1, R2, LOOP

Control Dependences
A control dependence determines the ordering of an instruction i with respect to a branch instruction, so that instruction i is executed in correct program order.
Example:
    if p1 { S1; };
    if p2 { S2; };
Two constraints imposed by control dependences:
1. An instruction that is control dependent on a branch cannot be moved before the branch
2. An instruction that is not control dependent on a branch cannot be moved after the branch

Resource dependences
An instruction is resource-dependent on a previously issued instruction if it requires a hardware resource that the earlier instruction is still using.
e.g.
    div r1, r2, r3
    div r4, r2, r5

ILP Architectures
- Computer architecture: a contract (instruction format and the interpretation of the bits that constitute an instruction) between the class of programs written for the architecture and the set of processor implementations of that architecture.
- In ILP architectures, the contract additionally includes information embedded in the program pertaining to the available parallelism between instructions and operations in the program.

ILP Architectures Classifications
- Sequential architectures: the program is not expected to convey any explicit information regarding parallelism (superscalar processors)
- Dependence architectures: the program explicitly indicates the dependences that exist between operations (dataflow processors)
- Independence architectures: the program provides information as to which operations are independent of one another (VLIW processors)

Sequential architecture and superscalar processors
- The program contains no explicit information regarding dependencies that exist between instructions
- Dependencies between instructions must be determined by the hardware
- It is only necessary to determine dependencies with sequentially preceding instructions that have been issued but not yet completed
- The compiler may re-order instructions to facilitate the hardware’s task of extracting parallelism

Superscalar Processors
- Superscalar processors attempt to issue multiple instructions per cycle
- However, essential dependencies are specified by sequential ordering, so operations must be processed in sequential order
- This proves to be a performance bottleneck that is very expensive to overcome

Dependence architecture and data flow processors
- The compiler (or programmer) identifies the parallelism in the program and communicates it to the hardware by specifying the dependences between operations
- The hardware determines at run time when each operation is independent of the others and performs the scheduling
- No scanning of the sequential program is needed to determine dependences
- Objective: execute each instruction at the earliest possible time (as soon as its input operands and a functional unit are available)

Dependence architectures: Dataflow processors
- Dataflow processors are representative of dependence architectures
- Execute each instruction at the earliest possible time, subject to availability of input operands and functional units
- Dependencies are communicated by providing with each instruction a list of all successor instructions
- As soon as all
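
The successor-list idea from the dataflow slide can be sketched roughly as follows. This is only an illustration, not the mechanism of any particular dataflow machine: the function name `dataflow_order`, the instruction names, and the example graph are hypothetical, and the sketch assumes unlimited functional units and that every instruction takes one step.

```python
def dataflow_order(successors):
    """Group instructions into "waves" that could fire together.

    successors maps each instruction name to the list of instructions
    that consume its result (the successor list carried with it).
    Assumption: unlimited functional units, one step per instruction.
    """
    # Count how many predecessors each instruction is still waiting on.
    pending = {instr: 0 for instr in successors}
    for succs in successors.values():
        for s in succs:
            pending[s] += 1

    # Instructions with no outstanding inputs can fire immediately.
    ready = sorted(i for i, n in pending.items() if n == 0)
    waves = []
    while ready:
        waves.append(ready)
        next_ready = []
        for instr in ready:
            # Completing an instruction notifies each of its successors.
            for s in successors[instr]:
                pending[s] -= 1
                if pending[s] == 0:
                    next_ready.append(s)
        ready = sorted(next_ready)
    return waves


# Hypothetical five-instruction graph, used only for illustration.
example = {
    "load_a": ["add"],
    "load_b": ["add"],
    "add": ["store"],
    "mul": ["store"],
    "store": [],
}
print(dataflow_order(example))
```

Here the two loads and the multiply have no pending inputs, so they fire in the first wave; the add waits for both loads, and the store waits for the add and the multiply.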
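
Looking back at the name and data dependence slides, the three register-level cases (RAW, WAR, WAW) can be checked mechanically for a pair of instructions. The sketch below is a minimal illustration under stated assumptions: each instruction is reduced to a hypothetical `(destination, sources)` tuple, the first operand of each slide example is assumed to be the destination register, and memory dependences are ignored.

```python
def classify(first, second):
    """Return the dependences of `second` on `first`.

    Each instruction is a (dest, sources) tuple, and `first` precedes
    `second` in program order. Only registers are considered.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    kinds = []
    if dest1 is not None and dest1 in srcs2:
        kinds.append("RAW")  # true data dependence: second reads what first writes
    if dest2 is not None and dest2 in srcs1:
        kinds.append("WAR")  # anti-dependence: second overwrites what first reads
    if dest1 is not None and dest1 == dest2:
        kinds.append("WAW")  # output dependence: both write the same register
    return kinds


# Slide examples, assuming the first operand is the destination.
output_dep = classify(("r7", ["r4", "r3"]), ("r7", ["r2", "r8"]))   # add / div
anti_dep = classify(("r6", ["r5", "r4"]), ("r5", ["r8", "r11"]))    # add / sub
true_dep = classify(("F0", ["R1"]), ("F4", ["F0", "F2"]))           # LD / ADD
print(output_dep, anti_dep, true_dep)
```

An empty list means the pair has no register dependence, so, resources permitting, the two instructions are candidates for parallel execution.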