Unformatted text preview:

Fall 2005 Lectures 14 & 15: Instruction Scheduling Simple Machine Model • Instructions are executed in sequence – Fetch, decode, execute, store results – One instruction at a time • For branch instructions, start fetching from a different location if needed – Check branch condition – Next instruction may come from a new location given by the branch instruction Saman Amarasinghe 2 6.035 ©MIT Fall 1998 Simple Execution Model • 5 Stage pipe-line decode memory ifetch execute wr teback • Fetch: get the next instruction • Decode: figure-out what that instruction is • Execute: Perform ALU operation – address calculation in a memory op • Memory: Do the memory access in a mem. Op. • Write Back: write the results back Saman Amarasinghe 3 6.035 ©MIT Fall 1998 4 6.035 IF DE WB IF DE WB Inst 1 Inst 2 IF DE WB IF DE WB IF DE WB IF DE WB IF DE WB Inst 1 Inst 2 Inst 3 Inst 4 Inst 5 Saman Amarasinghe ©MIT Fall 1998 Simple Execution Model EXE MEM EXE MEM time EXE MEM EXE MEM EXE MEM EXE MEM EXE MEM From a Simple Machine Model to a Real Machine Model • Many pipeline stages – Pentium 5 – Pentium Pro 10 – Pentium IV (130nm) 20 – Pentium IV (90nm) 31 • Different instructions taking different amount of time to execute • Hardware to stall the pipeline if an instruction uses a result that is not ready Saman Amarasinghe 5 6.035 ©MIT Fall 1998 Real Machine Model cont. • Most modern processors have multiple execution units (superscalar) – If the instruction sequence is correct, multiple operations will happen in the same cycles – Even more important to have the right instruction sequence Saman Amarasinghe 6 6.035 ©MIT Fall 1998 1Constraints On Scheduling • Data dependencies • Control dependencies • Resource Constraints Saman Amarasinghe 7 6.035 ©MIT Fall 1998 Data Dependency between Instructions • If two instructions access the same variable, they can be dependent • Kind of dependencies – True: write → read – Anti: read → write – Output: write → write • What to do if two instructions are dependent. – The order of execution cannot be reversed – Reduce the possibilities for scheduling Saman Amarasinghe 8 6.035 ©MIT Fall 1998 Computing Dependencies Representing Dependencies • For basic blocks, compute dependencies by • Using a dependence DAG, one per basic block walking through the instructions • Nodes are instructions, edges represent • Identifying register dependencies is simple dependencies – is it the same register? • For memory accesses – simple: base + offset1 ?= base + offset2 – data dependence analysis: a[2i] ?= a[2i+1] – interprocedural analysis: global ?= parameter – pointer alias analysis: p1 ?= p Saman Amarasinghe 9 6.035 ©MIT Fall 1998 Saman Amarasinghe 10 6.035 ©MIT Fall 1998 Representing Dependencies • Using a dependence DAG, one per basic block • Nodes are instructions, edges represent dependencies 1: r2 = *(r1 + 4) 2: r3 = *(r1 + 8) 3: r4 = r2 + r3 4: r5 = r2 - 1 3 1 2 4 2 2 2 • Edge is labeled with Latency: –v(i → j) = delay required between initiation times of i and j minus the execution time required by i Saman Amarasinghe 11 6.035 ©MIT Fall 1998 Example 1: r2 = *(r1 + 4)2: r3 = *(r2 + 4) 3: r4 = r2 + r3 4: r5 = r2 - 1 3 1 2 4 2 2 2 2 Saman Amarasinghe 12 6.035 ©MIT Fall 1998 2Another Example Control Dependencies and Resource Constraints 1: r2 = *(r1 + 4) 2: *(r1 + 4) = r3 3 1 2 4 • For now, lets only worry about basic blocks 3: r3 = r2 + r3 • For now, lets look at simple pipelines 4: r5 = r2 - 1 Saman Amarasinghe 13 6.035 ©MIT Fall 1998 Saman Amarasinghe 14 6.035 ©MIT Fall 1998 Example List Scheduling Algorithm Results In 1: lea var_a, %rax 1 cycle• Idea 2: add $4, %rax 1 cycle3: inc %r11 1 cycle– Do a topological sort of the dependence DAG 4: mov 4(%rsp), %r10 3 cycles – Consider when an instruction can be scheduled 5: add %r10, 8(%rsp) without causing a stall 6: and 16(%rsp), %rbx 4 cycles– Schedule the instruction if it causes no stall and all 7: imul %rax, %rbx 3 cycles its predecessors are already scheduled • Optimal list scheduling is NP-complete – Use heuristics when necessary 1 2 3 4 st st 5 6 st st st 7 Saman Amarasinghe 15 6.035 ©MIT Fall 1998 Saman Amarasinghe 16 6.035 ©MIT Fall 1998 List Scheduling Algorithm Heuristics for selection • Create a dependence DAG of a basic block • Heuristics for selecting from the READY list • Topological Sort – pick the node with the longest path to a leaf in the READY = nodes with no predecessors Loop until READY is empty Schedule each node in READY when no stalling dependence graph – pick a node with most immediate successors – pick a node that can go to a less busy pipeline (in a Update READY superscalar) Saman Amarasinghe 17 6.035 ©MIT Fall 1998 Saman Amarasinghe 18 6.035 ©MIT Fall 1998 3Heuristics for selection • pick the node with the longest path to a leaf in the dependence graph • Algorithm (for node x) – If no successors dx = 0 –dx = MAX( dy + cxy) for all successors y of x – reverse breadth-first visitation order Saman Amarasinghe 19 6.035 ©MIT Fall 1998 Heuristics for selection • pick a node with most immediate successors • Algorithm (for node x): –fx = number of successors of x Saman Amarasinghe 20 6.035 ©MIT Fall 1998 Example Results In 1: lea var_a, %rax 1 cycle2: add $4, %rax 1 cycle3: inc %r11 1 cycle4: mov 4(%rsp), %r10 3 cycles5: add %r10, 8(%rsp) 6: and 16(%rsp), %rbx 4 cycles7: imul %rax, %rbx 8: mov %rbx, 16(%rsp) 3 cycles 9: lea var_b, %rax Saman Amarasinghe 21 6.035 ©MIT Fall 1998 Example 1 6 8 2 7 9 1 1 3 4 1 4 5 3 3 d=0 d=0 d=3 d=7d=4 d=5 f=0 f=1f=1 f=1 f=2 f=0 d=0 f=0 d=3 f=1 READY = { } d=0 f=0 6 1 2 4 7 3 5 8 9 Saman Amarasinghe 22 6.035 ©MIT Fall 1998 Example Results In 1: lea var_a, %rax 1 cycle2: add $4, %rax 1 cycle3: inc %r11 1 cycle4: mov 4(%rsp), %r10 3 cycles5: add %r10, 8(%rsp) 6: and 16(%rsp), %rbx 4 cycles7: imul %rax, %rbx 8: mov %rbx, 16(%rsp) 3 cycles 9: lea 1 2 3 4 st st 5 6 st st st 7 8 96 1 2 4 7 3 5 8 9 var_b, %rax 14 cycles vs 9 cyclesSaman Amarasinghe 23 6.035 ©MIT Fall 1998 Resource Constraints • Modern machines have many resource constraints • Superscalar architectures: – can run few parallel operations – But


View Full Document

MIT 6 035 - Instruction Scheduling

Download Instruction Scheduling
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Instruction Scheduling and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Instruction Scheduling 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?