Unformatted text preview:

1"Carnegie Mellon Lecture 12 Parallelization I. Basic Parallelization II. Data dependence analysis III. Interprocedural parallelization Chapter 11.1-11.1.4 M. Lam CS243: Parallelization 1 Carnegie Mellon Parallelization of Numerical Applications • DoAll loop parallelism – Find loops whose iterations are independent – Number of iterations typically scales with the problem – Usually much larger than the number of processors in a machine – Divide up iterations across machines M. Lam CS243: Parallelization 22"Carnegie Mellon Basic Parallelism Examples: FOR i = 1 to 100 A[i] = B[i] + C[i] FOR i = 11 TO 20 a[i] = a[i-1] + 3 FOR i = 11 TO 20 a[i] = a[i-10] + 3 • Does there exist a data dependence edge between two different iterations? • A data dependence edge is loop-carried if it crosses iteration boundaries • DoAll loops: loops without loop-carried dependences M. Lam CS243: Parallelization 3 Carnegie Mellon Recall: Data Dependences • True dependence: a = = a • Anti-dependence: = a a = • Output dependence a = a = M. Lam CS243: Parallelization 43"Carnegie Mellon Affine Array Accesses • Common patterns of data accesses: (i, j, k are loop indexes) A[i], A[j], A[i-1], A[0], A[i+j], A[2*i], A[2*i+1] A[i,j], A[i-1, j+1] • Array indexes are affine expressions of surrounding loop indexes – Loop indexes: in, in-1, ... , i1 – Integer constants: cn, cn-1, ... , c0 – Array index: cnin + cn-1in-1+ ... + c1i1+ c0 – Affine expression: linear expression + a constant term (c0) M. Lam CS243: Parallelization 5 Carnegie Mellon II. Formulating Data Dependence Analysis FOR i := 2 to 5 do A[i-2] = A[i]+1; • Between read access A[i] and write access A[i-2]there is a dependence if: – there exist two iterations ir and iw within the loop bounds, s.t. – iterations ir & iw read & write the same array element, respectively ∃integers iw, ir 2 ≤ iw, ir ≤ 5 ir = iw - 2 • Between write access A[i-2] and write access A[i-2] there is a dependence if: ∃integers iw, iv 2 ≤ iw, iv ≤ 5 iw – 2 = iv - 2 – To rule out the case when the same instance depends on itself: • add constraint iw ≠ iv M. Lam CS243: Parallelization 64"Carnegie Mellon Memory Disambiguation is Undecidable at Compile Time read(n) For i = a[i] = a[n] M. Lam CS243: Parallelization 7 Carnegie Mellon Domain of Data Dependence Analysis • Only use loop bounds and array indexes that are affine functions of loop variables for i = 1 to n for j = 2i to 100 a[i+2j+3][4i+2j][i*i] = … … = a[1][2i+1][j] • Assume a data dependence between the read & write operation if there exists: – a read instance with indexes ir, jr and – a write instance with indexes iw, jw ∃integers ir, jr, iw, jw 1 ≤ iw, ir ≤ n 2iw ≤ jw ≤ 100 2ir ≤ jr ≤ 100 iw + 2jw + 3 = 1 4iw + 2jw = 2ir + 1 • Equate each dimension of array access; ignore non-affine ones – No solution  No data dependence – Solution  there may be a dependence M. Lam CS243: Parallelization 85"Carnegie Mellon Complexity of Data Dependence Analysis • Equivalent to integer linear programming ∃ integer i A1i ≤ b1 A2i = b2 • Integer linear programming is NP-complete – O(size of the coefficients) or O(nn) M. Lam CS243: Parallelization 9 For every pair of accesses not necessarily distinct (F1 , f1 ) and (F2, f2) one must be a write operation Let B1i1+b1 ≥ 0, B2i2+b2 ≥ 0 be the corresponding loop bound constraints, ∃ integers i1, i2 B1i1 + b1 ≥ 0, B2i2 + b2 ≥ 0 F1 i1+ f1 = F2 i2+f2 If the accesses are not distinct, then add the constraint i1 ≠ i2 Carnegie Mellon Data Dependence Analysis Algorithm • Typically solving many tiny, repeated problems – Integer linear programming packages optimize for large problems – Use memoization to remember the results of simple tests • Apply a series of relatively simple tests – GCD: 2*i, 2*i+1; GCD for simultaneous equations – Test if the ranges overlap • Backed up by a more expensive algorithm – Use Fourier-Motzkin Elimination to test if there is a real solution • Keep eliminating variables to see if a solution remains • If there is no solution, then there is no integer solution6"Carnegie Mellon Fourier-Motzkin Elimination • To eliminate a variable from a set of linear inequalities. • To eliminate a variable x1 – Rewrite all expressions in terms of lower or upper bounds of x1 – Create a transitive constraint for each pair of lower and upper bounds. • Example: Let L, U be lower bounds and upper bounds resp – To eliminate x1: L1(x2,"…,"xn)"≤!!x1"≤""U1"(x2,"…,"xn)"L2(x2,"…,"xn)"≤!!x1"≤""U2"(x2,"…,"xn)"L1(x2,"…,"xn)"≤!!U1"(x2,"…,"xn)"L1(x2,"…,"xn)"≤!!U2"(x2,"…,"xn)"L2(x2,"…,"xn)"≤!!U1"(x2,"…,"xn)"L2(x2,"…,"xn)"≤!!U2"(x2,"…,"xn)"Carnegie Mellon Example FOR i = 1 to 5 FOR j = i+1 to 5 A[i,j] = f(A[i,i], A[i-1,j]) M. Lam CS243: Parallelization 12 """""""1"≤!i"""""""""""""""""i"≤!5"i"+"1""≤!!j"""""""""""""""j"≤!5"""""""""1"≤!i’""""""""""""""""""i’"≤!5"i’"+"1""≤!!j’""""""""""""""""j’"≤!5"1:"Data"dep"between"A[i,j],"A[i,i]""""""""i"="i’""""""""j"="i’""i’+1"≤!!i’"""2:"Data"dep"between"A[i,j]"and"A[iA1,j]"""""""""""""i"="i’"–"1"""=>"i+1"="i’"""""""""""""j"="j’""SubsGtuGng""""""""""""""1"≤!i"+"1,""""""""i"+"1"≤!5""""""""""i"+"2""≤!!j"""""""""""""""""""""j"≤!5""""""""Combining""""""""""""""1"≤"i;"i"≤"4""""""""""i"≤!!j"A2;"j"≤!5"EliminaGng"i:""""""""""""""1"≤"4;"1"≤!!j"A2;"j"≤!5""""""""""""""3"≤!!j;""""j"≤!5"EliminaGng"j:""""""""""""""3"≤!5""7"Carnegie Mellon Data Dependence Analysis Algorithm • Typically solving many tiny, repeated problems – Integer linear programming packages optimize for large problems – Use memoization to remember the results of simple tests • Apply a series of relatively simple tests – GCD: 2*i, 2*i+1; GCD for simultaneous equations – Test if the ranges overlap • Backed up by a more expensive algorithm – Use Fourier-Motzkin Elimination to test if there is a real solution • Keep eliminating variables


View Full Document

Stanford CS 243 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?