Chapter 9 DEPENDENCE DRIVEN LOOP MANIPULATION David A Padua 1998 1 9 1 DEPENDENCES Flow Dependence True Dependence S1 S2 X A B C X 1 S1 S2 Anti Dependence S1 S2 A X B X C D S1 S2 Output Dependence S1 S2 X A B X C D David A Padua 1998 S1 S2 2 9 2 DEPENDENCE AND PARALLELIZATION SPREADING S1 S4 S5 S2 S3 S6 S7 S10 S8 S11 S9 S1 S2 S3 can execute in parallel with S4 S5 S6 S 8 S 9 David A Padua 1998 S10 S11 3 C OMP SECTIONS C OMP SECTION A S1 S2 S3 SECTION B S4 S5 S6 C OMP C OMP END SECTIONS S7 C OMP C OMP C OMP C OMP SECTIONS SECTION S8 S9 SECTION S10 S11 END SECTIONS David A Padua 1998 4 9 3 RENAMING To remove memory related dependences S1 S1 A X B S2 X Y 1 S3 C X B S4 X Z B S5 D X 1 S2 S3 S4 S5 Use renaming S1 S1 A X B S2 S2 X1 Y 1 S3 C X1 B S4 X2 Z B S5 D X2 1 S3 S4 S5 David A Padua 1998 5 9 4 DEPENDENCES IN LOOPS DO I 1 N S1 S2 A B I 1 C I A 2 END DO S1 S1 S1 S1 S1 S2 S2 David A Padua 1998 S2 S2 S2 6 9 5 DEPENDENCES IN LOOPS Cont DO I 1 N S1 X I 1 B I 1 S1 A I X I END DO DO I 1 N S1 X I B I 1 S1 A I X I 1 1 S1 S2 S1 S2 END DO David A Padua 1998 7 9 6 DEPENDENCE ANALYSIS DO I 1 N S1 X F I B I 1 S2 A I X G I 2 END DO We say that S1 S2 We say that S1 IFF I1 I2 F I1 G I2 ALSO I1 I2 1 N IFF I1 I2 F I2 G I1 S2 David A Padua 1998 8 9 7 LOOP PARALLELIZATION AND VECTORIZATION A loop whose dependence graph is cycle free can be parallelized or vectorized e g S1 DO I 1 N X I B I 1 A I X I 1 END DO X 1 N B 1 N 1 A 1 N X 1 N 1 S2 PARALLEL DO I 1 N X I B I 1 A I X I 1 END PARALLEL DO The reason is that if there are no cycles in the dependence graph then there will be no races in the parallel loop David A Padua 1998 9 9 8 ALGORITHM REPLACEMENT Some program patterns occur frequently in programs They can be replaced with a parallel algorithm e g DO I 1 N A I A I 1 B I END DO A 1 N REC1N B 1 N A 0 N X A 1 DO I 2 N IF X GT A I X A I END DO X MIN A 1 N David A Padua 1998 10 9 9 LOOP DISTRIBUTION To insulate these patterns we can decompose loops into several loops one for each strongly connected component block in the dependence graph DO I 1 N S1 S2 S3 S4 A I B I C I D I D I 1 A I IF X GT A I THEN X A I ENDIF END DO DO I 1 N A I B I C I END DO DO I 1 N D I D I 1 A I END DO DO I 1 N IF X GT A I THEN X A I END IF END DO David A Padua 1998 S1 S2 S3 S4 11 9 10 LOOP INTERCHANGING The dependence information detremines whether or not the loop headers can be interchanged For example the following loop headers can be interchanged do i 1 n do j 1 n a i j a i j 1 a i 1 j end do end do However the headers in the following loop cannot be interchanged David A Padua 1998 12 do i 1 n do j 1 n a i j a i j 1 a i 1 j end do end do David A Padua 1998 13 9 11 DEPENDENCE REMOVAL Some cycles in the dependence graph can be eliminated by using elementary transformations Scalar Expansion DO S1 S2 I 1 N A B I 1 C I A D I END DO S1 S2 DO S1 S2 I 1 N A1 I B I 1 C I A1 I D I END DO A A1 N S1 S2 David A Padua 1998 14 9 12 S1 S2 Induction variable recognition DO I 1 N J J 2 X I X I J END DO S1 S1 S2 DO I 1 N J1 J 2 I X I X I J1 END DO S1 S2 DO I 1 N J1 I J 2 I X I X I J1 I END DO S2 S1 S2 David A Padua 1998 15 9 13 More about the DO to PARALLEL DO transformation When the dependence graph inside a DO loop has no cross iteration dependences it can be transformed into a PARALLEL DO Example 1 do i 1 n S1 a i b i c i S2 d i x i 1 end do S1 S2 Example 2 S1 do i 1 n S a i b i c i 1 S2 d i x i 1 S1 end do David A Padua 1998 16 Example 3 do i 1 n S1 b i a i S2 do while b i 2 a i gt epsilon S b i b i a i b i 2 0 3 end do while end do S1 S1 S2 S2 S3 S3 David A Padua 1998 17 When there are cross iteration dependences but no cycles do loops can be aligned to be transformed into DOALLs Example 1 do i 1 n S1 a i b i 1 S c i a i 1 2 2 end do S1 1 S2 do i 0 n S1 if i 0 then a i b i 1 S if i n then c i 1 a i 2 2 end do David A Padua 1998 S1 0 S2 18 Sometimes we have to replicate to achieve alignment Example 2 do i 1 n a i b i c i d i a i a i 1 end do do i 1 n a i b i c i a1 i b i c i d i a1 i a i 1 end do do i 0 n if i 0 then a i b i c i if i n then a1 i 1 b i 1 c i 1 d i 1 a1 i 1 a i end do David A Padua 1998 19 Need for replication could propagate Example 3 do i 1 n c i 2 f i a i c i c i 1 d i a i a i 1 end do do i 1 n …
View Full Document
Unlocking...