**Unformatted text preview:**

15-499: Parallel Algorithms Lecturer: Guy BlellochTopic: Sets and Sequences II Date: Thursday January 29, 2009Scribe: Michael Rule5.1 Efficient Function CompositionIn the previous lecture we saw how to perform a prefix-sum operation using contraction. This procedure canbe viewed as a special case of function composition. The general case seems inherently sequential, but cansometimes be converted to a parallel procedure if certain conditions are met.Since every iteration generates the state for the next iteration, this seem like an inherently linear algorithmthat is not trivial to parallelise like quicksort. Last class we talked about how prefix-sum could be parallelizedby the contraction method. We started to generalise this to arbitrary function composition where a givenelement :Xi= Fi(Fi−1(Fi−2(...F0(X0)...))) (5.1.1)That is, we first apply F0, then F1then F2, F3all the way up to Fi−1and finally FiWe could calculate each Xidirectly if we could generate this composition and apply it in parallel. In NESLthese would be something likeA = [I, F0◦ I, F1◦ F0◦ I, F2◦ F1◦ F0◦ I, ..., Fn−1◦ ... ◦ F 0] (5.1.2)f(x0) : f inA (5.1.3)Examining the case of prefix sum for array [3, 1, 7, 2, 6, . . . ], our functions Fiare :Fi= [+3, +1, +7, +2, +6, ...] (5.1.4)Composing Add is pretty trivialA = [I, +3, +3 + 1, +3 + 1 + 7, +3 + 1 + 7 + 2, +3 + 1 + 7 + 2 + 6, ...] (5.1.5)A = [I, +3, +4, +11, +13, +19, ...] (5.1.6)This is similar to the prefix sum described in the previous class. In this case function composition is trivial.In general it may not be this easy. An inefficient method of performing function composition would just beto store the composition to be evaluated later. In order for us to be able to efficiently solve this problem inparallel, we must have a way to efficiently do function composition. We present three types of techniquesfor efficiently composing functions :1• State Machine Transitions• Carry Propagation• Linear Recurrences5.1.1 State Machine TransitionsIf we are able to represent a seemingly linear sequence of operations as a series of transitions on a statemachine, we parallelize this linear procedure by composing the state machine transition functions.Example : Say we have a state machine with 3 states [ 1, 2, 3 ] and two transition functions [ a, b ]. :States : [1,2,3]Transitions :a : state → (state + 1) mod 3b : state → stateIt turns out that its easy to represent state machines as functions. We describe a state machine as a functionfrom input state to output state.Fa= 1 → 2, 2 → 3, 3 → 1 (5.1.7)Fb= 1 → 1, 2 → 2, 3 → 3 (5.1.8)Consider a sequence of transitions, a string in [a,b], applied to this state machine from some initial state. Wewant to find the resultant state. We could do this linearly using function composition, but it is possible tocompose the state transition functions efficiently.Some state transition function composition examples :Fa◦ Fa= 1 → 3, 2 → 1, 3 → 2 (5.1.9)Fa◦ (Fa◦ Fa) = Fb(5.1.10)Notice that the description of the state transition function is compact, and can be calculated quite easily. Theoutput is always some sort of permutation or map from states to states. If we have some constant number ofK states, we can compose two function in K steps. This can be done with table lookup very efficiently. Ifyou have a small number of possible transitions N, you can store a NxN table which stores the new transitionformed by compositing two transitions.You can apply the same contraction method, discussed two lecturesago, to compute all the state transition compositions.This procedure can be used to construct a parallel algorithm with linear work and log(n) depth which canfigure out the state after any sequence of transitions. Carry propagation is one application of this technique.25.1.2 Carry PropagationFor simplicity consider adding together two binary numbers. This procedure generalizes to any base, butbinary provides a more clear example.Consider adding two binary number A and B whereA = 1010011010 (5.1.11)B = 1101101000 (5.1.12)We break convention and write the number with the least significant byte first. This is so we can representthe algorithms as scanning form left to right, which is consistent with previous examples. A linear procedurefor adding these two number would produce output like this :Carry = 0100000100 (5.1.13)A = 1010011010 (5.1.14)B = 1101101000 (5.1.15)A + B = 0000001110 (5.1.16)Because of the dependence of higher order bits on carries from lower order bit, this seems to be an inherentlysequential algorithm. However, it can be represented as a very simple state machine :States ( 0 : no carry, 1 : carry )Transitionsa : clear carry (Ai= 0, Bi= 0) (5.1.17)b : preserve carry (Ai= 0, Bi= 1ORAi= 1, Bi= 0) (5.1.18)c : create carry (Ai= 1, Bi= 1) (5.1.19)statetransitions = cbbbbbcaba (5.1.20)A = 1010011010 (5.1.21)B = 1101101000 (5.1.22)State transition functions :Fa= 0 → 0, 1 → 0 (5.1.23)Fb= 0 → 0, 1 → 1 (5.1.24)Fc= 0 → 1, 1 → 1 (5.1.25)301acb,ca,bFigure 5.1.1: state machine for carry propagationThese can now be easily composed e.g.Fb◦ Fb◦ Fb◦ F b = Fb(5.1.26)Fb◦ Fb◦ Fa= Fa(5.1.27)Fb◦ Fb◦ Fc= Fc(5.1.28)Fc◦ Fb◦ Fb◦ Fb◦ Fa= Fc(5.1.29)We can therefore easily compose these things, and apply a prefix sum algorithm on the state transitionfunction array. This means we can do carry propagation in linear work and in parallel.W (n) = W (n/2) + O(n) = O(n) (5.1.30)Try to think about converting a linear procedure into a state machine. More generally, try to see if thefunctions in question can be composed efficiently. Linear Recurrences are an example of something that ishard to phrase as a state machine, but still has a efficient method of function composition which makes itparallelizable.5.1.3 Linear RecurrencesExample consider a loopfor i = 0 .. n-1Xi+1= ai+ bi· xiThis is an example of a linear recurrence. Since each Xi+1depends only on Xithis is a first order recurrence.our function :fi(x) = ai+ bi· x (5.1.31)4Efficient composition : composition itself is fast ( constant time ) complexity of resultant function does notgrow. Linear functions can be easily composed :f1= a1+ b1· x (5.1.32)f2= a2+ b2· x (5.1.33)f2◦ f1(x) = a2+ b2· (a1+ b1· x) (5.1.34)f2◦ f1(x) = a2+ b2· a1+ b2· b1· x (5.1.35)f2◦ f1(x) = (a2+ b2· a1) + b2· b1· x (5.1.36)f2◦ f1(x) = f0(x) = a0+ b0· x (5.1.37)a0= a2+ b2· a1(5.1.38)b0= b2· b1(5.1.39)So, in one + and two · we can

View Full Document