DOC PREVIEW
Berkeley COMPSCI C267 - Tricks with Trees

This preview shows page 1-2-3-4-5 out of 15 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 15 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 267 Tricks with TreesOutlineA log n lower bound to compute any function of n variablesBroadcasts and Reductions on TreesParallel Prefix, or ScanMapping Parallel Prefix onto a Tree - DetailsE.g., Fibonacci via Matrix Multiply PrefixAdding two n-bit integers in O(log n) timeOther applications of scansMultiplying n-by-n matrices in O(log n) timeInverting triangular n-by-n matrices in O(log2 n) timeInverting Dense n-by-n matrices in O(log n) timeEvaluating arbitrary expressionsEvaluating recurrencesSummary of tree algorithms02/07/06 CS267 Lecture 71CS 267Tricks with TreesJames Demmelwww.cs.berkeley.edu/~demmel/cs267_Spr0602/07/06 CS267 Lecture 72Outline°A log n lower bound to compute any function in parallel°Reduction and broadcast in O(log n) time°Parallel prefix (scan) in O(log n) time°Adding two n-bit integers in O(log n) time°Multiplying n-by-n matrices in O(log n) time°Inverting n-by-n triangular matrices in O(log n) time°Inverting n-by-n dense matrices in O(log n) time°Evaluating arbitrary expressions in O(log n) time°Evaluating recurrences in O(log n) time°Solving n-by-n tridiagonal matrices in O(log n) time°Traversing linked lists °Computing minimal spanning trees°Computing convex hulls of point sets2202/07/06 CS267 Lecture 73A log n lower bound to compute any function of n variables°Assume we can only use binary operations, one per time unit°After 1 time unit, an output can only depend on two inputs°Use induction to show that after k time units, an output can only depend on 2k inputs•After log2 n time units, output depends on at most n inputs°A binary tree performs such a computation02/07/06 CS267 Lecture 74Broadcasts and Reductions on Trees02/07/06 CS267 Lecture 75Parallel Prefix, or Scan°If “+” is an associative operator, and x[0],…,x[p-1] are input data then parallel prefix operation computes°Notation: j:k mean x[j]+x[j+1]+…+x[k], blue is final valuey[j] = x[0] + x[1] + … + x[j] for j=0,1,…,p-102/07/06 CS267 Lecture 76Mapping Parallel Prefix onto a Tree - Details°Up-the-tree phase (from leaves to root)°By induction, Lsave = sum of all leaves in left subtree°Down the tree phase (from root to leaves)°By induction, S = sum of all leaves to left of subtree rooted at the parent1) Get values L and R from left and right children2) Save L in a local register Lsave3) Pass sum L+R to parent1) Get value S from parent (the root gets 0)2) Send S to the left child3) Send S + Lsave to the right child02/07/06 CS267 Lecture 77E.g., Fibonacci via Matrix Multiply PrefixFn+1 = Fn + Fn-11-nnn1nFF 0111FFCan compute all Fn by matmul_prefix on [ , , , , , , , , ]then select the upper left entry 011101110111011101110111011101110111Slide source: Alan Edelman, MIT02/07/06 CS267 Lecture 78Adding two n-bit integers in O(log n) time°Let a = a[n-1]a[n-2]…a[0] and b = b[n-1]b[n-2]…b[0] be two n-bit binary numbers°We want their sum s = a+b = s[n]s[n-1]…s[0]°Challenge: compute all c[i] in O(log n) time via parallel prefix°Used in all computers to implement addition - Carry look-aheadc[-1] = 0 … rightmost carry bitfor i = 0 to n-1 c[i] = ( (a[i] xor b[i]) and c[i-1] ) or ( a[i] and b[i] ) ... next carry bit s[i] = ( a[i] xor b[i] ) xor c[i-1] for all (0 <= i <= n-1) p[i] = a[i] xor b[i] … propagate bit for all (0 <= i <= n-1) g[i] = a[i] and b[i] … generate bit c[i] = ( p[i] and c[i-1] ) or g[i] = p[i] g[i] * c[i-1] = C[i] * c[i-1] 1 1 0 1 1 1 … 2-by-2 Boolean matrix multiplication (associative) = C[i] * C[i-1] * … C[0] * 0 1 … evaluate each P[i] = C[i] * C[i-1] * … * C[0] by parallel prefix02/07/06 CS267 Lecture 79Other applications of scans°There are several applications of scans, some more obvious than others•add multi-precision numbers (represented as array of numbers)•evaluate recurrences, expressions •solve tridiagonal systems (numerically unstable!)•implement bucket sort and radix sort•to dynamically allocate processors•to search for regular expression (e.g., grep)°Names: +\ (APL), cumsum (Matlab), MPI_SCAN°Note: 2n operations used when only n-1 needed02/07/06 CS267 Lecture 710Multiplying n-by-n matrices in O(log n) time°For all (1 <= i,j,k <= n) P(i,j,k) = A(i,k) * B(k,j)•cost = 1 time unit, using n^3 processors°For all (1 <= I,j <= n) C(i,j) = P(i,j,k)•cost = O(log n) time, using a tree with n^3 / 2 processorsk =1n02/07/06 CS267 Lecture 711Inverting triangular n-by-n matrices in O(log2 n) time°Fact:°Function Tri_Inv(T) … assume n = dim(T) = 2m for simplicity°time(Tri_Inv(n)) = time(Tri_Inv(n/2)) + O(log(n))•Change variable to m = log n to get time(Tri_Inv(n)) = O(log2n)A 0C B-1= A 0-B CA B-1-1-1 -1If T is 1-by-1 return 1/Telse … Write T = A 0 C B In parallel do { invA = Tri_Inv(A) invB = Tri_Inv(B) } … implicitly uses a tree newC = -invB * C * invA Return invA 0 newC invB02/07/06 CS267 Lecture 712Inverting Dense n-by-n matrices in O(log n) time°Lemma 1: Cayley-Hamilton Theorem•expression for A-1 via characteristic polynomial in A°Lemma 2: Newton’s Identities•Triangular system of equations for coefficients of characteristic polynomial, matrix entries = sk°Lemma 3: sk = trace(Ak) =  Ak [i,i] = [i (A)]k°Csanky’s Algorithm (1976)°Completely numerically unstable2i=1ni=1n1) Compute the powers A2, A3, …,An-1 by parallel prefix cost = O(log2 n)2) Compute the traces sk = trace(Ak) cost = O(log n)3) Solve Newton identities for coefficients of characteristic polynomial cost = O(log2 n)4) Evaluate A-1 using Cayley-Hamilton Theorem cost = O(log n)02/07/06 CS267 Lecture 713Evaluating arbitrary expressions°Let E be an arbitrary expression formed from +, -, *, /, parentheses, and n variables, where each appearance of each variable is counted separately°Can think of E as arbitrary expression tree with n leaves (the variables) and


View Full Document

Berkeley COMPSCI C267 - Tricks with Trees

Documents in this Course
Lecture 4

Lecture 4

52 pages

Split-C

Split-C

5 pages

Lecture 5

Lecture 5

40 pages

Load more
Download Tricks with Trees
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Tricks with Trees and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Tricks with Trees 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?