DOC PREVIEW
Berkeley COMPSCI C267 - Sources of Parallelism and Locality

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 267 Applications of Parallel Computers Lecture 12: Sources of Parallelism and Locality (Part 3) Tricks with TreesRecap of last lectureOutlineSlide 4Poisson’s equation in 1DPoisson’s equation in 2DAlgorithms for 2D Poisson Equation with N unknownsRelation of Poisson’s equation to Gravity, ElectrostaticsComments on practical meshesComposite mesh from a mechanical structureConverting the mesh to a matrixEffects of Ordering Rows and Columns on Gaussian EliminationIrregular mesh: NASA Airfoil in 2D (direct solution)Irregular mesh: Tapered Tube (multigrid)Adaptive Mesh Refinement (AMR)Challenges of irregular meshes (and a few solutions)Tricks with TreesSlide 18A log n lower bound to compute any function of n variablesBroadcasts and Reductions on TreesParallel Prefix, or ScanMapping Parallel Prefix onto a Tree - DetailsAdding two n-bit integers in O(log n) timeMultiplying n-by-n matrices in O(log n) timeInverting triangular n-by-n matrices in O(log2 n) timeInverting Dense n-by-n matrices in O(log n) timeEvaluating arbitrary expressionsEvaluating recurrencesSummary of tree algorithmsCS267 L12 Sources of Parallelism(3).1Demmel Sp 1999CS 267 Applications of Parallel ComputersLecture 12: Sources of Parallelism and Locality(Part 3)Tricks with TreesJames Demmelhttp://www.cs.berkeley.edu/~demmel/cs267_Spr99CS267 L12 Sources of Parallelism(3).2Demmel Sp 1999Recap of last lecture°ODEs•Sparse Matrix-vector multiplication•Graph partitioning to balance load and minimize communication°PDEs•Heat Equation and Poisson Equation•Solving a certain special linear system T•Many algorithms, ranging from-Dense Gaussian elimination, slow but very general, to-Multigrid, fast but only works on matrices like TCS267 L12 Sources of Parallelism(3).3Demmel Sp 1999Outline°Continuation of PDEs•What do realistic meshes look like?°Tricks with TreesCS267 L12 Sources of Parallelism(3).4Demmel Sp 1999Partial Differential EquationsPDEsCS267 L12 Sources of Parallelism(3).5Demmel Sp 1999Poisson’s equation in 1D°Solve Tx=b where2 -1 -1 2 -1 -1 2 -1 -1 2 -1 -1 2T =2-1 -1Graph and “stencil”CS267 L12 Sources of Parallelism(3).6Demmel Sp 1999Poisson’s equation in 2D°Solve Tx=b where°3D is analogous4 -1 -1-1 4 -1 -1 -1 4 -1 -1 4 -1 -1 -1 -1 4 -1 -1 -1 -1 4 -1 -1 4 -1 -1 -1 4 -1 -1 -1 4T =4-1-1-1-1Graph and “stencil”CS267 L12 Sources of Parallelism(3).7Demmel Sp 1999Algorithms for 2D Poisson Equation with N unknownsAlgorithm Serial PRAM Memory #Procs°Dense LU N3N N2N2°Band LU N2N N3/2N°Jacobi N2 N N N°Explicit Inv. N log N N N°Conj.Grad. N 3/2N 1/2 *log N N N°RB SOR N 3/2N 1/2N N°Sparse LU N 3/2N 1/2N*log N N°FFT N*log N log N N N°Multigrid N log2 N N N°Lower bound N log N NPRAM is an idealized parallel model with zero cost communication (see next slide for explanation)2 22CS267 L12 Sources of Parallelism(3).8Demmel Sp 1999Relation of Poisson’s equation to Gravity, Electrostatics°Force on particle at (x,y,z) due to particle at 0 is -(x,y,z)/r^3, where r = sqrt(x +y +z )°Force is also gradient of potential V = -1/r = -(d/dx V, d/dy V, d/dz V) = -grad V°V satisfies Poisson’s equation (try it!)2 2 2CS267 L12 Sources of Parallelism(3).9Demmel Sp 1999Comments on practical meshes°Regular 1D, 2D, 3D meshes•Important as building blocks for more complicated meshes°Practical meshes are often irregular•Composite meshes, consisting of multiple “bent” regular meshes joined at edges•Unstructured meshes, with arbitrary mesh points and connectivities•Adaptive meshes, which change resolution during solution process to put computational effort where neededCS267 L12 Sources of Parallelism(3).10Demmel Sp 1999Composite mesh from a mechanical structureCS267 L12 Sources of Parallelism(3).11Demmel Sp 1999Converting the mesh to a matrixCS267 L12 Sources of Parallelism(3).12Demmel Sp 1999Effects of Ordering Rows and Columns on Gaussian EliminationCS267 L12 Sources of Parallelism(3).13Demmel Sp 1999Irregular mesh: NASA Airfoil in 2D (direct solution)CS267 L12 Sources of Parallelism(3).14Demmel Sp 1999Irregular mesh: Tapered Tube (multigrid)CS267 L12 Sources of Parallelism(3).15Demmel Sp 1999Adaptive Mesh Refinement (AMR)°Adaptive mesh around an explosion°John Bell and Phil Colella at LBL (see class web page for URL)°Goal of Titanium is to make these algorithms easier to implementin parallelCS267 L12 Sources of Parallelism(3).16Demmel Sp 1999Challenges of irregular meshes (and a few solutions)°How to generate them in the first place•Triangle, a 2D mesh partitioner by Jonathan Shewchuk•3D harder!°How to partition them•ParMetis, a parallel graph partitioner°How to design iterative solvers•PETSc, a Portable Extensible Toolkit for Scientific Computing•Prometheus, a multigrid solver for finite element problems on irregular meshes•Titanium, a language to implement Adaptive Mesh Refinement°How to design direct solvers•SuperLU, parallel sparse Gaussian elimination°These are challenges to do sequentially, the more so in parallelCS267 L12 Sources of Parallelism(3).17Demmel Sp 1999Tricks with TreesCS267 L12 Sources of Parallelism(3).18Demmel Sp 1999Outline°A log n lower bound to compute any function in parallel°Reduction and broadcast in O(log n) time°Parallel prefix (scan) in O(log n) time°Adding two n-bit integers in O(log n) time°Multiplying n-by-n matrices in O(log n) time°Inverting n-by-n triangular matrices in O(log n) time°Inverting n-by-n dense matrices in O(log n) time°Evaluating arbitrary expressions in O(log n) time°Evaluating recurrences in O(log n) time°Solving n-by-n tridiagonal matrices in O(log n) time°Traversing linked lists °Computing minimal spanning trees°Computing convex hulls of point sets22CS267 L12 Sources of Parallelism(3).19Demmel Sp 1999A log n lower bound to compute any function of n variables°Assume we can only use binary operations, one per time unit°After 1 time unit, an output can only depend on two inputs°Use induction to show that after k time units, an output can only depend on 2k inputs°A binary tree performs such a computationCS267 L12 Sources of


View Full Document

Berkeley COMPSCI C267 - Sources of Parallelism and Locality

Documents in this Course
Lecture 4

Lecture 4

52 pages

Split-C

Split-C

5 pages

Lecture 5

Lecture 5

40 pages

Load more
Download Sources of Parallelism and Locality
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Sources of Parallelism and Locality and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Sources of Parallelism and Locality 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?