DOC PREVIEW
Berkeley COMPSCI C267 - Lecture Notes

This preview shows page 1-2-3-4-30-31-32-33-34-61-62-63-64 out of 64 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 64 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

04/4/2007 CS267 Lecture 21CS 267 Applications of Parallel ComputersHierarchical Methods for the N-Body problemKathy Yelickwww.cs.berkeley.edu/~yelick/cs267_s0704/4/2007 CS267 Lecture 21Big Idea° Suppose the answer at each point depends on data at all the other points• Electrostatic, gravitational force• Solution of elliptic PDEs• Graph partitioning° Seems to require at least O(n2) work, communication° If the dependence on “distant” data can be compressed• Because it gets smaller, smoother, simpler…° Then by compressing data of groups of nearby points, can cut cost (work, communication) at distant points• Apply idea recursively: cost drops to O(n log n) or even O(n)° Examples:• Barnes-Hut or Fast Multipole Method (FMM) for electrostatics/gravity/…• Multigrid for elliptic PDE• Multilevel graph partitioning (METIS, Chaco,…)04/4/2007 CS267 Lecture 21Outline° Motivation• Obvious algorithm for computing gravitational or electrostatic force on N bodies takes O(N2) work° How to reduce the number of particles in the force sum• We must settle for an approximate answer (say 2 decimal digits, or perhaps 16 …)° Basic Data Structures: Quad Trees and Oct Trees° The Barnes-Hut Algorithm (BH)• An O(N log N) approximate algorithm for the N-Body problem° The Fast Multipole Method (FMM)• An O(N) approximate algorithm for the N-Body problem° Parallelizing BH, FMM and related algorithms04/4/2007 CS267 Lecture 21Particle Simulation° f(i) = external_force + nearest_neighbor_force + N-Body_force• External_force is usually embarrassingly parallel and costs O(N) for all particles- external current in Sharks and Fish• Nearest_neighbor_force requires interacting with a few neighbors, so still O(N)- van der Waals, bouncing balls• N-Body_force (gravity or electrostatics) requires all-to-all interactions- f(i) = Σ f(i,k) … f(i,k) = force on i from k- f(i,k) = c*v/||v||3 in 3 dimensions or f(i,k) = c*v/||v||2 in 2 dimensions– v = vector from particle i to particle k , c = product of masses or charges– ||v|| = length of v - Obvious algorithm costs O(N2), but we can do better...t = 0while t < t_finalfor i = 1 to n … n = number of particlescompute f(i) = force on particle ifor i = 1 to nmove particle i under force f(i) for time dt … using F=macompute interesting properties of particles (energy, etc.)t = t + dtend whilek != i04/4/2007 CS267 Lecture 21Applications° Astrophysics and Celestial Mechanics• Intel Delta = 1992 supercomputer, 512 Intel i860s• 17 million particles, 600 time steps, 24 hours elapsed time– M. Warren and J. Salmon– Gordon Bell Prize at Supercomputing 92• Sustained 5.2 Gflops = 44K Flops/particle/time step• 1% accuracy• Direct method (17 Flops/particle/time step) at 5.2 Gflops would have taken 18 years, 6570 times longer° Plasma Simulation° Molecular Dynamics° Electron-Beam Lithography Device Simulation° Fluid Dynamics (vortex method)° Good sequential algorithms too!04/4/2007 CS267 Lecture 21Reducing the number of particles in the force sum° All later divide and conquer algorithms use same intuition° Consider computing force on earth due to all celestial bodies• Look at night sky, # terms in force sum >= number of visible stars• Oops! One “star” is really the Andromeda galaxy, which contains billions of real stars- Seems like a lot more work than we thought …° Don’t worry, ok to approximate all stars in Andromeda by a single point at its center of mass (CM) with same total mass• D = size of box containing Andromeda , r = distance of CM to Earth• Require that D/r be “small enough”• Idea not new: Newton approximated earth and falling apple by CMs04/4/2007 CS267 Lecture 21What is new: Using points at CM recursively° From Andromeda’s point of view, Milky Way is also a point mass° Within Andromeda, picture repeats itself• As long as D1/r1 is small enough, stars inside smaller box can be replaced by their CM to compute the force on Vulcan• Boxes nest in boxes recursively04/4/2007 CS267 Lecture 21Outline° Motivation• Obvious algorithm for computing gravitational or electrostatic force on N bodies takes O(N2) work° How to reduce the number of particles in the force sum• We must settle for an approximate answer (say 2 decimal digits, or perhaps 16 …)° Basic Data Structures: Quad Trees and Oct Trees° The Barnes-Hut Algorithm (BH)• An O(N log N) approximate algorithm for the N-Body problem° The Fast Multipole Method (FMM)• An O(N) approximate algorithm for the N-Body problem° Parallelizing BH, FMM and related algorithms04/4/2007 CS267 Lecture 21Quad Trees° Data structure to subdivide the plane• Nodes can contain coordinates of center of box, side length• Eventually also coordinates of CM, total mass, etc.° In a complete quad tree, each nonleaf node has 4 children04/4/2007 CS267 Lecture 21Oct Trees° Similar Data Structure to subdivide space04/4/2007 CS267 Lecture 21Using Quad Trees and Oct Trees° All our algorithms begin by constructing a tree to hold all the particles° Interesting cases have nonuniformly distributed particles• In a complete tree most nodes would be empty, a waste of space and time° Adaptive Quad (Oct) Tree only subdivides space where particles are located04/4/2007 CS267 Lecture 21Example of an Adaptive Quad TreeChild nodes enumerated counterclockwisefrom SW corner, empty ones excluded04/4/2007 CS267 Lecture 21Adaptive Quad Tree Algorithm (Oct Tree analogous)Procedure Quad_Tree_Build Quad_Tree = {empty}for j = 1 to N … loop over all N particlesQuad_Tree_Insert(j, root) … insert particle j in QuadTreeendfor… At this point, each leaf of Quad_Tree will have 0 or 1 particles … There will be 0 particles when some sibling has 1Traverse the Quad_Tree eliminating empty leaves … via, say Breadth First SearchProcedure Quad_Tree_Insert(j, n) … Try to insert particle j at node n in Quad_Treeif n an internal node … n has 4 childrendetermine which child c of node n contains particle jQuad_Tree_Insert(j, c)else if n contains 1 particle … n is a leafadd n’s 4 children to the Quad_Treemove the particle already in n into the child containing itlet c be the child of n containing jQuad_Tree_Insert(j, c)else … n empty store particle j in node nend04/4/2007 CS267 Lecture 21Cost of Adaptive Quad Tree Constrution° Cost <= N * maximum


View Full Document

Berkeley COMPSCI C267 - Lecture Notes

Documents in this Course
Lecture 4

Lecture 4

52 pages

Split-C

Split-C

5 pages

Lecture 5

Lecture 5

40 pages

Load more
Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?