DOC PREVIEW
UT CS 395T - Introduction to parallelism in irregular algorithms

This preview shows page 1 out of 3 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 3 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

11Keshav PingaliUniversity of Texas, AustinIntroduction to parallelism in irregular algorithms2ExamplesSurvey propagation, Bayesian inferenceSAT solversChandy-Misra-Bryant, Jefferson TimewarpEvent-driven simulationMessage-passing algorithmsAISparse MVM, sparse CholeskyfactorizationSparse linear solversPrim, Kruskal, BoruvkaMinimal spanning treesPreflow-push, augmenting pathsMaxflowGraph reduction, static and dynamic dataflowFunctional interpretersIterative and elimination-based dataflow algorithmsCompilersGeneration/refinement/partitioningMeshingAlgorithmApplication/domain3Delaunay Mesh Refinement• Iterative refinement to remove badly shaped triangles:while there are bad triangles do {Pick a bad triangle;Find its cavity;Retriangulate cavity; // may create new bad triangles}• Don’t-care non-determinism:– final mesh depends on order in which bad triangles are processed– applications do not care which mesh is produced• Data structure: – graph in which nodes represent triangles and edges represent triangle adjacencies• Parallelism: – bad triangles with cavities that do not overlap can be processed in parallel– parallelism is dependent on runtime values• compilers cannot find this parallelism – (Miller et al) at runtime, repeatedly build interference graph and find maximal independent sets for parallel execution4Event-driven simulation• Stations communicate by sending messages with time-stamps on FIFO channels• Stations have internal state that is updated when a message is processed• Messages must be processed in time-order at each station• Data structure:– Messages in event-queue, sorted in time-order• Parallelism: – Jefferson time-warp• station can fire when it has an incoming message on any edge• requires roll-back if speculative conflict is detected– Chandy-Misra-Bryant• station fires when it has messages on all incoming edges and processes earliest message• requires null messages to avoid deadlock25AB34C625Remarks on algorithms• Diverse algorithms and data structures• Exploiting parallelism in irregular algorithms is very complex– Miller et al DMR implementation: interference graph + maximal independent sets– Jefferson Timewarp algorithm for event-driven simulation• Algorithms:– parallelism can be dependent on runtime values• DMR, event-driven simulation,…– don’t-care non-determinism• nothing to do with concurrency•DMR– activities created in the future may interfere with current activities• event-driven simulation…• Data structures:– graphs, trees, lists, priority queues,…6i1i2i3i4i5Operator formulation of algorithms• Algorithm = repeated application of operator to graph– active element: • node or edge where computation is needed– DMR: nodes representing bad triangles– Event-driven simulation: station with incoming message– Jacobi: interior nodes of mesh– neighborhood:• set of nodes and edges read/written to perform computation– DMR: cavity of bad triangle– Event-driven simulation: station– Jacobi: nodes in stencil• distinct usually from neighbors in graph– ordering: • order in which active elements must be executed in a sequential implementation– any order (Jacobi,DMR, graph reduction)– some problem-dependent order (event-driven simulation): active node: neighborhood7Parallelism• Amorphous data-parallelism– active nodes can be processed in parallel, subject to• neighborhood constraints• ordering constraints• Computations at two active elements are independent if– Neighborhoods do not overlap– More generally, neither of them writes to an element in the intersection of the neighborhoods• Unordered active elements– Independent active elements can be processed in parallel– How do we find independent active elements?• Ordered active elements– Independence is not enough – How do we determine what is safe to execute w/o violating ordering?i1i2i3i4i525ABC8Galois programming model (PLDI 2007)• Program written in terms of abstractions in model• Programming model: sequential, OO• Graph class: provided by Galois library– specialized versions to exploit structure (see later)• Galois set iterators: for iterating over unordered and ordered sets of active elements– for each e in Set S do B(e)• evaluate B(e) for each element in set S• no a priori order on iterations• set S may get new elements during execution– for each e in OrderedSet S do B(e)• evaluate B(e) for each element in set S• perform iterations in order specified by OrderedSet• set S may get new elements during executionMesh m = /* read in mesh */Set ws;ws.add(m.badTriangles()); // initialize wsfor each tr in Set ws do { //unordered Set iteratorif (tr no longer in mesh) continue;Cavity c = new Cavity(tr);c.expand();c.retriangulate();m.update(c);ws.add(c.badTriangles()); //bad triangles }DMR using Galois iterators39iterativealgorithmstopologyoperatororderingmorph: modifies structure of graphlocal computation: only updates values on nodes/edgesreader: does not modify graph in any waygeneral graphgridtreeunorderedorderedAlgorithm abstractionsJacobi: topology: grid, operator: local computation, ordering: unorderedDMR: topology: graph, operator: morph, ordering: unorderedEvent-driven simulation: topology: graph, operator: local computation, ordering:


View Full Document

UT CS 395T - Introduction to parallelism in irregular algorithms

Documents in this Course
TERRA

TERRA

23 pages

OpenCL

OpenCL

15 pages

Byzantine

Byzantine

32 pages

Load more
Download Introduction to parallelism in irregular algorithms
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Introduction to parallelism in irregular algorithms and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Introduction to parallelism in irregular algorithms 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?