COSC 6374 Parallel Computation Parallel Design Patterns I Edgar Gabriel Fall 2010 Edgar Gabriel Design patterns A design pattern is a way of reusing abstract knowledge about a problem and its solution A pattern is a description of the problem and the essence of its solution Patterns are devices that allow programs to share knowledge about their design Documenting patterns is one way to reuse and share the information about how it is best to solve a specific design problem A pattern should be sufficiently abstract to be reused in different settings Patterns often rely on object characteristics such as inheritance and polymorphism COSC 6374 Parallel Computation Edgar Gabriel 1 History of Design Patterns Architect Christopher Alexander A Pattern Language 1977 A Timeless Way of Building 1979 Gang of four Erich Gamma Richard Helm Ralph Johnson John Vlissides Design Patterns Elements of Reusable ObjectOriented Software 1995 Many since Conferences symposia books COSC 6374 Parallel Computation Edgar Gabriel Pattern elements Name Problem description Solution description A meaningful pattern identifier Not a concrete design but a template for a design solution that can be instantiated in different ways Consequences The results and trade offs of applying the pattern COSC 6374 Parallel Computation Edgar Gabriel 2 Abstraction Levels Patterns exist on different abstraction levels basic building blocks algorithms and components provided by language or by libraries e g hash tables linked lists sort algorithms math e g inheritance and polymorphism encapsulation design patterns general design problems in particular context concerning several classes e g GOF design patterns architecture patterns architecture decisions concerning the whole system or subsystem e g client server data centric etc COSC 6374 Parallel Computation Edgar Gabriel Terminology I Task Sequence of instructions operating as a group A logical part of an algorithm program Example multiplication of two matrixes A task could be Inner products between rows and columns of the matrices Individual iterations of the loops involved in the matrix multiplication Usually a problem can be decomposed in several different ways leading to very different tasks COSC 6374 Parallel Computation Edgar Gabriel 3 Terminology II Unit of execution UE Generic abstraction for threads and processes Process collection of resources that enable the execution of program instructions Thread shares the process s environment Considered lightweight compared to a process since switching between several threads is significantly cheaper than switching between several processes Processing element PE Generic term for a hardware element that executes a stream of instructions COSC 6374 Parallel Computation Edgar Gabriel Terminology III Load balance How well work is distributed among different PEs Goal all PEs should spend the same time with a certain task If all PEs are equal homogeneous platform load balance can often be achieved by assigning the same number of data elements to each PE If PEs are different heterogeneous platform E g some processors are faster than others For a balanced execution the faster processes need more data elements than the slower ones COSC 6374 Parallel Computation Edgar Gabriel 4 Terminology IV Synchronization Enforces ordering constraints E g for non deterministic task sequences E g for concurrent access to the same variable on a shared memory machine If two events must happen at the same time they are synchronous COSC 6374 Parallel Computation Edgar Gabriel Parallelization Strategy Finding Concurrency Algorithm Structure Supporting Structure Implementation Mechanism Structure the problem to expose exploitable concurrency Structure the algorithm to take advantage of concurrency Intermediate stage between Algorithm Structure and Implementation program structuring definition of shared data structures Mapping of the higher level patterns onto a programming environment COSC 6374 Parallel Computation Edgar Gabriel 5 Finding concurrency Overview Decomposition Dependency Analysis Task Decomposition Group Tasks Data Decomposition Order Tasks Design Evaluation Data Sharing COSC 6374 Parallel Computation Edgar Gabriel Overview II Is the problem large enough to justify the efforts for parallelizing it Are the key features of the problem and the data elements within the problem well understood Which parts of the problem are most computationally intensive COSC 6374 Parallel Computation Edgar Gabriel 6 Task decomposition How can a problem be decomposed into tasks that can execute concurrently Goal a collection of nearly independent tasks Initially try to find as many as tasks as possible can be merged later to form larger tasks A task can correspond to A function call Distinct iterations of loops within the algorithm loop splitting Any independent sections in the source code COSC 6374 Parallel Computation Edgar Gabriel Task decomposition goals and constraints Flexibility the design should be flexible enough to be able to handle any numbers of processes E g the number and the size of each task should be a parameter for the task decomposition Efficiency each task should include enough work to compensate for the overhead of generating tasks and managing their dependencies Simplicity tasks should be defined such that it keeps debugging and maintenance easy COSC 6374 Parallel Computation Edgar Gabriel 7 Task decomposition Two tasks A and B are considered independent if I A OB I B OA O A OB COSC 6374 Parallel Computation Edgar Gabriel Task decomposition example Solving a system of linear equations using the Conjugant Gradient Method Given x0 A b p0 r0 b Ax0 For i 0 1 2 rTipi Api Tpi xi 1 xi pi ri 1 ri Api pi 1 ri 1 Ari 1 Tpi Can be executed in parallel Can be executed in parallel COSC 6374 Parallel Computation Edgar Gabriel 8 Data decomposition How can a problem s data be decomposed into units that can be operated relatively independently Most computationally intensive parts of a problem are dealing with large data structures Common mechanisms Array based decomposition e g see example on the next slides Recursive data structures e g decomposing the parallel update of a large tree data structure COSC 6374 Parallel Computation Edgar Gabriel Data decomposition goals and constraints Flexibility size and number of data chunks should be flexible granularity Efficiency Data chunks have to be large enough that the amount of work with the data chunk compensates for
View Full Document
Unlocking...