Parallel Program Issues Dependency Analysis Types of dependency Dependency Graphs Bernstein s Conditions of Parallelism Asymptotic Notations for Algorithm Complexity Analysis Parallel Random Access Machine PRAM Example sum algorithm on P processor PRAM Network Model of Message Passing Multicomputers Example Asynchronous Matrix Vector Product on a Ring Levels of Parallelism in Program Execution Hardware Vs Software Parallelism Parallel Task Grain Size Software Parallelism Types Data Vs Functional Parallelism Example Motivating Problem With high levels of concurrency Limited Parallel Program Concurrency Amdahl s Law Parallel Performance Metrics Degree of Parallelism DOP Concurrency Profile Steps in Creating a Parallel Program Decomposition Assignment Orchestration Mapping Program Partitioning Example handout Static Multiprocessor Scheduling Example handout PCA Chapter 2 1 2 2 EECC756 Shaaban 1 lec 3 Spring2008 3 20 2008 Parallel Programs Definitions A parallel program is comprised of a number of tasks running as threads or processes on a number of processing elements that cooperate communicate as Communication part of a single parallel computation Parallel Execution Time The processor with max execution time Computation Task determines parallel execution time Other Parallelization Arbitrary piece of undecomposed work in parallel computation Executed sequentially on a single processor concurrency in parallel computation is only across tasks Overheads Parallel or Independent Tasks Tasks that with no dependencies among them and thus can run in parallel on different processing elements Parallel Task Grain Size The amount of computations in a task Process thread Abstract entity that performs the computations assigned to a task Processes communicate and synchronize to perform their tasks Processor or Processing Element Physical computing engine on which a process executes sequentially Processes virtualize machine to programmer First write program in terms of processes then map to processors Communication to Computation Ratio C to C Ratio Represents the amount of resulting communication between tasks of a parallel program In general for a parallel computation a lower C to C ratio is desirable and usually indicates better parallel performance EECC756 Shaaban 2 lec 3 Spring2008 3 20 2008 Dependency Analysis Conditions of Parallelism Dependency analysis is concerned with detecting the presence and type of dependency between tasks that prevent tasks from being independent and from running in parallel on different processors and can be applied to tasks of any grain size Down to task instruction Represented graphically as task dependency graphs Dependencies between tasks can be 1 algorithm program related or 2 hardware resource architecture related 1 Algorithm program Task Dependencies Data Dependence True Data or Flow Dependence Name Dependence Anti dependence Output or write dependence Control Dependence 2 Hardware Architecture Resource Dependence A task only executes on one processor to which it has been mapped or allocated EECC756 Shaaban 3 lec 3 Spring2008 3 20 2008 Conditions of Parallelism Data Name Dependence S1 S2 Program Order 1 Assume task S2 follows task S1 in sequential program order True Data or Flow Dependence Task S2 is data dependent on task S1 if an execution path exists from S1 to S2 and if at least one output variable of S1 feeds in as an input operand used by S2 Represented by S1 S2 in task dependency graphs 2 Anti dependence Task S2 is antidependent on S1 if S2 follows S1 in program order and if the output of S2 overlaps the input of S1 Name Represented by S1 S2 in dependency graphs Dependencies 3 Output dependence Two tasks S1 S2 are output dependent if they produce the same output variable Represented by S1 S2 in task dependency graphs EECC756 Shaaban 4 lec 3 Spring2008 3 20 2008 True Data or Flow Dependence Assume task S2 follows task S1 in sequential program order Task S1 produces one or more results used by task S2 S1 S2 Then task S2 is said to be data dependent on task S1 Changing the relative execution order of tasks S1 S2 in the parallel program violates this data dependence and results in incorrect execution Task Dependency Graph Representation S1 Write i e Producer S1 Shared Operands S2 Read S1 S2 Data Dependence i e Consumer S2 Program Order Task S2 is data dependent on task S1 EECC756 Shaaban 5 lec 3 Spring2008 3 20 2008 Name Dependence Classification Anti Dependence Assume task S2 follows task S1 in sequential program order Task S1 reads one or more values from one or more names registers or memory locations Instruction S2 writes one or more values to the same names same registers or memory locations read by S1 S1 S2 Then task S2 is said to be anti dependent on task S1 Changing the relative execution order of tasks S1 S2 in the parallel program violates this name dependence and may result in incorrect execution Task Dependency Graph Representation S1 Read S1 Shared Names S1 S2 S2 Write Anti dependence S2 Program Order Task S2 is anti dependent on task S1 Name Register or Memory Location EECC756 Shaaban 6 lec 3 Spring2008 3 20 2008 Name Dependence Classification Output or Write Dependence Assume task S2 follows task S1 in sequential program order Both tasks S1 S2 write to the same a name or names same registers or memory locations S1 S2 Then task S2 is said to be output dependent on task S1 Changing the relative execution order of tasks S1 S2 in the parallel program violates this name dependence and may result in incorrect execution Task Dependency Graph Representation S1 Write I Shared Names S2 Write S1 S2 Output dependence J Program Order Task S2 is output dependent on task S1 Name Register or Memory Location EECC756 Shaaban 7 lec 3 Spring2008 3 20 2008 Dependency Graph Example Here assume each instruction is treated as a task S1 S2 S3 S4 Load R1 A Add R2 R1 Move R1 R3 Store B R1 R1 Memory A R2 R1 R2 R1 R3 Memory B R1 True Date Dependence S1 S2 S3 S4 S1 Output Dependence S1 S3 Anti dependence S2 S3 S4 S2 Dependency graph S3 EECC756 Shaaban 8 lec 3 Spring2008 3 20 2008 Dependency Graph Example Here assume each instruction is treated as a task MIPS Code Task Dependency graph 1 ADD D F2 F1 F0 2 3 ADD D F4 F2 F3 ADD D F2 F2 F4 1 2 3 4 ADD D ADD D ADD D ADD D F2 F1 F0 F4 F2 F3 F2 F2 F4 F4 F2 F6 True Date Dependence 1 2 1 3 2 3 3 4 Output Dependence 1 3 2 4 Anti dependence 4 2 3 3 4 ADD D F4 F2 F6 EECC756 Shaaban 9 lec 3 Spring2008 3 20 2008 Dependency Graph
View Full Document
Unlocking...