
COSC 6374 Parallel Computation
Message Passing Interface (MPI) - III
Collective Operations
Edgar Gabriel
Fall 2011


What you've learned so far

- Six MPI functions are sufficient for programming a distributed memory machine:

    MPI_Init(int *argc, char ***argv);
    MPI_Finalize();
    MPI_Comm_rank(MPI_Comm comm, int *rank);
    MPI_Comm_size(MPI_Comm comm, int *size);
    MPI_Send(void *buf, int count, MPI_Datatype dat, int dest, int tag, MPI_Comm comm);
    MPI_Recv(void *buf, int count, MPI_Datatype dat, int source, int tag, MPI_Comm comm, MPI_Status *status);


So why not stop here?

- Performance
  - need functions which can fully exploit the capabilities of the hardware
  - need functions to abstract typical communication patterns
- Usability
  - need functions to simplify often recurring tasks
  - need functions to simplify the management of parallel applications


So why not stop here?

- Performance
  - asynchronous point-to-point operations
  - one-sided operations
  - collective operations
  - derived data types
  - parallel I/O
  - hints
- Usability
  - process grouping functions
  - environmental and process management
  - error handling
  - object attributes
  - language bindings


Collective operation

- All processes of a process group have to participate in the same operation
  - the process group is defined by a communicator
  - all processes have to provide matching arguments (for example, the same root and communicator)
  - for each communicator, you can have one collective operation ongoing at a time
- Collective operations are abstractions for often occurring communication patterns
  - eases programming
  - enables low-level optimizations and adaptations to the hardware infrastructure


MPI collective operations

- MPI_Barrier
- MPI_Bcast
- MPI_Scatter
- MPI_Scatterv
- MPI_Gather
- MPI_Gatherv
- MPI_Allgather
- MPI_Allgatherv
- MPI_Alltoall
- MPI_Alltoallv
- MPI_Reduce
- MPI_Allreduce
- MPI_Reduce_scatter
- MPI_Scan
- MPI_Exscan
- MPI_Alltoallw


More MPI collective operations

- Creating and freeing a communicator is considered a collective operation
  - e.g. MPI_Comm_create
  - e.g. MPI_Comm_spawn
- Collective I/O operations
  - e.g. MPI_File_write_all
- Window synchronization calls are collective operations
  - e.g. MPI_Win_fence


MPI_Bcast

    MPI_Bcast(void *buf, int cnt, MPI_Datatype dat, int root, MPI_Comm comm);

- The process with the rank root distributes the data stored in buf to all other processes in the communicator comm
- Data in buf is identical on all processes after the bcast
- Compared to point-to-point operations: no tag, since you cannot have several ongoing collective operations


MPI_Bcast (II)

    MPI_Bcast(buf, 2, MPI_INT, 0, comm);

[Figure: buf on root is copied to rbuf on rank 0, rank 1, rank 2, rank 3, and rank 4]


Example: distributing global parameters

    int rank, problemsize;
    float precision;
    MPI_Comm comm = MPI_COMM_WORLD;

    MPI_Comm_rank(comm, &rank);
    if (rank == 0) {
        /* only rank 0 reads the parameters from the file */
        FILE *myfile;
        myfile = fopen("testfile.txt", "r");
        fscanf(myfile, "%d", &problemsize);
        fscanf(myfile, "%f", &precision);
        fclose(myfile);
    }
    /* every process, including rank 0, takes part in both broadcasts */
    MPI_Bcast(&problemsize, 1, MPI_INT, 0, comm);
    MPI_Bcast(&precision, 1, MPI_FLOAT, 0, comm);


MPI_Scatter

    MPI_Scatter(void *sbuf, int scnt, MPI_Datatype sdat, void *rbuf, int rcnt, MPI_Datatype rdat, int root, MPI_Comm comm);

- The process with the rank root distributes the data stored in sbuf to all other processes in the communicator comm
- Difference to Broadcast: every process gets a different segment of the original data at the root process
- Arguments sbuf, scnt, sdat are only relevant and have to be set at the root process


MPI_Scatter (II)

    MPI_Scatter(sbuf, 2, MPI_INT, rbuf, 2, MPI_INT, 0, comm);

[Figure: sbuf on root is split into segments that go to rbuf on rank 0, rank 1, rank 2, rank 3, and rank 4]


Example: partition a vector among processes

    int rank, size, root = 0;   /* root = 0 is an assumption; the slide leaves root undeclared */
    float *sbuf = NULL, rbuf[3];
    MPI_Comm comm = MPI_COMM_WORLD;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank == root) {
        sbuf = malloc(3 * size * sizeof(float));
        /* set sbuf to required values etc. */
    }
    /* distribute the vector, 3 elements for each process */
    MPI_Scatter(sbuf, 3, MPI_FLOAT, rbuf, 3, MPI_FLOAT, root, comm);
    if (rank == root) {
        free(sbuf);
    }


MPI_Gather

    MPI_Gather(void *sbuf, int scnt, MPI_Datatype sdat, void *rbuf, int rcnt, MPI_Datatype rdat, int root, MPI_Comm comm);

- Reverse operation of MPI_Scatter
- The process with the rank root receives the data stored in sbuf on all other processes in the communicator comm into the rbuf
- Arguments rbuf, rcnt, rdat are only relevant and have to be set at the root process


MPI_Gather (II)

    MPI_Gather(sbuf, 2, MPI_INT, rbuf, 2, MPI_INT, 0, comm);

[Figure: sbuf on rank 0, rank 1, rank 2, rank 3, and rank 4 is collected into rbuf on root]


MPI_Allgather

    MPI_Allgather(void *sbuf, int scnt, MPI_Datatype sdat, void *rbuf, int rcnt, MPI_Datatype rdat, MPI_Comm comm);

- Identical to MPI_Gather, except that all processes have the final result

[Figure: sbuf on rank 0, rank 1, and rank 2 is collected into rbuf on rank 0, rank 1, and rank 2]


Example: matrix-vector multiplication with row-wise block distribution

    int main(int argc, char **argv)
    {
        /* fragment as on the slide: A and b are assumed to be initialized,
           c is assumed to start at zero, MPI_Init/MPI_Finalize are not shown */
        double A[nlocal][n], b[n];
        double c[nlocal], cglobal[n];
        int i, j;

        for (i = 0; i < nlocal; i++) {
            for (j = 0; j < n; j++) {
                c[i] = c[i] + A[i][j] * b[j];
            }
        }
        /* each process holds the final result for its part of c */
        MPI_Allgather(c, nlocal, MPI_DOUBLE, cglobal, nlocal, MPI_DOUBLE, MPI_COMM_WORLD);
        return 0;
    }


Reduction operations

    MPI_Reduce(void *inbuf, void *outbuf, int cnt, MPI_Datatype dat, MPI_Op op, int root, MPI_Comm comm);
    MPI_Allreduce(void *inbuf, void *outbuf, int cnt, MPI_Datatype dat, MPI_Op op, MPI_Comm comm);

- Perform simple calculations (e.g. calculate the sum or the product) over all processes in the communicator
- MPI_Reduce
  - outbuf is passed as an argument by all processes, but is only significant at root
  - result is only available at root
- MPI_Allreduce
  - result is available on all processes


Predefined reduction operations

    MPI_SUM       sum
    MPI_PROD      product
    MPI_MIN       minimum
    MPI_MAX       maximum
    MPI_LAND      logical and
    MPI_LOR       logical or
    MPI_LXOR      logical exclusive or
    MPI_BAND      binary and
    MPI_BOR       binary or
    MPI_BXOR      binary exclusive or
    MPI_MAXLOC    maximum value and location
    MPI_MINLOC    minimum value and location


Reduction operations on vectors

- Reduce operation is executed element-wise on each entry of the array

    Rank 0 inbuf:     1    2    3    4    5
    Rank 1 inbuf:     2    3    4    5    6
    Rank 2 inbuf:     3    4    5    6    7
    Rank 3 inbuf:     4    5    6    7    8
    Rank 0 outbuf:   10   14   18   22   26

- Reduction of 5 elements with root = 0:

    MPI_Reduce(inbuf, outbuf, 5, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);


Example: scalar product of two vectors

    Process with rank 0 holds a(0 ... N/2-1) and b(0 ... N/2-1)
    Process with rank 1 holds a(N/2 ... N-1) and b(N/2 ... N-1)

    int main(int argc, char **argv)
    {
        int i, rank, size;
        double a_local[N/2];
        double b_local[N/2];
        double s_local, s;
        …
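
The element-wise behaviour described under "Reduction operations on vectors" can be checked without an MPI installation. The following is a minimal serial sketch, not an MPI call and not code from the slides: the function name reduce_sum_model and the flat inbufs layout (the inbuf of every rank stored back to back) are assumptions made purely for illustration. It models what rank 0 would find in outbuf after a sum reduction over int data.

```c
#include <stddef.h>

/*
 * Serial reference model (NOT an MPI call) of the result of
 *   MPI_Reduce(inbuf, outbuf, cnt, MPI_INT, MPI_SUM, root, comm)
 * as seen on the root process.
 *
 * Hypothetical layout: inbufs holds the cnt-element inbuf of each of the
 * nprocs ranks back to back, so rank r occupies
 * inbufs[r * cnt] .. inbufs[r * cnt + cnt - 1].
 */
void reduce_sum_model(const int *inbufs, int nprocs, int cnt, int *outbuf)
{
    for (int i = 0; i < cnt; i++) {
        int sum = 0;
        /* entry i of the result combines entry i of every rank's inbuf */
        for (int r = 0; r < nprocs; r++) {
            sum += inbufs[(size_t)r * cnt + i];
        }
        outbuf[i] = sum;
    }
}
```

Called with nprocs = 4, cnt = 5 and the input rows 1 2 3 4 5, 2 3 4 5 6, 3 4 5 6 7 and 4 5 6 7 8, the model fills outbuf with 10 14 18 22 26. Entry i of the result depends only on entry i of each input, which is why a single call can reduce a whole array at once.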


UH COSC 6374 - Message Passing Interface (MPI) – III Collective Operations
