Rice CAAM 471 - Distributed Computation of a Sparse Matrix Vector Product - D2001793

Home> Schools> Rice University> Business (CAAM) > CAAM 471> Distributed Computation of a Sparse Matrix Vector Product

Rice CAAM 471 - Distributed Computation of a Sparse Matrix Vector Product

Course Caam 471- Intro to Linear/integer Prog.

Pages 15

Download Save

Unformatted text preview:

Distributed Computation of a Sparse Matrix Vector ProductRecallThe Sparse Matrix-Vector ProductA Distributed Sparse MatrixSlide 5Divide The UnknownsDividing The Rows UpMatrix Vector ProductPractical DescriptionPractical Description contSlide 11Slide 12Slide 13DetailsLast StageDistributed Computation of a Sparse Matrix Vector ProductLecture 18MA/CS 471Fall 2003Recall•We wish to solve Ax=b for some sparse A•In each of the three iterative schemes we need to evaluate a matrix vector product of the form: Qv for some sparse matrix Q.•For the Jacobi iteration:111:1 if where is the Kronecker delta0 if j Nn ni i ij jjiiij ij ij ijijx b xi ji jdd=+=� �= -� �� = -=�=��QAQ A AThe Sparse Matrix-Vector Product1j Nnij jjx==�QFor the homework I strongly suggest building a distributed sparse matrix and associated functions..We will discuss possible approaches to this.A Distributed Sparse Matrix•Suppose we decided to assign a set of rows to each process:Proc. 0Proc. 1Proc. 2Proc. 3Rows of the matrix ownedby process 0.QA Distributed Sparse Matrix•Suppose we decided to assign a set of rows to each process:Proc. 0Proc. 1Proc. 2Proc. 3Sparse entries of the globalmatrix stored on proc 0Divide The Unknowns•Suppose we decided to assign a part of the vector to be multiplied to each processProc. 0Proc. 1Proc. 2Proc. 3Part of the vector sitting on proc. 0Part of the vector sitting on proc. 1Part of the vector sitting on proc. 2Part of the vector sitting on proc. 3Dividing The Rows Up•Suppose we decided to assign a part of the vector to be multiplied to each process.•Then we can naturally group the part of the matrix sitting on each process into a set of sparse matrices.Proc. 0Proc. 1Proc. 2Proc. 3Part of the vector sitting on proc. 0Part of the vector sitting on proc. 1Part of the vector sitting on proc. 2Part of the vector sitting on proc. 3Matrix Vector Product•We can break up the row*vector products into separate sums•Demo on board.Practical Description1) Each process has to inform the other process how many vector entries and which vector entry it owns2) perhaps use MPI_BcastPractical Description cont3) Create a routine that is given the global numerical id of a degree of freedom and returns the process id which owns it and the local identity of the vector entry on that process.0 1 23 4 5 6 7 8Global number:0,0 0,1 0,21,0 1,1 1,2 2,0 2,1 2,2Location and local number:Practical Description cont4) Each process should create Nprocs empty sparse matrices with the correct number of rows (i.e. the number of rows owned by this process).5) Next – fill in the non-zero entries, using the vector entry locator (part 3) to determine which entry of which matrix should be set.Board descriptionPractical Description cont6) On each process:for each process p (excluding self) a) inspect the p’th matrix and make a list of which columns contain a non-zero entry. b) mpi_isend the number of entries required c) mpi_isend the list column ids end for each process p (excluding self) a) irecv and wait for the number of vector components ids which will be required to be sent to the p’th process b) irecv and wait for the list of vector components which will be required to be sent to the p’th process. end Each process now knows which degrees of freedom it requires to receive from each of the other processes. Each process has also informed the other processes which of their vector values it needs. Board descriptionPractical Description cont7) Now each process will be able to request the parts of the vector it needs to compute the matrix row * vector it needs to compute.8) Create the matrix vector routine: on each process:a) mpi_isend the data required by the other processes to them b) mpi_irecv the data required for this process. c) compute the local sparse matrix * local part of the global vector d) for each other process p:a) mpi_wait for the data to have arrived from the process p b) compute the sparse matrix vector for process p c) add this to the accumulating result vector. end e) mpi_wait for the outgoing data to have been sent out.Board descriptionDetails•There are a couple of wrinkles. 1) To make this easier: you can create some redundant storage. i.e. create a vector of zeros and set the non-zeros coming from the p’th process. 2) You should also consider the necessary modification required to make the matrix symmetric and diagonally dominant.Last Stage•Once you have a distributed matrix constructor – create a Jacobi iterative solver.•Benchmark this code for N=102400 on 2, 4, 8, 16 processors.•Hand in speed up (or down) curves and code print out as part of the project

View Full Document


School:
Email:
New Password:
Confirm Password:

Rice CAAM 471 - Distributed Computation of a Sparse Matrix Vector Product

Sign up for free to view:

Please select your school