DOC PREVIEW
Berkeley COMPSCI C267 - Dense Linear Algebra: Parallel Gaussian Elimination

This preview shows page 1-2-3-4-28-29-30-31-58-59-60-61 out of 61 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 61 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 267 Dense Linear Algebra: Parallel Gaussian EliminationOutlineSuccess Stories for Sca/LAPACKMotivation (1)Motivation (2)PowerPoint PresentationGaussian Elimination (GE) for solving Ax=bRefine GE Algorithm (1)Refine GE Algorithm (2)Refine GE Algorithm (3)Refine GE Algorithm (4)Refine GE Algorithm (5)What GE really computesProblems with basic GE algorithmPivoting in Gaussian EliminationGaussian Elimination with Partial Pivoting (GEPP)Slide 17Converting BLAS2 to BLAS3 in GEPPBlocked GEPP (www.netlib.org/lapack/single/sgetrf.f)Efficiency of Blocked GEPPOverview of LAPACK and ScaLAPACKPerformance of LAPACK (n=1000)Performance of LAPACK (n=100)Parallelizing Gaussian EliminationDifferent Data Layouts for Parallel GESlide 26Review: BLAS 3 (Blocked) GEPPSlide 28Slide 29Slide 30Slide 31LAPACK and ScaLAPACK StatusScaLAPACK Performance Models (1)ScaLAPACK Performance Models (2)Slide 35Slide 36Slide 37The “Holy Grail” (Parlett, Dhillon, Marques) Perfect Output complexity (O(n * #vectors)), Embarrassingly parallel, AccurateSlide 40Scalable Nonsymmetric EigensolverSlide 42Recursive AlgorithmsGaussian Elimination via a Recursive AlgorithmRecursive FactorizationsSlide 46Recursive Algorithms – LimitsNext release of LAPACK and ScaLAPACKSome contributors (incomplete list)Extra SlidesAssignment of parallel work in GEComputational Electromagnetics (MOM)Slide 53BLAS2 version of GE with Partial Pivoting (GEPP)Slide 55Slide 56Slide 57Slide 58Computational Chemistry – Ax = l xSlide 60Parallelism in ScaLAPACKWinner of TOPS 500 (LINPACK Benchmark)02/14/2006 CS267 Lecture 9 1CS 267 Dense Linear Algebra:Parallel Gaussian EliminationJames Demmelwww.cs.berkeley.edu/~demmel/cs267_Spr0602/14/2006 CS267 Lecture 9 2Outline•Motivation, overview for Dense Linear Algebra•Review Gaussian Elimination (GE) for solving Ax=b•Optimizing GE for caches on sequential machines-using matrix-matrix multiplication (BLAS)•LAPACK library overview and performance•Data layouts on parallel machines•Parallel Gaussian Elimination•ScaLAPACK library overview•Eigenvalue problems•Current Research02/14/2006 CS267 Lecture 9 3Success Stories for Sca/LAPACKCosmic Microwave Background Analysis, BOOMERanG collaboration, MADCAP code (Apr. 27, 2000).ScaLAPACK•Widely used-Adopted by Mathworks, Cray, Fujitsu, HP, IBM, IMSL, NAG, NEC, SGI, …->56M web hits @ Netlib (incl. CLAPACK, LAPACK95)•New Science discovered through the solution of dense matrix systems-Nature article on the flat universe used ScaLAPACK-Other articles in Physics Review B that also use it-1998 Gordon Bell Prize-www.nersc.gov/news/reports/newNERSCresults050703.pdf02/14/2006 CS267 Lecture 9 4Motivation (1)3 Basic Linear Algebra Problems1. Linear Equations: Solve Ax=b for x2. Least Squares: Find x that minimizes ||r||2   ri2 where r=Ax-b•Statistics: Fitting data with simple functions3a. Eigenvalues: Findand x where Ax =  x•Vibration analysis, e.g., earthquakes, circuits3b. Singular Value Decomposition: ATAx=2x•Data fitting, Information retrievalLots of variations depending on structure of A•A symmetric, positive definite, banded, …02/14/2006 CS267 Lecture 9 5Motivation (2) •Why dense A, as opposed to sparse A?-Many large matrices are sparse, but …-Dense algorithms easier to understand -Some applications yields large dense matrices-LINPACK Benchmark (www.top500.org)•“How fast is your computer?” = “How fast can you solve dense Ax=b?”-Large sparse matrix algorithms often yield smaller (but still large) dense problems02/14/2006 CS267 Lecture 9 6Current Records for Solving Dense Systems (2006) GigaflopsMachine n=100 n=1000 Any n Peak IBM BlueGene/L 281 367 (131K procs) (n=1.8M)NEC SX 8 (8 proc, 2 GHz) 75.1 128 (1 proc, 2 GHz) 2.2 15.0 16…www.netlib.org, click on Performance Database ServerPalm Pilot III .0016902/14/2006 CS267 Lecture 9 7Gaussian Elimination (GE) for solving Ax=b•Add multiples of each row to later rows to make A upper triangular•Solve resulting triangular system Ux = c by substitution… for each column i… zero it out below the diagonal by adding multiples of row i to later rowsfor i = 1 to n-1 … for each row j below row i for j = i+1 to n … add a multiple of row i to row j tmp = A(j,i); for k = i to n A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k)0...00...00.00000...00...00.00...00...00...0After i=1 After i=2 After i=3 After i=n-1…02/14/2006 CS267 Lecture 9 8Refine GE Algorithm (1)•Initial Version•Remove computation of constant tmp/A(i,i) from inner loop. … for each column i… zero it out below the diagonal by adding multiples of row i to later rowsfor i = 1 to n-1 … for each row j below row i for j = i+1 to n … add a multiple of row i to row j tmp = A(j,i); for k = i to n A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k)for i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i to n A(j,k) = A(j,k) - m * A(i,k)m02/14/2006 CS267 Lecture 9 9Refine GE Algorithm (2)•Last version•Don’t compute what we already know: zeros below diagonal in column ifor i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - m * A(i,k)for i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i to n A(j,k) = A(j,k) - m * A(i,k)Do not compute zerosm02/14/2006 CS267 Lecture 9 10Refine GE Algorithm (3)•Last version•Store multipliers m below diagonal in zeroed entries for later usefor i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - m * A(i,k)for i = 1 to n-1 for j = i+1 to n A(j,i) = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - A(j,i) * A(i,k)Store m herem02/14/2006 CS267 Lecture 9 11Refine GE Algorithm (4)•Last versionfor i = 1 to n-1 for j = i+1 to n A(j,i) = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - A(j,i) * A(i,k)•Split Loopfor i = 1 to n-1 for j = i+1 to n A(j,i) = A(j,i)/A(i,i) for j = i+1 to n for k = i+1 to n A(j,k) = A(j,k) - A(j,i) * A(i,k)Store all m’s here before


View Full Document

Berkeley COMPSCI C267 - Dense Linear Algebra: Parallel Gaussian Elimination

Documents in this Course
Lecture 4

Lecture 4

52 pages

Split-C

Split-C

5 pages

Lecture 5

Lecture 5

40 pages

Load more
Download Dense Linear Algebra: Parallel Gaussian Elimination
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Dense Linear Algebra: Parallel Gaussian Elimination and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dense Linear Algebra: Parallel Gaussian Elimination 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?