DOC PREVIEW
Berkeley COMPSCI C267 - Dense Linear Algebra

This preview shows page 1-2-3-4-28-29-30-31-57-58-59-60 out of 60 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 60 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CS 267 Applications of Parallel Computers Dense Linear AlgebraOutlineSuccess Stories for ScaLAPACKMotivation (1)Motivation (2)Winner of TOPS 500 (LINPACK Benchmark)PowerPoint PresentationGaussian Elimination (GE) for solving Ax=bRefine GE Algorithm (1)Refine GE Algorithm (2)Refine GE Algorithm (3)Refine GE Algorithm (4)Refine GE Algorithm (5)What GE really computesProblems with basic GE algorithmPivoting in Gaussian EliminationGaussian Elimination with Partial Pivoting (GEPP)Slide 18Converting BLAS2 to BLAS3 in GEPPBlocked GEPP (www.netlib.org/lapack/single/sgetrf.f)Efficiency of Blocked GEPPOverview of LAPACKPerformance of LAPACK (n=1000)Performance of LAPACK (n=100)Parallelizing Gaussian EliminationDifferent Data Layouts for Parallel GESlide 27Review: BLAS 3 (Blocked) GEPPSlide 29Slide 30Slide 31Slide 32ScaLAPACK Performance Models (1)ScaLAPACK Performance Models (2)Slide 35Slide 36Parallelism in ScaLAPACKSlide 38The “Holy Grail” (Parlett, Dhillon, Marques) Perfect Output complexity (O(n * #vectors)), Embarrassingly parallel, AccurateSlide 41Scalable Nonsymmetric EigensolverSlide 43Recursive AlgorithmsGaussian Elimination via a Recursive AlgorithmRecursive FactorizationsSlide 47Recursive Algorithms – LimitsScaLAPACK Summary and ConclusionsA small software project ...Extra SlidesAssignment of parallel work in GEComputational Electromagnetics (MOM)Slide 54BLAS2 version of GE with Partial Pivoting (GEPP)Slide 56Slide 57Slide 58Slide 59Computational Chemistry – Ax = l xSlide 6101/14/19 1CS 267 Applications of Parallel ComputersDense Linear Algebra Kathy Yelickhttp://www.cs.berkeley.edu/~yelick/cs26701/14/19 CS267 Lecture 14 2Outline•Motivation for Dense Linear Algebra•Benchmarks•Review Gaussian Elimination (GE) for solving Ax=b•Optimizing GE for caches on sequential machines-using matrix-matrix multiplication (BLAS)•LAPACK library overview and performance•Data layouts on parallel machines•Parallel Gaussian Elimination•ScaLAPACK library overview•Eigenvalue problems•Open Problems01/14/19 CS267 Lecture 14 3Success Stories for ScaLAPACKCosmic Microwave Background Analysis, BOOMERanG collaboration, MADCAP code (Apr. 27, 2000).ScaLAPACK•New Science discovered through the solution of dense matrix systems•ScaLAPACK is a library for dense and banded matrices-Nature article on the flat universe used ScaLAPACK-Other articles in Physics Review B that also use it-1998 Gordon Bell Prize-Joint effort between DOE, DARPA, and NSF01/14/19 CS267 Lecture 14 4Motivation (1)3 Basic Linear Algebra Problems1. Linear Equations: Solve Ax=b for x2. Least Squares: Find x that minimizes  ri2 where r=Ax-b•Statistics: Fitting data with simple functions3a. Eigenvalues: Findand x where Ax =  x•Vibration analysis, e.g., earthquakes, circuits3b. Singular Value Decomposition: ATAx=2x•Information retrieval, web searchLots of variations depending on structure of A•A symmetric, positive definite, banded, …01/14/19 CS267 Lecture 14 5Motivation (2) •Why dense A, as opposed to sparse A?-Many large matrices are sparse, but …-Dense algorithms easier to understand -Some applications yields large dense matrices-Benchmarking•“How fast is your computer?” = “How fast can you solve dense Ax=b?”-Large sparse matrix algorithms often yield smaller (but still large) dense problems01/14/19 CS267 Lecture 14 6Winner of TOPS 500 (LINPACK Benchmark)Year Machine Tflops Factor fasterPeakTflopsNumProcsN20022003Earth System Computer, NEC35.6 4.9 40.8 5104 1.04M2001ASCI White, IBM SP Power 37.2 1.5 11.1 7424 .52M2000ASCI White, IBM SP Power 34.9 2.1 11.1 7424 .43M1999ASCI Red, Intel PII Xeon2.4 1.1 3.2 9632 .36M 1998ASCI Blue, IBM SP 604E2.1 1.6 3.9 5808 .43M1997ASCI Red, Intel Ppro, 200 MHz1.3 3.6 1.8 9152 .24M1996Hitachi CP-PACS.37 1.3 .6 2048 .10M1995Intel Paragon XP/S MP.28 1 .3 6768 .13MSource: Jack Dongarra (UTK)01/14/19 CS267 Lecture 14 7Current Records for Solving Small Dense Systems MegaflopsMachine n=100 n=1000 PeakIntel Pentium 4 (1 proc, 2.53 GHz) 1190 2355 5060 Compaq ES45 (4 proc. 1 GHz) 5522 8000 (1 proc. 1 GHz) 824 1542 2000 NEC SX 6 (8 proc, 500 MHz) 45030 64000 (1 proc, 500 MHz) 1161 7575 8000www.netlib.org, click on Performance Database Server01/14/19 CS267 Lecture 14 8Gaussian Elimination (GE) for solving Ax=b•Add multiples of each row to later rows to make A upper triangular•Solve resulting triangular system Ux = c by substitution… for each column i… zero it out below the diagonal by adding multiples of row i to later rowsfor i = 1 to n-1 … for each row j below row i for j = i+1 to n … add a multiple of row i to row j tmp = A(j,i); for k = i to n A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k)0...00...00.00000...00...00.00...00...00...0After i=1 After i=2 After i=3 After i=n-1…01/14/19 CS267 Lecture 14 9Refine GE Algorithm (1)•Initial Version•Remove computation of constant tmp/A(i,i) from inner loop. … for each column i… zero it out below the diagonal by adding multiples of row i to later rowsfor i = 1 to n-1 … for each row j below row i for j = i+1 to n … add a multiple of row i to row j tmp = A(j,i); for k = i to n A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k)for i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i to n A(j,k) = A(j,k) - m * A(i,k)m01/14/19 CS267 Lecture 14 10Refine GE Algorithm (2)•Last version•Don’t compute what we already know: zeros below diagonal in column ifor i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - m * A(i,k)for i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i to n A(j,k) = A(j,k) - m * A(i,k)Do not compute zerosm01/14/19 CS267 Lecture 14 11Refine GE Algorithm (3)•Last version•Store multipliers m below diagonal in zeroed entries for later usefor i = 1 to n-1 for j = i+1 to n m = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - m * A(i,k)for i = 1 to n-1 for j = i+1 to n A(j,i) = A(j,i)/A(i,i) for k = i+1 to n A(j,k) = A(j,k) - A(j,i) * A(i,k)Store m


View Full Document

Berkeley COMPSCI C267 - Dense Linear Algebra

Documents in this Course
Lecture 4

Lecture 4

52 pages

Split-C

Split-C

5 pages

Lecture 5

Lecture 5

40 pages

Load more
Download Dense Linear Algebra
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Dense Linear Algebra and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dense Linear Algebra 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?