Unformatted text preview:

Model Problem: Solving Poisson’s equation for temperatureThe Landscape of Ax=b SolversComplexity of linear solversCS 240A: Solving Ax = b in parallelSlide 5Conjugate gradient iteration for Ax = bSlide 7Slide 8Slide 9Slide 10Conjugate gradient iterationConjugate gradient: Krylov subspacesConjugate gradient: Orthogonal sequencesConjugate gradient: ConvergenceOther Krylov subspace methodsto complexity slides…Slide 17Conjugate gradient primitivesModel Problem: Solving Poisson’s equationStencil computationsParallelism in Stencil ComputationsWhere’s the data?Complexity measures for parallel computationGhost Nodes in Stencil ComputationsDetailed complexity measures for data movement IModel Problem: Solving Poisson’s equation for temperatureModel Problem: Solving Poisson’s equation for temperaturek = n1/3•For each i from 1 to n, except on the boundaries:– x(i-k2) – x(i-k) – x(i-1) + 6*x(i) – x(i+1) – x(i+k) – x(i+k2) = 0•n equations in n unknowns: A*x = b•Each row of A has at most 7 nonzeros•In two dimensions, k = n1/2 and each row has at most 5 nzsThe Landscape of Ax=b SolversThe Landscape of Ax=b Solvers Pivoting LU GMRES, BiCGSTAB, … Cholesky Conjugate gradient DirectA = LUIterativey’ = AyNon-symmetricSymmetricpositivedefiniteMore Robust Less Storage (if sparse)More RobustMore GeneralComplexity of linear solversComplexity of linear solvers2D 3DSparse Cholesky: O(n1.5 ) O(n2 )CG, exact arithmetic: O(n2 ) O(n2 )CG, no precond: O(n1.5 ) O(n1.33 )CG, modified IC: O(n1.25 ) O(n1.17 )CG, support trees: O(n1.20 ) -> O(n1+ ) O(n1.75 ) -> O(n1+ ) Multigrid: O(n) O(n)n1/2n1/3Time to solve model problem (Poisson’s equation) on regular meshCS 240A: Solving Ax = b in parallelCS 240A: Solving Ax = b in parallel•Dense A: Gaussian elimination with partial pivoting (LU)•See Jim Demmel’s slides•Same flavor as matrix * matrix, but more complicated•Sparse A: Iterative methods – Conjugate gradient, etc.•Sparse matrix times dense vector•Sparse A: Gaussian elimination – Cholesky, LU, etc.•Graph algorithms•Sparse A: Preconditioned iterative methods and multigrid•Mixture of lots of thingsCS 240A: Solving Ax = b in parallelCS 240A: Solving Ax = b in parallel•Dense A: Gaussian elimination with partial pivoting•See Jim Demmel’s slides•Same flavor as matrix * matrix, but more complicated•Sparse A: Iterative methods – Conjugate gradient etc.•Sparse matrix times dense vector•Sparse A: Gaussian elimination – Cholesky, LU, etc.•Graph algorithms•Sparse A: Preconditioned iterative methods and multigrid•Mixture of lots of thingsConjugate gradient iteration for Ax = bConjugate gradient iteration for Ax = bx0 = 0 approx solutionr0 = b residual = b - Axd0 = r0 search directionfor k = 1, 2, 3, . . .xk = xk-1 + … new approx solution rk = … new residual dk = … new search directionConjugate gradient iteration for Ax = bConjugate gradient iteration for Ax = bx0 = 0 approx solutionr0 = b residual = b - Axd0 = r0 search directionfor k = 1, 2, 3, . . .αk = … step lengthxk = xk-1 + αk dk-1 new approx solution rk = … new residual dk = … new search directionConjugate gradient iteration for Ax = bConjugate gradient iteration for Ax = bx0 = 0 approx solutionr0 = b residual = b - Axd0 = r0 search directionfor k = 1, 2, 3, . . .αk = (rTk-1rk-1) / (dTk-1Adk-1) step lengthxk = xk-1 + αk dk-1 new approx solution rk = … new residual dk = … new search directionConjugate gradient iteration for Ax = bConjugate gradient iteration for Ax = bx0 = 0 approx solutionr0 = b residual = b - Axd0 = r0 search directionfor k = 1, 2, 3, . . .αk = (rTk-1rk-1) / (dTk-1Adk-1) step lengthxk = xk-1 + αk dk-1 new approx solution rk = … new residualβk = (rTk rk) / (rTk-1rk-1)dk = rk + βk dk-1 new search directionConjugate gradient iteration for Ax = bConjugate gradient iteration for Ax = bx0 = 0 approx solutionr0 = b residual = b - Axd0 = r0 search directionfor k = 1, 2, 3, . . .αk = (rTk-1rk-1) / (dTk-1Adk-1) step lengthxk = xk-1 + αk dk-1 new approx solution rk = rk-1 – αk Adk-1 new residualβk = (rTk rk) / (rTk-1rk-1)dk = rk + βk dk-1 new search directionConjugate gradient iterationConjugate gradient iteration•One matrix-vector multiplication per iteration•Two vector dot products per iteration•Four n-vectors of working storagex0 = 0, r0 = b, d0 = r0for k = 1, 2, 3, . . .αk = (rTk-1rk-1) / (dTk-1Adk-1) step lengthxk = xk-1 + αk dk-1 approx solution rk = rk-1 – αk Adk-1 residualβk = (rTk rk) / (rTk-1rk-1) improvementdk = rk + βk dk-1 search directionConjugate gradient: Krylov subspacesConjugate gradient: Krylov subspaces•Eigenvalues: Av = λv { λ1, λ2 , . . ., λn}•Cayley-Hamilton theorem: (A – λ1I)·(A – λ2I) · · · (A – λnI) = 0 Therefore Σ ciAi = 0 for some ciso A-1 = Σ (–ci/c0) Ai–1 •Krylov subspace:Therefore if Ax = b, then x = A-1 b andx  span (b, Ab, A2b, . . ., An-1b) = Kn (A, b) 0  i  n1  i  nConjugate gradient: Orthogonal sequencesConjugate gradient: Orthogonal sequences•Krylov subspace: Ki (A, b) = span (b, Ab, A2b, . . ., Ai-1b) •Conjugate gradient algorithm:for i = 1, 2, 3, . . .find xi  Ki (A, b) such that ri = (b – Axi)  Ki (A, b)•Notice ri  Ki+1 (A, b), so ri  rj for all j < i•Similarly, the “directions” are A-orthogonal:(xi – xi-1 )T·A· (xj – xj-1 ) = 0•The magic: Short recurrences. . .A is symmetric => can get next residual and direction from the previous one, without saving them all.Conjugate gradient: ConvergenceConjugate gradient: Convergence•In exact arithmetic, CG converges in n steps (completely unrealistic!!)•Accuracy after k steps of CG is related to:•consider polynomials of degree k that are equal to 1 at 0.•how small can such a polynomial be at all the eigenvalues of A?•Thus, eigenvalues close together are good.•Condition number: κ(A) = ||A||2 ||A-1||2 = λmax(A) / λmin(A)•Residual is reduced by a constant factor by O(κ1/2(A)) iterations of CG.Other Krylov subspace methodsOther Krylov subspace methods•Nonsymmetric linear systems:•GMRES: for i = 1, 2, 3, . . . find xi  Ki (A, b) such that


View Full Document

UCSB CS 240A - MODEL PROBLEM

Download MODEL PROBLEM
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view MODEL PROBLEM and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view MODEL PROBLEM 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?