Krylov Subspace Methods for the Eigenvalue Problem
Presented by: Sanjeev Kumar

Applications
We typically need only a few eigen (singular) pairs, and the matrices can be large and sparse:
- Solving a homogeneous system of linear equations A x = 0: the solution is the right singular vector of A corresponding to the smallest singular value
- Principal component analysis: here we want the eigen pairs corresponding to the few largest eigenvalues
- Discretization of partial differential equations
- Spectral image segmentation

References
- Bai, Z., Demmel, J., Dongarra, J., Ruhe, A., and van der Vorst, H., editors. Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM, Philadelphia, PA, 2000. http://www.cs.utk.edu/~dongarra/etemplates/book.html
- Sorensen, D.C. Implicitly Restarted Arnoldi/Lanczos Methods for Large Scale Eigenvalue Calculations. http://techreports.larc.nasa.gov/icase/1996/icase-1996-40.pdf
- Meyer, C.D. Matrix Analysis and Applied Linear Algebra. SIAM. http://www.matrixanalysis.com/DownloadChapters.html
- Latini, M. The QR Algorithm: Past and Present. http://www.acm.caltech.edu/~mlatini/research/presentation-qr-feb04.pdf
- Shi, J. and Malik, J. "Normalized Cuts and Image Segmentation". IEEE PAMI, Vol. 22, No. 8, August 2000

Review: Eigenvalue and Eigenvector
If A x = λ x, where A ∈ ℝ^{n×n}, x is a nonzero vector, and λ is a scalar, then λ is an eigenvalue, x is an eigenvector, and (λ, x) is an eigen pair.

Review: Eigenvalue Decomposition (EVD)
A V = V D, where V = [x_1, x_2, …, x_n] (the x_i are eigenvectors) and D = diag(λ_1, λ_2, …, λ_n) (the λ_i are eigenvalues). In MATLAB: [V, D] = eig(A).

Review: Characteristic Equation
Eigenvalues are the roots of the polynomial equation det(A - λ I) = 0, where I is the n×n identity matrix and det(·) is the determinant of a matrix. This is a polynomial equation of degree n.

Review: Companion Matrix
The roots of a polynomial equation
  x^n + α_{n-1} x^{n-1} + … + α_1 x + α_0 = 0
are given by the eigenvalues of its companion matrix, which has ones on the subdiagonal, the entries -α_0, …, -α_{n-1} down the last column, and zeros elsewhere.

Review: Abel-Ruffini Theorem
Theorem: there is no algebraic formula for the roots of a general polynomial of degree greater than 4.
Consequence: unlike solving a linear system of equations, iteration is the only way to compute the eigenvalues of a general matrix.

Review: Eigenvector Expansion
Let (λ_i, x_i) be the eigen pairs of a matrix A, and express any vector v as a linear combination of eigenvectors:
  v = c_1 x_1 + … + c_n x_n
Successive multiplication by A then gives
  A v = λ_1 c_1 x_1 + … + λ_n c_n x_n
  A^j v = λ_1^j c_1 x_1 + … + λ_n^j c_n x_n
This will be useful later.

Problem Statement
Given a matrix A ∈ ℝ^{n×n}, find k eigen pairs corresponding to eigenvalues with properties such as:
- Largest (or smallest) absolute value
- Largest (or smallest) real part
- Nearest to a given scalar μ

Power Iteration
Basic iteration step (normalizing at each step to avoid overflow):
  v_{j+1} = A v_j / ||A v_j||
Analysis: let (λ_i, x_i) be the eigen pairs of A, arranged in decreasing order of |λ_i|. By the eigenvector expansion above, A^j v is dominated by the λ_1 term, so v_j converges to the direction of x_1.

Power Iteration, cont.
- Eigenvalue estimate: the Rayleigh quotient v_j^T A v_j / (v_j^T v_j)
- Convergence rate (eigenvector): |λ_2 / λ_1| per iteration
- Disadvantages: very slow convergence if |λ_1| ≈ |λ_2|; cannot find complex eigenvalues; only finds the largest eigenvalue
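A minimal NumPy sketch of power iteration with a Rayleigh quotient eigenvalue estimate (the function name, random starting vector, and stopping tolerance are illustrative choices, not from the slides):

```python
import numpy as np

def power_iteration(A, num_iters=1000, tol=1e-10):
    """Approximate the dominant eigen pair of A. Illustrative sketch."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(num_iters):
        w = A @ v                        # basic iteration step: multiply by A
        norm_w = np.linalg.norm(w)
        if norm_w == 0.0:                # v happens to lie in the null space
            return 0.0, v
        v = w / norm_w                   # normalize to prevent overflow
        lam_new = v @ A @ v              # Rayleigh quotient (v has unit norm)
        if abs(lam_new - lam) < tol * max(1.0, abs(lam_new)):
            return lam_new, v
        lam = lam_new
    return lam, v

# Example: dominant eigen pair of a small symmetric matrix.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
lam, x = power_iteration(A)
print(lam, x)  # error shrinks by roughly |lambda_2 / lambda_1| per step
```

Running the same loop on (A - μ I)^{-1} instead of A gives inverse iteration, which is where the spectral transformations below come in.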
Spectral Transformation
Let A ∈ ℂ^{n×n} have eigen pair (λ, x), and let p(τ) and q(τ) be polynomials in τ.
- Polynomial transformation: p(A) has eigen pair (p(λ), x)
- Rational transformation: [q(A)]^{-1} p(A) has eigen pair (p(λ)/q(λ), x)
- Shift-invert: (A - μ I)^{-1} has eigen pair (1/(λ - μ), x)

Inverse Iteration
- Use a prior estimate of the eigenvalue as the shift μ in the shift-invert transformation
- Due to ill-conditioning, a linear solve is preferred to forming the inverse explicitly
- Pre-factorize A - μ I to keep the per-iteration complexity low

Rayleigh Quotient Iteration
- Use the current estimate of the eigenvalue as the shift
- Advantage: faster convergence, quadratic in general and cubic for Hermitian problems
- Disadvantage: high per-iteration complexity, since the shift changes every iteration and the shifted matrix must be re-factorized

Subspace Iteration
- Used for finding multiple eigenvalues simultaneously; a generalization of power iteration to multiple vectors
- Needs better normalization than normalizing each vector individually, otherwise every vector converges to the dominant eigenvector x_1

Subspace Iteration, cont.
- Start with Q_0 ∈ ℂ^{n×k} whose columns are orthonormal
- Iteration steps:
  - Z_j = A Q_{j-1}
  - Orthonormalize Z_j = X_j R_j, where the columns of X_j are orthonormal and R_j is upper triangular
  - Q_j = X_j
  - Test for convergence
- Convergence rate for the i-th eigenvector: |λ_{k+1} / λ_i| per iteration

Upper Hessenberg Matrix
- An upper Hessenberg matrix H satisfies H(i, j) = 0 for i > j + 1
- Hermitian ⇒ tridiagonal
- A → H by Householder reduction or by Givens rotations; both are O(n^3) in general

Schur's Triangularization Theorem
- For every A ∈ ℂ^{n×n} there exist Q and R such that A Q = Q R, where Q is a unitary matrix (not unique) and R is an upper triangular matrix (not unique)
- The diagonal elements of R are the eigenvalues of A
- A → R → {λ}: the 2nd step is trivial, but the 1st step is very difficult
- In practice we go A → H → {λ}, first reducing A to an upper Hessenberg matrix H

QR Iteration
- Iteration steps: factor A_k = Q_k R_k, then form A_{k+1} = R_k Q_k
- Every iteration is a similarity transformation and hence preserves eigenvalues: A_{k+1} = Q_k^T A_k Q_k
- If the iteration converges, it converges to A_∞ = R, an upper triangular matrix
- Does not converge when distinct eigenvalues share the same modulus, as with complex conjugate pairs or a ±λ pair

Explicitly Shifted QR Iteration
- Convert A to an upper Hessenberg matrix H using a similarity transformation
- Iteration steps: factor H_k - α_k I = Q_k R_k, then form H_{k+1} = R_k Q_k + α_k I
- Efficiency considerations: Householder reduction is efficient for A → H, and the QR factorization of an upper Hessenberg matrix can be computed efficiently using Givens rotations
- Related to inverse power iteration with shift α_k

Implicitly Shifted QR Iteration
- Combines two complex conjugate shifts in a single step
- Can handle complex eigenvalues and eigenvectors using real arithmetic, which increases efficiency

Definitions
For A ∈ ℂ^{n×n} and 0 ≠ b ∈ ℂ^n:
- { b, A b, A^2 b, …, A^{j-1} b } : Krylov sequence
- K_j = span{ b, A b, …, A^{j-1} b } : Krylov subspace
- K^{n×j} = ( b | A b | … | A^{j-1} b ) : Krylov matrix
- K_j = Range( K^{n×j} )

Krylov Subspace
Let A have n distinct eigenvalues λ_1, …, λ_n, with eigenvectors x_1, …, x_n forming an orthonormal basis for ℝ^n. Any vector b ∈ ℝ^n can be written as b = c_1 x_1 + … + c_n x_n. Consider the structure of the Krylov matrix K^{n×j} = ( b | A b | … | A^{j-1} b ): by the eigenvector expansion, its columns are A^i b = λ_1^i c_1 x_1 + … + λ_n^i c_n x_n for i = 0, …, j-1.

Krylov Subspace, cont.
- If c_i ≠ 0 for all i, Rank(K^{n×j}) = min(j, n)
- If the number of nonzero c_i's is m, Rank(K^{n×j}) = min(j, m)

Basis for Krylov Subspace
- The Krylov sequence forms a basis for the Krylov subspace, but it is ill-conditioned: as in power iteration, the vectors A^i b increasingly align with the dominant eigenvector
- It is better to work with an orthonormal basis
- The Lanczos algorithm builds an orthonormal basis for the Krylov subspace of a Hermitian matrix
- The Arnoldi algorithm generalizes this to non-Hermitian matrices
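A minimal NumPy sketch of the Arnoldi process (the function name and breakdown tolerance are illustrative). It builds an orthonormal basis Q of the Krylov subspace and an upper Hessenberg matrix H with A Q[:, :j] ≈ Q H; for a Hermitian A the same recurrence produces a tridiagonal H, recovering the Lanczos algorithm:

```python
import numpy as np

def arnoldi(A, b, j):
    """Orthonormal basis of the Krylov subspace K_j(A, b). Illustrative sketch.

    Returns Q (n x (j+1), orthonormal columns) and H ((j+1) x j, upper
    Hessenberg) with A @ Q[:, :j] == Q @ H up to rounding error.
    """
    n = b.shape[0]
    Q = np.zeros((n, j + 1))
    H = np.zeros((j + 1, j))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(j):
        w = A @ Q[:, k]                  # next Krylov direction
        for i in range(k + 1):           # modified Gram-Schmidt against Q
            H[i, k] = Q[:, i] @ w
            w = w - H[i, k] * Q[:, i]
        beta = np.linalg.norm(w)
        H[k + 1, k] = beta
        if beta < 1e-12:                 # breakdown: the subspace is A-invariant
            return Q[:, :k + 1], H[:k + 2, :k + 1]
        Q[:, k + 1] = w / beta
    return Q, H

# Ritz values: eigenvalues of the square part of H approximate extremal
# eigenvalues of A, using only matrix-vector products with A.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
Q, H = arnoldi(A, rng.standard_normal(100), 30)
ritz = np.linalg.eigvals(H[:-1, :])
```

The Ritz values extracted from the small matrix H are the eigenvalue estimates used by Krylov methods such as the implicitly restarted Arnoldi method of Sorensen cited in the references.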