Lecture 9: Conjugate Direction Algorithm
(Solution of a Linear System, or Minimization of a Quadratic Function)

Conjugate Gradient
• Linear conjugate gradient: for solving linear systems Ax = b with a symmetric positive definite matrix A.
  – Exact solution in n steps (Hestenes & Stiefel, 1950s)
  – Approximate solution in fewer than n steps
• Non-linear conjugate gradient: for solving large-scale non-linear optimization problems.
  – Fletcher and Reeves, 1964
  – Polak and Ribière, 1969

Conjugate Gradient
Solve the linear system

    Ax = b,  A symmetric positive definite,    (1)

or, equivalently, minimize the quadratic function

    \phi(x) = \frac{1}{2} x^T A x - b^T x.    (2)

The gradient of \phi is the residual:

    \nabla \phi(x) = Ax - b = r(x).

The set S = \{p_0, p_1, \ldots, p_{n-1}\} is conjugate with respect to A if

    p_i^T A p_j = 0  for all  i \neq j.

Linear independence: S is linearly independent when

    \sigma_0 p_0 + \sigma_1 p_1 + \cdots + \sigma_{n-1} p_{n-1} = 0  implies  \sigma_0 = \sigma_1 = \cdots = \sigma_{n-1} = 0.

A conjugate set is also linearly independent; therefore A has at most n mutually conjugate directions.

Conjugate Direction Method
Iterate

    x_{k+1} = x_k + \alpha_k p_k,

where the directions satisfy p_i^T A p_j = 0 for all i \neq j and \alpha_k comes from an exact line search: it is the 1-D minimizer of the quadratic \phi along x_k + \alpha p_k,

    \alpha_k = -\frac{\nabla \phi_k^T p_k}{p_k^T A p_k}.

Compare the steepest descent iteration from Lecture 5: for f(x) = \frac{1}{2} x^T Q x - b^T x with \nabla f(x) = Qx - b, the update is x_{k+1} = x_k - \alpha_k \nabla f_k, and setting \frac{d}{d\alpha} f(x_k - \alpha \nabla f_k) = 0 gives the exact step

    \alpha_k = \frac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k},  so that  x_{k+1} = x_k - \frac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k} \nabla f_k.

Conjugate Direction Method
For the quadratic (2), the exact step length along p_k is

    \alpha_k = -\frac{(A x_k - b)^T p_k}{p_k^T A p_k} = -\frac{r_k^T p_k}{p_k^T A p_k},

using \nabla \phi(x) = Ax - b = r(x).

Theorem 5.1
For any x_0, the sequence \{x_k\} generated by the conjugate direction algorithm converges to the solution x^* of the linear system in at most n steps. The ingredients:
• Conjugate vectors
• Linearly independent vectors
• The sequence \{x_k\}

Proof
Since S = \{p_0, p_1, \ldots, p_{n-1}\} is linearly independent, it spans R^n, so we can write

    x^* - x_0 = \sigma_0 p_0 + \sigma_1 p_1 + \cdots + \sigma_{n-1} p_{n-1}.

Premultiplying by p_k^T A, conjugacy (p_k^T A p_i = 0 for i \neq k) eliminates every term except the k-th:

    p_k^T A (x^* - x_0) = \sigma_k p_k^T A p_k,    (A)

hence

    \sigma_k = \frac{p_k^T A (x^* - x_0)}{p_k^T A p_k}.    (B)

Proof (continued)
From x_{k+1} = x_k + \alpha_k p_k,

    x_k = x_0 + \alpha_0 p_0 + \alpha_1 p_1 + \cdots + \alpha_{k-1} p_{k-1},

so x_k - x_0 lies in span\{p_0, \ldots, p_{k-1}\} and, by conjugacy, p_k^T A (x_k - x_0) = 0. Therefore

    p_k^T A (x^* - x_0) = p_k^T A (x^* - x_k) = p_k^T (b - A x_k) = -p_k^T r_k.

Substituting into (B) and comparing with \alpha_k = -r_k^T p_k / (p_k^T A p_k) gives \sigma_k = \alpha_k. Thus x_n = x_0 + \alpha_0 p_0 + \cdots + \alpha_{n-1} p_{n-1} = x^*. QED

Interpretation of Theorem 5.1
If A is a diagonal matrix, we can minimize \phi in n iterations by successive 1-D minimizations along the coordinate axes. If A is not diagonal, minimizing along the coordinate axes does not, in general, reach the minimizer in n iterations.

Transformed Problem
Let

    \hat{x} = S^{-1} x,  where  S = [p_0, p_1, \ldots, p_{n-1}].

Then

    \hat{\phi}(\hat{x}) = \phi(S \hat{x}) = \frac{1}{2} \hat{x}^T (S^T A S) \hat{x} - (S^T b)^T \hat{x}.

By conjugacy, S^T A S is a diagonal matrix D, so

    \hat{\phi}(\hat{x}) = \frac{1}{2} \hat{x}^T D \hat{x} - c^T \hat{x},

and we can minimize \hat{\phi} along the coordinate directions in the transformed space. Each coordinate direction in the transformed space corresponds to a conjugate direction in the original space, since x = S \hat{x}. Therefore the conjugate direction algorithm converges in n steps. When the Hessian is diagonal, each coordinate minimization correctly determines one component of the solution x^*; after k 1-D minimizations, the quadratic has been minimized on the subspace spanned by e_1, e_2, \ldots, e_k.
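To make Theorem 5.1 concrete, here is a minimal NumPy sketch (not from the slides; the function name, test matrix, and starting point are made up for illustration). It uses the eigenvectors of a symmetric positive definite A as the conjugate set, since for eigenvectors v_i, v_j with i \neq j we have v_i^T A v_j = \lambda_j v_i^T v_j = 0, and it checks that the iteration reaches the exact solution in n steps.

    import numpy as np

    def conjugate_direction(A, b, x0, P):
        """Minimize phi(x) = 0.5 x^T A x - b^T x along the columns of P,
        assumed mutually conjugate with respect to A."""
        x = x0.astype(float)
        for k in range(P.shape[1]):
            p = P[:, k]
            r = A @ x - b                     # residual r(x) = grad phi(x)
            alpha = -(r @ p) / (p @ (A @ p))  # exact 1-D minimizer along p
            x = x + alpha * p
        return x

    # Made-up example data: a random symmetric positive definite A, n = 5.
    n = 5
    rng = np.random.default_rng(0)
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)
    b = rng.standard_normal(n)

    _, P = np.linalg.eigh(A)      # eigenvectors of A form a conjugate set
    x = conjugate_direction(A, b, np.zeros(n), P)
    print(np.allclose(A @ x, b))  # True: solved in exactly n = 5 steps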
Theorem 5.2
Let x_0 be any starting point and suppose that the sequence \{x_k\} is generated by the conjugate direction algorithm. Then

    r_k^T p_i = 0  for  i = 0, \ldots, k-1,

and x_k is the minimizer of \phi(x) = \frac{1}{2} x^T A x - b^T x over the set

    \{x \mid x = x_0 + span\{p_0, \ldots, p_{k-1}\}\}.    (3)

Proof
First we show that a point \tilde{x} minimizes \phi over the set (3) if and only if r(\tilde{x})^T p_i = 0 for i = 0, \ldots, k-1. Let

    h(\sigma) = \phi(x_0 + \sigma_0 p_0 + \cdots + \sigma_{k-1} p_{k-1}),  where  \sigma = (\sigma_0, \ldots, \sigma_{k-1}).

Since h(\sigma) is a strictly convex quadratic, it has a unique minimizer \sigma^* with

    \frac{\partial h(\sigma^*)}{\partial \sigma_i} = 0,  i = 0, \ldots, k-1.

By the chain rule,

    \nabla \phi(x_0 + \sigma_0^* p_0 + \cdots + \sigma_{k-1}^* p_{k-1})^T p_i = 0,  i = 0, \ldots, k-1,

that is, r(\tilde{x})^T p_i = 0, where r(x) = \nabla \phi(x) = Ax - b.

Proof (continued)
From x_{k+1} = x_k + \alpha_k p_k,

    r_{k+1} = r_k + \alpha_k A p_k.    (A)

We now show r_k^T p_i = 0 for i = 0, \ldots, k-1 by induction. Base case: from (A), r_1 = r_0 + \alpha_0 A p_0, so

    p_0^T r_1 = p_0^T r_0 + \alpha_0 p_0^T A p_0 = 0,

because \alpha_0 = -r_0^T p_0 / (p_0^T A p_0).

Proof (continued)
Induction hypothesis: r_{k-1}^T p_i = 0 for i = 0, \ldots, k-2. From (A),

    p_{k-1}^T r_k = p_{k-1}^T r_{k-1} + \alpha_{k-1} p_{k-1}^T A p_{k-1} = 0

by the definition of \alpha_{k-1}, and for i = 0, \ldots, k-2,

    p_i^T r_k = p_i^T r_{k-1} + \alpha_{k-1} p_i^T A p_{k-1} = 0

by the induction hypothesis and conjugacy. Therefore r_k^T p_i = 0 for i = 0, \ldots, k-1. QED

How do we select conjugate directions?
• The eigenvectors of A are mutually orthogonal and conjugate with respect to A.
• A Gram–Schmidt-like process can produce conjugate directions instead of orthogonal vectors.

Basic Properties of the CG
Each direction p_k is chosen to be a linear combination of the steepest descent direction -\nabla \phi_k = -r_k and the previous direction:

    p_k = -r_k + \beta_k p_{k-1}.

Premultiplying by p_{k-1}^T A and imposing the conjugacy condition p_{k-1}^T A p_k = 0,

    0 = -p_{k-1}^T A r_k + \beta_k p_{k-1}^T A p_{k-1},

gives

    \beta_k = \frac{r_k^T A p_{k-1}}{p_{k-1}^T A p_{k-1}}.

Algorithm 5.1 (CG, preliminary version)
Given x_0;
Set r_0 ← A x_0 - b, p_0 ← -r_0, k ← 0;
while r_k ≠ 0
    α_k ← -r_k^T p_k / (p_k^T A p_k);
    x_{k+1} ← x_k + α_k p_k;
    r_{k+1} ← A x_{k+1} - b;
    β_{k+1} ← r_{k+1}^T A p_k / (p_k^T A p_k);
    p_{k+1} ← -r_{k+1} + β_{k+1} p_k;
    k ← k + 1;
end (while)
p_0 is the steepest descent direction at x_0.
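Algorithm 5.1 translates almost line for line into NumPy. The sketch below is mine, not from the slides: the exact test r_k ≠ 0 is replaced by a small tolerance on \|r_k\| (floating point rarely produces an exact zero), and an iteration cap of n reflects the finite-termination guarantee of Theorem 5.1.

    import numpy as np

    def cg_preliminary(A, b, x0, tol=1e-10):
        """Preliminary conjugate gradient (Algorithm 5.1) for Ax = b,
        with A symmetric positive definite."""
        x = x0.astype(float)
        r = A @ x - b                   # r_0 <- A x_0 - b
        p = -r                          # p_0 <- -r_0
        k = 0
        while np.linalg.norm(r) > tol and k < len(b):
            Ap = A @ p
            alpha = -(r @ p) / (p @ Ap) # alpha_k
            x = x + alpha * p           # x_{k+1} <- x_k + alpha_k p_k
            r = A @ x - b               # r_{k+1} <- A x_{k+1} - b
            beta = (r @ Ap) / (p @ Ap)  # beta_{k+1}
            p = -r + beta * p           # p_{k+1} <- -r_{k+1} + beta_{k+1} p_k
            k += 1
        return x, k

    # Made-up example data, as before:
    n = 5
    rng = np.random.default_rng(1)
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)
    b = rng.standard_normal(n)
    x, k = cg_preliminary(A, b, np.zeros(n))
    print(k, np.allclose(A @ x, b))     # converges in at most n = 5 iterations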