UCF COT 6505, Lecture 16: Lemma 5.6 & Theorem 5.7

Lemma 5.6

Suppose that Algorithm 5.4 (the Fletcher-Reeves conjugate gradient method) is implemented with a step length $\alpha_k$ satisfying the strong Wolfe conditions with $0 < c_2 < 1/2$. Then the method generates descent directions $p_k$ that satisfy

    $-\dfrac{1}{1 - c_2} \;\le\; \dfrac{\nabla f_k^T p_k}{\|\nabla f_k\|^2} \;\le\; \dfrac{2 c_2 - 1}{1 - c_2}, \qquad k = 0, 1, 2, \ldots$    (A)

Note that for $0 < c_2 < 1/2$,

    $-\dfrac{1}{1 - c_2} < -1 \quad\text{and}\quad -1 < \dfrac{2 c_2 - 1}{1 - c_2} < 0.$    (B)

Proof (by induction)

Base case $k = 0$: since $p_0 = -\nabla f_0$,

    $\dfrac{\nabla f_0^T p_0}{\|\nabla f_0\|^2} = \dfrac{-\|\nabla f_0\|^2}{\|\nabla f_0\|^2} = -1,$

so by (B) both inequalities in (A) are satisfied. Now assume (A) holds for $k$.

Algorithm 5.4 sets

    $p_{k+1} = -\nabla f_{k+1} + \beta_{k+1} p_k, \qquad \beta_{k+1}^{FR} = \dfrac{\nabla f_{k+1}^T \nabla f_{k+1}}{\nabla f_k^T \nabla f_k}.$

Taking the inner product with $\nabla f_{k+1}$, dividing by $\|\nabla f_{k+1}\|^2$, and using the definition of $\beta_{k+1}^{FR}$,

    $\dfrac{\nabla f_{k+1}^T p_{k+1}}{\|\nabla f_{k+1}\|^2} = -1 + \beta_{k+1} \dfrac{\nabla f_{k+1}^T p_k}{\|\nabla f_{k+1}\|^2} = -1 + \dfrac{\nabla f_{k+1}^T p_k}{\|\nabla f_k\|^2}.$

The second strong Wolfe (curvature) condition gives $|\nabla f_{k+1}^T p_k| \le c_2 |\nabla f_k^T p_k| = -c_2 \nabla f_k^T p_k$, the last equality because $p_k$ is a descent direction by the induction hypothesis; hence

    $c_2 \nabla f_k^T p_k \;\le\; \nabla f_{k+1}^T p_k \;\le\; -c_2 \nabla f_k^T p_k.$    (C)

Substituting (C) into the identity above,

    $-1 + c_2 \dfrac{\nabla f_k^T p_k}{\|\nabla f_k\|^2} \;\le\; \dfrac{\nabla f_{k+1}^T p_{k+1}}{\|\nabla f_{k+1}\|^2} \;\le\; -1 - c_2 \dfrac{\nabla f_k^T p_k}{\|\nabla f_k\|^2},$

and applying the induction hypothesis $\nabla f_k^T p_k / \|\nabla f_k\|^2 \ge -1/(1 - c_2)$,

    $-1 - \dfrac{c_2}{1 - c_2} \;\le\; \dfrac{\nabla f_{k+1}^T p_{k+1}}{\|\nabla f_{k+1}\|^2} \;\le\; -1 + \dfrac{c_2}{1 - c_2},$

which is exactly

    $-\dfrac{1}{1 - c_2} \;\le\; \dfrac{\nabla f_{k+1}^T p_{k+1}}{\|\nabla f_{k+1}\|^2} \;\le\; \dfrac{2 c_2 - 1}{1 - c_2}.$    QED
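The bound of Lemma 5.6 is easy to check numerically. The sketch below (my own illustration, not part of the lecture) runs the FR-CG recursion of Algorithm 5.4 using scipy.optimize.line_search, which enforces the strong Wolfe conditions; the Rosenbrock test function, the starting point, and the values $c_1 = 10^{-4}$, $c_2 = 0.1$ are assumed choices. Every few iterations it prints the ratio $\nabla f_k^T p_k / \|\nabla f_k\|^2$, which should stay inside $[-1/(1-c_2),\ (2c_2-1)/(1-c_2)]$.

```python
# Sketch only: FR-CG (Algorithm 5.4) with a strong Wolfe line search, checking the
# Lemma 5.6 bound on each iteration. Test problem and parameters are assumptions.
import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def fr_cg(f, grad, x0, c1=1e-4, c2=0.1, tol=1e-6, max_iter=200):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    p = -g                                    # p_0 = -grad f_0
    lo, hi = -1.0 / (1.0 - c2), (2.0 * c2 - 1.0) / (1.0 - c2)   # Lemma 5.6 bounds
    for k in range(max_iter):
        ratio = (g @ p) / (g @ g)             # grad f_k^T p_k / ||grad f_k||^2
        if k % 10 == 0:
            print(f"k={k:3d}  ratio={ratio:+.4f}  bounds=[{lo:+.4f}, {hi:+.4f}]")
        alpha, *_ = line_search(f, grad, x, p, gfk=g, c1=c1, c2=c2)
        if alpha is None:                     # line search failed to find a Wolfe step
            break
        x_new = x + alpha * p
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        beta_fr = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves beta_{k+1}
        p = -g_new + beta_fr * p              # p_{k+1} = -grad f_{k+1} + beta * p_k
        x, g = x_new, g_new
    return x

fr_cg(rosen, rosen_der, np.full(8, -1.0))
```

With scipy's default $c_2 = 0.9$ the hypothesis $c_2 < 1/2$ is violated, and the lemma no longer guarantees that the ratio stays inside the bounds.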
FR and PR

• We can use Lemma 5.6 to explain a weakness of the FR-CG method: if it generates a bad direction and takes a tiny step, then the subsequent directions will also be bad.
• The PR-CG method does not have this problem.

The angle $\theta_k$ between $p_k$ and the steepest descent direction $-\nabla f_k$ satisfies

    $\cos\theta_k = \dfrac{-\nabla f_k^T p_k}{\|\nabla f_k\|\,\|p_k\|}.$

From Lemma 5.6,

    $\dfrac{1 - 2 c_2}{1 - c_2}\,\dfrac{\|\nabla f_k\|}{\|p_k\|} \;\le\; \cos\theta_k \;\le\; \dfrac{1}{1 - c_2}\,\dfrac{\|\nabla f_k\|}{\|p_k\|}, \qquad k = 0, 1, 2, \ldots$

that is, there are positive constants $\chi_1, \chi_2$ such that

    $\chi_1 \dfrac{\|\nabla f_k\|}{\|p_k\|} \;\le\; \cos\theta_k \;\le\; \chi_2 \dfrac{\|\nabla f_k\|}{\|p_k\|}, \qquad k = 0, 1, 2, \ldots$

FR

Hence $\cos\theta_k \approx 0$ if and only if $\|\nabla f_k\| \ll \|p_k\|$. When $p_k$ is almost orthogonal to the gradient, the step from $x_k$ to $x_{k+1}$ is tiny (for a quadratic with Hessian $A$, the exact step length is $\alpha_k = -\nabla f_k^T p_k / (p_k^T A p_k)$ (G)), i.e.

    $x_{k+1} \approx x_k, \qquad \nabla f_{k+1} \approx \nabla f_k.$

Then

    $\beta_{k+1}^{FR} = \dfrac{\nabla f_{k+1}^T \nabla f_{k+1}}{\nabla f_k^T \nabla f_k} \approx 1,$

and since $\|\nabla f_{k+1}\| \ll \|p_k\|$,

    $p_{k+1} = -\nabla f_{k+1} + \beta_{k+1}^{FR} p_k \approx p_k.$

So the new direction improves little on the previous one.

PR

    $\beta_{k+1}^{PR} = \dfrac{\nabla f_{k+1}^T (\nabla f_{k+1} - \nabla f_k)}{\nabla f_k^T \nabla f_k}.$

If $x_{k+1} \approx x_k$ and $\nabla f_{k+1} \approx \nabla f_k$, then $\beta_{k+1}^{PR} \approx 0$, so

    $p_{k+1} = -\nabla f_{k+1} + \beta_{k+1}^{PR} p_k \approx -\nabla f_{k+1}.$

The new search direction is close to the steepest descent direction: an automatic restart!

Example

For a problem with $n = 100$ in which $\cos\theta_k$ and the steps $\|x_k - x_{k-1}\|$ are of order $10^{-2}$: FR-CG takes thousands of iterations (it improves with restarts), while PR-CG takes 37 iterations.

Convergence of Line Search Methods (Theorem 3.2; background for Theorem 5.7)

Let $\theta_k$ be the angle between $p_k$ and the steepest descent direction $-\nabla f_k$:

    $\cos\theta_k = \dfrac{-\nabla f_k^T p_k}{\|\nabla f_k\|\,\|p_k\|}.$

If the step lengths satisfy the Wolfe conditions, $f$ is continuously differentiable and bounded below, and the gradient is Lipschitz continuous,

    $\|\nabla f(x) - \nabla f(\tilde{x})\| \le L \|x - \tilde{x}\|,$

then

    $\sum_{k \ge 0} \cos^2\theta_k\, \|\nabla f_k\|^2 < \infty.$

Proof. The iteration scheme is $x_{k+1} = x_k + \alpha_k p_k$. The curvature condition

    $\nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2\, \nabla f(x_k)^T p_k, \qquad c_2 \in (c_1, 1),$

gives, after subtracting $\nabla f_k^T p_k$ from both sides,

    $(\nabla f_{k+1} - \nabla f_k)^T p_k \ge (c_2 - 1)\, \nabla f_k^T p_k.$    (1)

Lipschitz continuity and the Cauchy-Schwarz inequality give

    $(\nabla f_{k+1} - \nabla f_k)^T p_k \le \|\nabla f_{k+1} - \nabla f_k\|\,\|p_k\| \le \alpha_k L \|p_k\|^2.$    (2)

Combining (1) and (2),

    $\alpha_k \ge \dfrac{c_2 - 1}{L}\, \dfrac{\nabla f_k^T p_k}{\|p_k\|^2}.$    (3)

Substituting (3) into the sufficient decrease condition $f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f_k^T p_k$ (recall $\nabla f_k^T p_k < 0$),

    $f_{k+1} \le f_k - c_1 \dfrac{1 - c_2}{L}\, \dfrac{(\nabla f_k^T p_k)^2}{\|p_k\|^2} = f_k - c\, \cos^2\theta_k\, \|\nabla f_k\|^2, \qquad c = \dfrac{c_1 (1 - c_2)}{L}.$

Summing over all indices up to $k$,

    $f_{k+1} \le f_0 - c \sum_{j=0}^{k} \cos^2\theta_j\, \|\nabla f_j\|^2.$

Since $f$ is bounded below, $f_0 - f_{k+1}$ is less than some positive constant for all $k$. Taking the limit $k \to \infty$,

    $\sum_{k \ge 0} \cos^2\theta_k\, \|\nabla f_k\|^2 < \infty.$    QED

In particular,

    $\cos^2\theta_k\, \|\nabla f_k\|^2 \to 0.$

So the gradient norms converge to zero provided the search directions are never too close to orthogonality with the gradient, i.e. provided $\cos\theta_k \ge \delta > 0$ for some $\delta$ (the angle $\theta_k$ is bounded away from $90^\circ$). In particular, steepest descent ($\cos\theta_k = 1$) produces a gradient sequence that converges to zero whenever it uses a line search satisfying the Wolfe conditions. We cannot guarantee that the method converges to a minimizer, only that it is attracted to stationary points.
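As a rough numerical companion to the theorem (again my own construction, with the Rosenbrock function and starting point as assumed inputs), the sketch below runs steepest descent, for which $\cos\theta_k = 1$, with a Wolfe line search and tracks the partial sums $\sum_j \cos^2\theta_j \|\nabla f_j\|^2$; they should level off as the gradient norms shrink.

```python
# Sketch only: tracking Zoutendijk's sum for steepest descent (cos theta_k = 1) with
# a Wolfe line search. The Rosenbrock test function and parameters are assumptions.
import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

x = np.array([-1.2, 1.0])
partial_sum = 0.0
for k in range(500):
    g = rosen_der(x)
    p = -g                                    # steepest descent direction
    partial_sum += g @ g                      # cos^2(theta_k) * ||grad f_k||^2, cos = 1
    if k % 100 == 0:
        print(f"k={k:3d}  ||grad f_k|| = {np.linalg.norm(g):10.3e}"
              f"  partial sum = {partial_sum:12.4f}")
    alpha, *_ = line_search(rosen, rosen_der, x, p, gfk=g)   # defaults satisfy Wolfe
    if alpha is None:                         # line search failed; stop
        break
    x = x + alpha * p
```

Steepest descent is used here so the illustration does not depend on Lemma 5.6; for a general direction sequence the summand carries the extra $\cos^2\theta_k$ factor, which is why the angle must stay bounded away from $90^\circ$ to conclude $\|\nabla f_k\| \to 0$.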

