DOC PREVIEW
WUSTL ESE 415 - tutorial5

This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

ESE 415 (U. S. Kamilov) Tutorial session (2/23/2018) TA: Hesam MazidiProblem 1Show that the Vandermonde matrixA =x4x3x2x3x2xx2x 1(1)is positive semidefinite (PSD) but not positive definite (PD).Solution 1We note that A is PSD if and only if for any y = [y1, y2, y3]Twe have yTAy ≥ 0. Further, A is PD ifand only if for any y = [y1, y2, y3]T6= [0, 0, 0]Twe have yTAy > 0 .Notice that for any y = [y1, y2, y3]TyTAy = (y1x2+ y2x + y3)2. (2)Thus, it immediately follows that A is PSD. Now, assume y = [y1, y2, y3]T6= [0, 0, 0]T. In this case,we can always find a vector [y1, y2, y3]Tsuch that (y1x2+ y2x + y3) = 0. In particular, [2, −x, −x2]satisfies this condition, concluding that A is not PD as desired.Problem2In each of the following problems fully justify your answer using optimality conditions.(i) Show that f(x; y) = (x2− 4)2+ y2has two global minima and one stationary point, which isneither a local maximum nor a local minimum.(ii) Find all local minima and all local maxima of f (x; y) = sin x + sin y + sin(x + y) within the set{0 < x < 2π; 0 < y < 2π}.Page 1 of 4ESE 415 (U. S. Kamilov) Tutorial session (2/23/2018) TA: Hesam MazidiSolution 2(i) Note that f(·) is a differentiable function with domain R2. This means that any stationary point[x∗, y∗]T(local maximum, global maximum, etc.) must satisfy ∇f([x∗, y∗]T) = 0. That is,∇f =(2(x∗2− 4) · 2x∗2y∗=(4(x∗3− 4x∗)2y∗= 0. (3)The stationary points are easily obtained form eq. 3 as [x∗, y∗]T= [0, 0]T, [2, 0]T, or [−2, 0]T.Since f(·) is twice differentiable, we can invoke sufficient second-order optimality conditions.In particular, denote H([x∗, y∗]T) as the Hessian of f (·) obtained at [x∗, y∗]T. Then, ifH([x∗, y∗]T) is positive definite it follows that [x∗, y∗]Tis a local maximum; if H([x∗, y∗]T) isnegative definite, [x∗, y∗]Tis a local minimum.H([x∗, y∗]T) =4(3x∗2− 4) 00 2(4)We therefore have the following characterizations:• [2, 0]Tis a local minimum since H([2, 0]T) is positive definite.• [−2, 0]Tis a local minimum since H([−2, 0]T) is positive definite.Note that H([0, 0]T) is indefinite, that is, it is neither positive semidefinite nor negative semidef-inite. By invoking the necessary second order optimality conditions, which state that if a station-ary point [x∗, y∗]Tis a local minimum (local maximum) then H([x∗, y∗]T) is positive semidefi-nite (negative semidefinite), we conclude that [0, 0]Tis only a stationary point.(ii) We approach this problem similar to (i).∇f =cos(x) + cos(x + y)cos(y) + cos(x + y)= 0. (5)Therefore, the stationary points [x∗, y∗]Tare [π, π]T, [π/3, π/3]T, and [5/3 π, 5/3 π]T. We pro-ceed by checking the Hessian of f at stationary points:H([x∗, y∗]T) =− sin(x) − sin(x + y) − sin(x + y)− sin(x + y) − sin(y) − sin(x + y)(6)Clearly H([π/3, π/3]T) is negative definite while H([5/3 π, 5/3 π]T) is positive definite. That is,[π/3, π/3]Tis a local maximum and [5/3 π, 5/3 π]Tis a local minimum.We note that H([π, π]T) = 0. This means that we cannot determine whether [π, π]Tis a localmaximum or local minimum from the necessary and sufficient second-order optimality con-ditions. One way to tackle this problem is to check the behavior of the function around thestationary point [π, π]Tby using the Taylor approximation:f(x, y) − f(π, π) = sin(x) + sin(y) + sin(x, y) − 0= sin(x − π + π) + sin(y − π + π) + sin(x − π + y − π + 2π)(use trigonometric identities) = − sin(x − π) − sin(y − π) + sin(x − π + y − π)=13!(x − π)3+13!(y − π)3−13![(x − π) + (y − π)]3+ o[(x − π)3+ (y − π)3]By examining f(x, y) − f (π, π) along (π + ∆x, π + ∆y) using above approximation, we see that[π, π]Tcannot be a local maximum or a local minimum. Thus, [π, π]Tis only a stationary point.Page 2 of 4ESE 415 (U. S. Kamilov) Tutorial session (2/23/2018) TA: Hesam MazidiProblem 3Show that the Newton method is affine invariant; that is, Given f and a non-singular matrix A,applying the Newton method for function g(y) = f (Ay) would not affect the iterative progressiontoward the optimal point, assuming it exists.Solution 3Define x = Ay for all y ∈ Rn, that is, x is a affine transformation of y. We first note that we have aone-to-one mapping from x to y since we can set x = Ay and y = A−1x.The Newton updates for g are given byyk+1= yk− (∇2g(yk))−1∇g(yk) (7)= yk− (AT∇2f(Ayk)A)−1AT∇f(Ayk) (8)= yk− A−1(∇2f(Ayk))−1∇f(Ayk) (9)Note that we can re-write eq. 10 asyk+1= yk− A−1(∇2f(xk))−1∇f(xk) (10)Therefore, we haveyk+1− yk= −A−1(∇2f(xk))−1∇f(xk) (11)A(yk+1− yk) = −(∇2f(xk))−1∇f(xk) (12)= xk+1− xk. (13)This means that the progression iterates {yk+1− yk}k≥0have a one-to-one mapping with {xk+1−xk}k≥0. In particular, the rate of convergence is unaffected for any non-singular affine transformationof variables (is this true for the gradient descent method?).Problem 4In this section we explore basics results concerning strong convexity. A differentiable function f issaid to be strongly convex with constant M > 0 iff(y) ≥ f(x) + ∇f(x)T(y − x) +M2kx − yk2∀ x, y in domain of f (14)Show that(i) g(x) = f (x) −M2kxk2is convex(ii) (∇f (x) − ∇f(y))T(x − y) ≥ Mkx − yk2, ∀x, yPage 3 of 4ESE 415 (U. S. Kamilov) Tutorial session (2/23/2018) TA: Hesam MazidiSolution 4(i) Note that g is convex if and only if g(y) ≥ g(x) + ∇g(x)T(y − x) ∀x, y. Therefore, we exploitthe strong convexity of f to establish the convexity of g.First, we have∇g(x) = ∇f(x) − Mx, ∇g(x)T(y − x) = ∇f(x)T(y − x) − MxTy + Mkxk2(15)using eq. 14 we get the desired result.(ii) To establish this inequality we need a property of convex functions. Specifically, h is a convexfunction then if and only if(∇h(x) − ∇h(y))T(x − y) ≥ 0. (16)Equipped with eq. 16 and using the result of (i), we have(∇g(x) − ∇g(y))T(x − y) ≥ 0 (17)(∇f(x) − ∇f(y) + M(y − x))T(x − y) ≥ 0 (18)(∇f(x) − ∇f(y))T(x − y) ≥ Mky − xk2(19)as desired.Page 4 of


View Full Document

WUSTL ESE 415 - tutorial5

Download tutorial5
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view tutorial5 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view tutorial5 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?