# WUSTL ESE 415 - tutorial5 (4 pages)

Pages: 4
School: Washington University in St. Louis
Course: ESE 415 - Optimization

ESE 415, Instructor: U. S. Kamilov. Tutorial session, 2/23/2018. TA: Hesam Mazidi.

**Problem 1.** Show that the Vandermonde matrix

$$A = \begin{pmatrix} x^4 & x^3 & x^2 \\ x^3 & x^2 & x \\ x^2 & x & 1 \end{pmatrix}$$

is positive semidefinite (PSD) but not positive definite (PD).

**Solution 1.** $A$ is PSD if and only if $y^T A y \ge 0$ for every $y = (y_1, y_2, y_3)^T$, and PD if and only if $y^T A y > 0$ for every $y \neq (0, 0, 0)^T$. Notice that for any $y$,

$$y^T A y = (y_1 x^2 + y_2 x + y_3)^2 \ge 0,$$

so it immediately follows that $A$ is PSD. However, we can always find a nonzero $y$ such that $y_1 x^2 + y_2 x + y_3 = 0$; in particular, $y = (1, -2x, x^2)^T$ satisfies this condition. Hence $A$ is not PD, as desired.

**Problem 2.** In each of the following problems, fully justify your answer using optimality conditions.

(i) Show that $f(x, y) = (x^2 - 4)^2 + y^2$ has two global minima and one stationary point that is neither a local maximum nor a local minimum.

(ii) Find all local minima and all local maxima of $f(x, y) = \sin x + \sin y + \sin(x + y)$ within the set $0 \le x \le 2\pi$, $0 \le y \le 2\pi$.

**Solution 2.** (i) $f$ is differentiable on all of $\mathbb{R}^2$, so any stationary point (and in particular any local or global extremum) must satisfy $\nabla f(x, y) = 0$:

$$\nabla f(x, y) = \begin{pmatrix} 2(x^2 - 4) \cdot 2x \\ 2y \end{pmatrix} = \begin{pmatrix} 4x^3 - 16x \\ 2y \end{pmatrix} = 0.$$

The stationary points are easily obtained from this equation as $(0, 0)$, $(2, 0)$, and $(-2, 0)$. Since $f$ is twice differentiable, we can invoke the sufficient second-order optimality conditions. Denote by $H(x, y)$ the Hessian of $f$ at $(x, y)$: if $H$ is positive definite at a stationary point, that point is a local minimum; if $H$ is negative definite, it is a local maximum. Here

$$H(x, y) = \begin{pmatrix} 4(3x^2 - 4) & 0 \\ 0 & 2 \end{pmatrix}.$$

We therefore have the following characterization:

- $(2, 0)$ is a local minimum, since $H(2, 0) = \mathrm{diag}(32, 2)$ is positive definite;
- $(-2, 0)$ is a local minimum, since $H(-2, 0) = \mathrm{diag}(32, 2)$ is positive definite;
- $H(0, 0) = \mathrm{diag}(-16, 2)$ is indefinite, i.e. neither positive semidefinite nor negative semidefinite. By the necessary second-order optimality conditions (if a stationary point is a local minimum/maximum, then the Hessian there is positive/negative semidefinite), $(0, 0)$ is neither: it is only a stationary point (a saddle).

Finally, $f(x, y) \ge 0$ everywhere and $f(\pm 2, 0) = 0$, so the two local minima are in fact the two global minima.

(ii) We approach this problem as in (i):

$$\nabla f(x, y) = \begin{pmatrix} \cos x + \cos(x + y) \\ \cos y + \cos(x + y) \end{pmatrix} = 0.$$

Therefore the stationary points are $(\pi, \pi)$, $(\pi/3, \pi/3)$, and $(5\pi/3, 5\pi/3)$. We proceed by checking the Hessian of $f$ at the stationary points:

$$H(x, y) = \begin{pmatrix} -\sin x - \sin(x + y) & -\sin(x + y) \\ -\sin(x + y) & -\sin y - \sin(x + y) \end{pmatrix}.$$

Clearly $H(\pi/3, \pi/3)$ is negative definite while $H(5\pi/3, 5\pi/3)$ is positive definite. That is, $(\pi/3, \pi/3)$ is a local maximum and $(5\pi/3, 5\pi/3)$ is a local minimum.

We note that $H(\pi, \pi) = 0$. This means we cannot classify $(\pi, \pi)$ using the necessary and sufficient second-order optimality conditions alone. One way to tackle this is to check the behavior of the function around $(\pi, \pi)$ using a Taylor approximation. Since $f(\pi, \pi) = 0$, using trigonometric identities,

$$f(x, y) - f(\pi, \pi) = \frac{1}{6}\left[(x - \pi)^3 + (y - \pi)^3 - (x + y - 2\pi)^3\right] + o\!\left(|x - \pi|^3 + |y - \pi|^3\right).$$

Examining $f(x, y) - f(\pi, \pi)$ along the line $x = y$ with this approximation gives $-(x - \pi)^3 + o(|x - \pi|^3)$, which changes sign at $x = \pi$. Thus $(\pi, \pi)$ cannot be a local maximum or a local minimum; it is only a stationary point.

**Problem 3.** Show that the Newton method is affine invariant. That is, given $f$ and a non-singular matrix $A$, applying the Newton method to the function $g(y) = f(Ay)$ does not affect the iterative progression toward the optimal point (assuming it exists).

**Solution 3.** Define $x = Ay$ for all $y \in \mathbb{R}^n$; that is, $x$ is an affine transformation of $y$. Since $A$ is non-singular, this is a one-to-one mapping: $x = Ay$ and $y = A^{-1} x$. By the chain rule, $\nabla g(y) = A^T \nabla f(Ay)$ and $\nabla^2 g(y) = A^T \nabla^2 f(Ay)\, A$. The Newton updates for $g$ are therefore

$$y^{k+1} = y^k - [\nabla^2 g(y^k)]^{-1} \nabla g(y^k) = y^k - A^{-1} [\nabla^2 f(Ay^k)]^{-1} A^{-T} A^T \nabla f(Ay^k) = y^k - A^{-1} [\nabla^2 f(x^k)]^{-1} \nabla f(x^k).$$

Multiplying both sides by $A$ and using $x^k = A y^k$, we can rewrite this as

$$x^{k+1} = x^k - [\nabla^2 f(x^k)]^{-1} \nabla f(x^k),$$

which is exactly the Newton update for $f$. This means the iterates $\{y^k\}_{k \ge 0}$ have a one-to-one mapping with the iterates $\{x^k\}_{k \ge 0}$. In particular, the rate of convergence is unaffected by any non-singular affine transformation of variables. (Is this true for the gradient descent method?)

**Problem 4.** In this problem we explore basic results concerning strong convexity. A differentiable function $f$ is said to be strongly convex with constant $M > 0$ if

$$f(y) \ge f(x) + \nabla f(x)^T (y - x) + \frac{M}{2} \|x - y\|^2 \quad \text{for all } x, y \text{ in the domain of } f. \tag{14}$$

Show that:

(i) $g(x) = f(x) - \frac{M}{2} \|x\|^2$ is convex;

(ii) $(\nabla f(x) - \nabla f(y))^T (x - y) \ge M \|x - y\|^2$ for all $x, y$.

**Solution 4.** (i) $g$ is convex if and only if $g(y) \ge g(x) + \nabla g(x)^T (y - x)$ for all $x, y$. We exploit the strong convexity of $f$ to establish this. Since $\nabla g(x) = \nabla f(x) - M x$,

$$g(x) + \nabla g(x)^T (y - x) = f(x) + \nabla f(x)^T (y - x) + \frac{M}{2} \|x\|^2 - M x^T y.$$

Applying eq. (14) to bound $f(x) + \nabla f(x)^T (y - x) \le f(y) - \frac{M}{2}\|x - y\|^2$ gives

$$g(x) + \nabla g(x)^T (y - x) \le f(y) - \frac{M}{2}\|x - y\|^2 + \frac{M}{2}\|x\|^2 - M x^T y = f(y) - \frac{M}{2}\|y\|^2 = g(y),$$

which is the desired result.

(ii) To establish this inequality we need a property of convex functions: $h$ is convex if and only if its gradient is monotone, i.e. $(\nabla h(x) - \nabla h(y))^T (x - y) \ge 0$ for all $x, y$. Applying this to the convex function $g$ from part (i),

$$(\nabla f(x) - M x - \nabla f(y) + M y)^T (x - y) \ge 0,$$

which rearranges to $(\nabla f(x) - \nabla f(y))^T (x - y) \ge M \|x - y\|^2$, as desired.
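Problem 1 can be sanity-checked numerically for a concrete value of the symbol $x$. The sketch below is illustrative and not part of the tutorial; the choice $x = 2$ is arbitrary.

```python
import numpy as np

x = 2.0                          # arbitrary sample value for the symbol x
v = np.array([x**2, x, 1.0])
A = np.outer(v, v)               # the Vandermonde-type matrix A = v v^T

# PSD: every eigenvalue of the symmetric matrix A is nonnegative
eigvals = np.linalg.eigvalsh(A)
print(np.all(eigvals >= -1e-9))  # True

# Not PD: y = (1, -2x, x^2)^T is nonzero, yet y^T A y = 0
y = np.array([1.0, -2.0 * x, x**2])
print(y @ A @ y)                 # 0.0
```

Since $y^T A y = (y_1 x^2 + y_2 x + y_3)^2$, the coefficient vector of any quadratic polynomial vanishing at $x$ would serve equally well as the annihilating $y$.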
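The second-order analysis in Problem 2 can likewise be verified numerically. This is an illustrative sketch, not part of the tutorial; the offset `t = 1e-2` used for the sign check near $(\pi, \pi)$ is an arbitrary choice.

```python
import numpy as np

# Problem 2(i): f(x, y) = (x^2 - 4)^2 + y^2
grad1 = lambda x, y: np.array([4.0 * x * (x**2 - 4.0), 2.0 * y])
hess1 = lambda x, y: np.array([[12.0 * x**2 - 16.0, 0.0], [0.0, 2.0]])

for pt in [(2.0, 0.0), (-2.0, 0.0), (0.0, 0.0)]:
    assert np.allclose(grad1(*pt), 0.0)        # all three points are stationary
print(np.linalg.eigvalsh(hess1(2.0, 0.0)))     # eigenvalues 2, 32 -> positive definite: local min
print(np.linalg.eigvalsh(hess1(0.0, 0.0)))     # eigenvalues -16, 2 -> indefinite: saddle

# Problem 2(ii): f(x, y) = sin x + sin y + sin(x + y); the Hessian vanishes at
# (pi, pi), but f - f(pi, pi) behaves like -(x - pi)^3 along x = y: the sign flips
f2 = lambda x, y: np.sin(x) + np.sin(y) + np.sin(x + y)
t = 1e-2
print(f2(np.pi + t, np.pi + t) < 0 < f2(np.pi - t, np.pi - t))  # True
```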
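The affine-invariance claim of Problem 3 can be demonstrated numerically. The sketch below is not from the tutorial: the separable quartic $f$, the matrix $A$, and the starting point are all arbitrary choices. Running Newton's method in the $x$-coordinates and, in parallel, on $g(y) = f(Ay)$ in the $y$-coordinates produces iterates related by $x^k = A y^k$ at every step.

```python
import numpy as np

# f(x) = sum_i (x_i^4 / 4 + x_i^2 / 2): smooth, strictly convex, minimum at 0
grad_f = lambda x: x**3 + x
hess_f = lambda x: np.diag(3.0 * x**2 + 1.0)

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # arbitrary non-singular matrix
x = np.array([1.5, -2.0])                # starting point in x-coordinates
y = np.linalg.solve(A, x)                # corresponding start y^0 = A^{-1} x^0

for k in range(5):
    # Newton step on f in the x-coordinates
    x = x - np.linalg.solve(hess_f(x), grad_f(x))
    # Newton step on g(y) = f(Ay): grad g = A^T grad_f(Ay), hess g = A^T hess_f(Ay) A
    gy = A.T @ grad_f(A @ y)
    Hy = A.T @ hess_f(A @ y) @ A
    y = y - np.linalg.solve(Hy, gy)
    print(k, np.allclose(A @ y, x))      # True at every iteration: x^k = A y^k
```

For contrast, a fixed-step gradient step in the $y$-coordinates reads $y \leftarrow y - \alpha A^T \nabla f(Ay)$, so $Ay \leftarrow x - \alpha A A^T \nabla f(x)$; the correspondence survives only when $A A^T = I$, which answers the question posed in the solution: gradient descent is not affine invariant in general.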
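Finally, the inequality of Problem 4(ii) can be spot-checked on a strongly convex quadratic $f(x) = \frac{1}{2} x^T Q x$, for which the best constant is $M = \lambda_{\min}(Q)$. This sketch is not from the tutorial; the matrix $Q$ and the random samples are arbitrary choices.

```python
import numpy as np

Q = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
M = np.linalg.eigvalsh(Q)[0]             # strong convexity constant = smallest eigenvalue
grad = lambda x: Q @ x                   # gradient of f(x) = 0.5 x^T Q x

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    lhs = (grad(x) - grad(y)) @ (x - y)  # (grad f(x) - grad f(y))^T (x - y)
    ok = ok and lhs >= M * np.linalg.norm(x - y)**2 - 1e-9
print(ok)  # True
```

Here the inequality holds exactly because $(\nabla f(x) - \nabla f(y))^T (x - y) = (x - y)^T Q (x - y) \ge \lambda_{\min}(Q)\, \|x - y\|^2$.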
