DOC PREVIEW
MIT 6 854J - Lecture Notes

This preview shows page 1-2 out of 7 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

MIT OpenCourseWarehttp://ocw.mit.edu 6.854J / 18.415J Advanced Algorithms Fall 2008�� For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.18.415/6.854 Advanced Algorithms October 3, 2001 Lecture 7 Lecturer: Michel X. Goemans Scribe: Nick Hanssens and Nick Matsakis 1 Properties of Barrier Problem Last lecture, we used the strict convexity of the logarithmic barrier function to show that BP(p) has at most one solution. Now we will prove that a solution exists. We assume throughout and without loss of generality that the rank of A = m where A is rn x n. Theorem 1 Assume that 3 2 > 0 :Ax = b (2.e. BP(p) is feasible), and 3 S > 0, 3 y :ATjj+S = C. Then BP(p) is finite and has a unique solution. Proof of Theorem 1: Take any x > 0 such that Ax = b. We have: = (s^T+jjT~)~-pCln~i j = STx+eT~x-pClnxj j = jjTb +C(ij xj -p ln xj), (this sum cannot be arbitrarily negative) j implying that the objective function is lower bounded by a constant (this follows from the fact that for p > 0, Sj x +plnx tends to +oo as x goes to 0 or to +m). Therefore the infimum of BP(p) is finite for every p > 0. To show that the infimum is attained (that there exists an optimum solution), it is sufficient to notice that the argument above also leads to upper and lower bounds on xj in order to have a value below the one for 2, which means that we can restrict our attention to a compact set; this implies that the infimum is attained. Finally, we have shown last time that if an optimum solution exists then it is unique. For any p > 0, the unique solution to BP(p) is called the p-center. 2 Karush, Kuhn, and Tucker (KKT) Conditions Remember the optimality conditions from last lecture. The solution x is optimum for BP(p) if 3 y such thatwhere By setting s to be pXV1e, these conditions can be re-written as 3 y, s such that 2.1 Definition of Algorithm To find the p-center, we need to solve (1)-(3). However, the constraints (3) are quadratic, making this hard to solve1. Instead, in order to find (or approximate) the p-center, we use an iterative method based on Netwon's method. We assume we have a solution that satisfies (1) and (2), but not necessarily (3). We will then linearize equations (3) around our values of x and s, and solve the corresponding linear system. This gives us new values for x and s and we proceed. We will show that if we start "close enough" from the p-center then after this update step we will be even closer, and this iterative process will converge to the p-center of BP(p). 2.2 Update Derivation Replacing x, y and s with and ignoring Ax . As in (3), we arrive at AAx = 0, A~AY+AS= 0, xj .sj +Axj -sj +xj .Asj = /L. We claim this system has the unique solution, where lif such a system was easy to solve then this would give a simple algorithm for linear programming by setting p to 0 and replacing the strict inequalities by inequalities.Indeed, (6) implies that Axj +xjsjlAsj = psj' -xj,or in vector notation, Premultiplying by A, using (4) and the fact that Arr: = b, we get Observe that this is not a square system of equations (but we have m equations in n unknowns). Substituting As by -ATAy (because of (5)), we get But AXS-'AT is an m x m matrix of rank m since A has rank m and X and S-' are diagonal matrices with positive diagonal elements. Thus AXS-'AT is invertible and we derive (7). (5) then immediately implies (8), and (10) implies (9). At each step, then, replace x and s with the values x +Ax and s +As (y can always be derived from x and s). We will show that this iteration will converge to the p-center of BP(p). 3 Definitions and Properties 3.1 Proximity Measure Let o(x, s,p) = 1 lv 11 be the proximity measure where vj = -1. Note that this will be zero at the p-center. We will show that llvll decreases with each iteration. 3.2 ds and dx As (x,S,p) + (x +Ax, s + As, p), our proximity vector v becomes w where: p +Axj -Asj --1 lu --Axj -Asj which we are hoping will be small. P For the analysis, it will be useful to rescale the x-space and the s-space so that the the current iterates x and s are equal, but in a way that xjsj remains constant. For this, we will rescale component j of any vector in the x-space by and component j of any vector in the s space by fi.Rescaling Ax and As, we express wj as wj = dxj . dsj where dxj = (3. c)and 3.3 Properties Property 1 AxlAs. Proof of Property 1: This is true because Ax is in the nullspace of A while As is in the columnspace of AT. Indeed, premultiplying (5) by AxT and using (4), we get that AxTAs = 0. Observe that although x + Ax and s +As do not necessarily satisfy (and will not) that (xj + Axj)(sj + Asj) = p, on average they do since the duality gap (x + AX)^ (s + As) = Cj(xjsj + Axjsj +xjAsj + AxjAsj) =np by the above property and (6).Property 2 dxlds. Proof of Property 2: dxTds = Cjdxj -ds . -'CjAxj . Asj = 0, using the definition of dxj, 3-P dsj and Property 1. Property 3 Proof of Property 3: 4 Theorems 2 and 3 Theorem 2 If a(x, s,p) < 1, then x + Ax > 0 and s + As > 0. Theorem 3 If a(x,s,p) < a < 1, then o(x + AX,s + As, p) < -. These two theorems guarantee that, provided a(x,s,p) is sufficiently small, repeated updates x + Ax, s + As given in the algorithm will converge to the p-center. Theorem 2 was not proved in this lecture. The proof for theorem 3 is provided below. Theorem 3 shows that the convergence is quadratic (provided we are close enough). Proof of Theorem 3: We have that o2(x + Ax, s + As, p) = 1 lw1I2 = CjW: = Cjdx: . ds:. Using the fact that 4 a . b < (a + b)2 and Property 2, Using property 3,5 Taking the square root, we get Now, consider these two terms. The first, x.vz, is equal to a2 by definition. The second, ? 3maxj p/ (xj .sj), is at most 1/(1-a) by the following argument: since 1-a is strictly positive under the conditions of the theorem. Using these upper bounds in (ll), we can conclude the proof by stating, a(x +Ax, s +As, p) 5 a2 2-(1-0)' Corollary 4 Ifa < $, then -< a < $. This corollary gives us a necessary initial bound on a to guarantee convergence. Theorem 5 Theorems 2 and 3 show that the updates Ax, As given in the algorithm will converge to the p-center. However, instead of …


View Full Document

MIT 6 854J - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?