15.081J/6.251J Introduction to Mathematical Programming

Lecture 20: The Affine Scaling Algorithm

1 Outline (Slide 1)

• History
• Geometric intuition
• Algebraic development
• Affine scaling
• Convergence
• Initialization
• Practical performance

2 History (Slide 2)

• In 1984, Karmarkar at AT&T "invented" the interior point method
• In 1985, affine scaling was "invented" at IBM + AT&T, seeking an intuitive version of Karmarkar's algorithm
• In early computational tests, A.S. far outperformed simplex and Karmarkar's algorithm
• In 1989, it was realised that Dikin had invented A.S. in 1967

3 Geometric intuition

3.1 Notation (Slide 3)

Consider the primal problem
$$\min\ c'x \quad \text{s.t. } Ax = b,\ x \ge 0$$
and its dual
$$\max\ p'b \quad \text{s.t. } p'A \le c'.$$

• $P = \{x \mid Ax = b,\ x \ge 0\}$
• $\{x \in P \mid x > 0\}$ is the interior of $P$, and its elements are interior points

3.2 The idea (Slide 4)

[Figure: successive iterates $x^0, x^1, x^2$ moving through the interior of the feasible set in a direction of decreasing cost $c$.]

4 Algebraic development

4.1 Theorem (Slide 5)

Let $\beta \in (0,1)$, $y \in \Re^n$ with $y > 0$, and
$$S = \left\{ x \in \Re^n \ \Big|\ \sum_{i=1}^n \frac{(x_i - y_i)^2}{y_i^2} \le \beta^2 \right\}.$$
Then $x > 0$ for every $x \in S$.

Proof:
• Let $x \in S$.
• Then $(x_i - y_i)^2 \le \beta^2 y_i^2 < y_i^2$.
• Hence $|x_i - y_i| < y_i$; in particular $-x_i + y_i < y_i$, and hence $x_i > 0$.

(Slide 6)

With $Y = \mathrm{diag}(y_1, \ldots, y_n)$, membership $x \in S$ is equivalent to $\|Y^{-1}(x - y)\| \le \beta$. Replace the original LP by
$$\min\ c'x \quad \text{s.t. } Ax = b,\ \|Y^{-1}(x - y)\| \le \beta.$$
Setting $d = x - y$:
$$\min\ c'd \quad \text{s.t. } Ad = 0,\ \|Y^{-1}d\| \le \beta.$$

4.2 Solution (Slide 7)

If the rows of $A$ are linearly independent and $c$ is not a linear combination of the rows of $A$, then:

• the optimal solution is
$$d^* = -\beta\,\frac{Y^2(c - A'p)}{\|Y(c - A'p)\|}, \qquad p = (AY^2A')^{-1}AY^2c;$$
• $x = y + d^* \in P$;
• $c'x = c'y - \beta\|Y(c - A'p)\| < c'y$.

4.2.1 Proof (Slide 8)

• $AY^2A'$ is invertible: if not, there exists some $z \ne 0$ such that $z'AY^2A'z = 0$.
• With $w = YA'z$, $w'w = 0 \Rightarrow w = 0$.
• Hence $A'z = 0$, a contradiction.
• Since $c$ is not a linear combination of the rows of $A$, $c - A'p \ne 0$ and $d^*$ is well defined.
• $d^*$ is feasible:
$$Y^{-1}d^* = -\beta\,\frac{Y(c - A'p)}{\|Y(c - A'p)\|} \ \Rightarrow\ \|Y^{-1}d^*\| = \beta,$$
and $Ad^* = 0$, since $AY^2(c - A'p) = 0$.
• For every feasible $d$,
$$c'd = (c' - p'A)d = (c' - p'A)YY^{-1}d \ge -\|Y(c - A'p)\| \cdot \|Y^{-1}d\| \ge -\beta\|Y(c - A'p)\|.$$

(Slide 9)

• The bound is attained at $d^*$:
$$c'd^* = (c' - p'A)d^* = -\beta\,(c' - p'A)\,\frac{Y^2(c - A'p)}{\|Y(c - A'p)\|} = -\beta\,\frac{\big(Y(c - A'p)\big)'\big(Y(c - A'p)\big)}{\|Y(c - A'p)\|} = -\beta\|Y(c - A'p)\|.$$
• Therefore $c'x = c'y + c'd^* = c'y - \beta\|Y(c - A'p)\|$.

4.3 Interpretation (Slide 10)

• Let $y$ be a nondegenerate BFS with basis $B$.
• $A = [B\ N]$.
• Let $Y = \mathrm{diag}(y_1, \ldots, y_m, 0, \ldots, 0)$ and $Y_0 = \mathrm{diag}(y_1, \ldots, y_m)$; then $AY = [BY_0\ 0]$ and
$$p = (AY^2A')^{-1}AY^2c = (B')^{-1}Y_0^{-2}B^{-1}\,BY_0^2c_B = (B')^{-1}c_B.$$
• The vector $p$ is a vector of dual estimates.
• $r = c - A'p$ becomes the vector of reduced costs: $r = c - A'(B')^{-1}c_B$.
• What happens under degeneracy?

4.4 Termination (Slide 11)

Let $y$ and $p$ be primal and dual feasible solutions with $c'y - b'p < \epsilon$, and let $y^*$ and $p^*$ be optimal primal and dual solutions. Then
$$c'y^* \le c'y < c'y^* + \epsilon, \qquad b'p^* - \epsilon < b'p \le b'p^*.$$

4.4.1 Proof (Slide 12)

• $c'y^* \le c'y$.
• By weak duality, $b'p \le c'y^*$.
• Since $c'y - b'p < \epsilon$, we get $c'y < b'p + \epsilon \le c'y^* + \epsilon$.
• By strong duality, $b'p^* = c'y^* \le c'y < b'p + \epsilon$, hence $b'p > b'p^* - \epsilon$.

5 Affine Scaling

5.1 Inputs (Slide 13)

• the data $(A, b, c)$;
• an initial primal feasible solution $x^0 > 0$;
• the optimality tolerance $\epsilon > 0$;
• the parameter $\beta \in (0, 1)$.

5.2 The Algorithm (Slide 14)

1. (Initialization) Start with some feasible $x^0 > 0$; let $k = 0$.
2. (Computation of dual estimates and reduced costs) Given some feasible $x^k > 0$, let
$$X_k = \mathrm{diag}(x_1^k, \ldots, x_n^k), \qquad p^k = (AX_k^2A')^{-1}AX_k^2c, \qquad r^k = c - A'p^k.$$
3. (Optimality check) Let $e = (1, 1, \ldots, 1)$. If $r^k \ge 0$ and $e'X_kr^k < \epsilon$, then stop; the current solution $x^k$ is primal $\epsilon$-optimal and $p^k$ is dual $\epsilon$-optimal.
4. (Unboundedness check) If $-X_k^2r^k \ge 0$, then stop; the optimal cost is $-\infty$.
5. (Update of primal solution) Let
$$x^{k+1} = x^k - \beta\,\frac{X_k^2r^k}{\|X_kr^k\|}.$$

5.3 Variants (Slide 15)

• $\|u\|_\infty = \max_i |u_i|$, $\gamma(u) = \max\{u_i \mid u_i > 0\}$
• $\gamma(u) \le \|u\|_\infty \le \|u\|$
• Short-step method: the update in Step 5 above.
• Long-step variants:
$$x^{k+1} = x^k - \beta\,\frac{X_k^2r^k}{\|X_kr^k\|_\infty}, \qquad x^{k+1} = x^k - \beta\,\frac{X_k^2r^k}{\gamma(X_kr^k)}.$$

6 Convergence

6.1 Assumptions (Slide 16)

Assumptions A:
(a) The rows of the matrix $A$ are linearly independent.
(b) The vector $c$ is not a linear combination of the rows of $A$.
(c) There exists an optimal solution.
(d) There exists a positive feasible solution.

Assumptions B:
(a) Every BFS to the primal problem is nondegenerate.
(b) At every BFS to the primal problem, the reduced cost of every nonbasic variable is nonzero.

6.2 Theorem (Slide 17)

If we apply the long-step affine scaling algorithm with $\epsilon = 0$, the following hold:
(a) For the long-step variant and under Assumptions A and B, …
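The closed-form direction $d^*$ of Section 4.2 can be checked numerically: $Ad^* = 0$, $\|Y^{-1}d^*\| = \beta$, and $c'd^* = -\beta\|Y(c - A'p)\|$ all follow from the definition of $p$. A minimal NumPy sketch, where the instance sizes, the random seed, and $\beta = 0.9$ are arbitrary choices for illustration:

```python
import numpy as np

# Verify the claims of Section 4.2 on a random instance:
#   d* = -beta * Y^2 (c - A'p) / ||Y (c - A'p)||,  p = (A Y^2 A')^{-1} A Y^2 c.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))        # full row rank with probability 1
c = rng.standard_normal(6)             # almost surely not in the row space of A
y = rng.uniform(0.5, 2.0, size=6)      # interior point, y > 0
beta = 0.9

Y2 = np.diag(y ** 2)
p = np.linalg.solve(A @ Y2 @ A.T, A @ Y2 @ c)   # dual estimates
u = c - A.T @ p                                  # reduced-cost vector
d = -beta * (Y2 @ u) / np.linalg.norm(y * u)     # optimal direction d*

print(np.linalg.norm(A @ d))     # ~0: A d* = 0, so y + d* stays feasible
print(np.linalg.norm(d / y))     # = beta: step ends on the ellipsoid boundary
print(c @ d)                     # equals -beta * ||Y (c - A'p)||
```

Note that $AY^2u = AY^2c - AY^2A'p = 0$ by construction of $p$, which is exactly why the printed residual $\|Ad^*\|$ is zero up to floating point.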
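The short-step method of Section 5.2 can be sketched directly in NumPy. This is a minimal illustration, not a robust solver: the $\epsilon$-relaxation of the test $r^k \ge 0$ is an implementation choice to cope with floating point, and the small LP at the end ($\min x_1$ s.t. $x_1 + x_2 = 2$, $x \ge 0$, optimum $(0,2)$) is a made-up example:

```python
import numpy as np

def affine_scaling(A, b, c, x0, beta=0.5, eps=1e-6, max_iter=10_000):
    """Short-step affine scaling for min c'x s.t. Ax = b, x >= 0.

    Follows Steps 1-5 of Section 5.2; x0 must satisfy A x0 = b, x0 > 0.
    """
    x = np.asarray(x0, dtype=float)
    p = None
    for _ in range(max_iter):
        X2 = np.diag(x ** 2)                            # X_k^2
        p = np.linalg.solve(A @ X2 @ A.T, A @ X2 @ c)   # dual estimates p^k
        r = c - A.T @ p                                 # reduced costs r^k
        # Optimality check: e' X_k r^k = sum_i x_i r_i is the duality
        # gap c'x - b'p; the test r >= 0 is relaxed by eps for floating point.
        if np.all(r >= -eps) and x @ r < eps:
            return x, p, "optimal"
        # Unboundedness check:
        if np.all(-X2 @ r >= 0):
            return x, p, "unbounded"
        # Update of primal solution (beta < 1 keeps x > 0):
        x = x - beta * (X2 @ r) / np.linalg.norm(x * r)
    return x, p, "iteration limit"

# Hypothetical example: min x1  s.t.  x1 + x2 = 2, x >= 0; optimum is (0, 2).
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
c = np.array([1.0, 0.0])
x, p, status = affine_scaling(A, b, c, np.array([1.0, 1.0]))
```

Each update preserves $Ax = b$ because $AX_k^2r^k = 0$ by construction of $p^k$, and the iterates stay strictly positive by the theorem of Section 4.1.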