ℓ1-norm Methods for Convex-Cardinality Problems, Part II

• total variation
• iterated weighted ℓ1 heuristic
• matrix rank constraints

Prof. S. Boyd, EE364b, Stanford University


Total variation reconstruction

• fit corrupted signal xcor with piecewise constant x̂, with no more than k jumps
• convex-cardinality problem:

    minimize   ‖x̂ − xcor‖2
    subject to card(Dx̂) ≤ k

  (D is the first-order difference matrix)
• heuristic: minimize ‖x̂ − xcor‖2 + γ‖Dx̂‖1; vary γ to adjust the number of jumps
• ‖Dx̂‖1 is the total variation of the signal x̂
• the method is called total variation reconstruction
• unlike ℓ2-based reconstruction, TV reconstruction filters out high-frequency noise while preserving sharp jumps


Example (§6.3.3 in BV book)

signal x ∈ R^2000 and corrupted signal xcor ∈ R^2000

[figure: plots of xcor and x versus i]


Total variation reconstruction for three values of γ

[figure: three reconstructions x̂ versus i]


ℓ2 reconstruction for three values of γ

[figure: three reconstructions x̂ versus i]


Example: 2D total variation reconstruction

• x ∈ R^n gives the values of pixels on an N × N grid (N = 31, so n = 961)
• assumption: x has relatively few big changes in value (i.e., boundaries)
• we have m = 120 linear measurements, y = Fx (F_ij ∼ N(0, 1))
• as a convex-cardinality problem:

    minimize   Σ_{i,j} ( card(x_{i,j} − x_{i+1,j}) + card(x_{i,j} − x_{i,j+1}) )
    subject to y = Fx

• ℓ1 heuristic (objective is a 2D version of total variation):

    minimize   Σ_{i,j} ( |x_{i,j} − x_{i+1,j}| + |x_{i,j} − x_{i,j+1}| )
    subject to y = Fx


TV reconstruction

[figure: surface plots of original and TV reconstruction]

. . . not bad for 8× more variables than measurements!
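The 1D heuristic above can be prototyped in a few lines. The sketch below (not from the slides) approximately solves a squared-fidelity variant, minimize ‖x̂ − xcor‖2² + γ‖Dx̂‖1, by iteratively reweighted least squares: each pass replaces |（Dx)_t| by (Dx)_t² / (|(Dx^(k))_t| + δ) and solves the resulting linear system. The signal, γ, and δ below are illustrative choices, not values from the lecture.

```python
import numpy as np

def tv_denoise_irls(xcor, gamma, delta=1e-3, iters=30):
    """Approximate TV reconstruction via iteratively reweighted least squares.

    Approximately minimizes ||x - xcor||_2^2 + gamma * ||D x||_1, where D is
    the first-order difference matrix (a variant of the heuristic in the slides).
    """
    n = len(xcor)
    # first-order difference matrix: (D x)_t = x_{t+1} - x_t
    D = np.diff(np.eye(n), axis=0)
    x = xcor.copy()
    for _ in range(iters):
        # reweight: |Dx|_1 is majorized by sum_t (Dx)_t^2 / (|Dx^(k)|_t + delta)
        w = 1.0 / (np.abs(D @ x) + delta)
        # normal equations: (I + gamma * D^T W D) x = xcor
        x = np.linalg.solve(np.eye(n) + gamma * D.T @ (w[:, None] * D), xcor)
    return x

# illustrative piecewise-constant signal plus noise
rng = np.random.default_rng(0)
x0 = np.concatenate([np.zeros(100), np.ones(100), -0.5 * np.ones(100)])
xcor = x0 + 0.2 * rng.standard_normal(x0.size)
xhat = tv_denoise_irls(xcor, gamma=1.0)
```

As on the slides, increasing gamma reduces the number of jumps in the reconstruction; the IRLS loop here is just one simple way to solve the convex subproblem.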
ℓ2 reconstruction

[figure: surface plots of original and ℓ2 reconstruction]

. . . this is what you'd expect with 8× more variables than measurements


Iterated weighted ℓ1 heuristic

• to minimize card(x) over x ∈ C:

    w := 1
    repeat
        minimize ‖diag(w)x‖1 over x ∈ C
        w_i := 1/(ǫ + |x_i|)

• the first iteration is the basic ℓ1 heuristic
• reweighting increases the relative weight on small x_i
• typically converges in 5 or fewer steps
• often gives a modest improvement (i.e., reduction in card(x)) over the basic ℓ1 heuristic


Interpretation

• wlog we can take x ⪰ 0 (by writing x = x+ − x−, x+, x− ⪰ 0, and replacing card(x) with card(x+) + card(x−))
• we use the approximation card(z) ≈ log(1 + z/ǫ), where ǫ > 0, z ∈ R+
• with this approximation, we get the (nonconvex) problem

    minimize   Σ_{i=1}^n log(1 + x_i/ǫ)
    subject to x ∈ C, x ⪰ 0

• we find a local solution by linearizing the objective at the current point x^(k),

    Σ_{i=1}^n log(1 + x_i/ǫ) ≈ Σ_{i=1}^n log(1 + x_i^(k)/ǫ) + Σ_{i=1}^n (x_i − x_i^(k)) / (ǫ + x_i^(k))

and solving the resulting convex problem

    minimize   Σ_{i=1}^n w_i x_i
    subject to x ∈ C, x ⪰ 0

with w_i = 1/(ǫ + x_i^(k)), to get the next iterate
• repeat until convergence to get a local solution


Sparse solution of linear inequalities

• minimize card(x) over polyhedron {x | Ax ⪯ b}, A ∈ R^{100×50}
• the ℓ1 heuristic finds x ∈ R^50 with card(x) = 44
• the iterated weighted ℓ1 heuristic finds x with card(x) = 36
  (the global solution, via branch & bound, is card(x) = 32)

[figure: card(x) versus iteration, for ℓ1 and iterated ℓ1]
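The iterated weighted ℓ1 heuristic above can be sketched with any LP solver, via the standard epigraph trick ‖diag(w)x‖1 = min Σ w_i t_i subject to −t_i ≤ x_i ≤ t_i. The version below uses scipy.optimize.linprog; the problem sizes and random data are illustrative, not the 100×50 instance from the slides.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_l1_min(A, b, w):
    """Minimize sum_i w_i |x_i| subject to A x <= b, as an LP in (x, t)."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), w])          # objective: w^T t
    I = np.eye(n)
    # stacked constraints: A x <= b;  x - t <= 0;  -x - t <= 0
    A_ub = np.block([[A, np.zeros((m, n))],
                     [I, -I],
                     [-I, -I]])
    b_ub = np.concatenate([b, np.zeros(2 * n)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

def iterated_weighted_l1(A, b, eps=1e-3, iters=5):
    n = A.shape[1]
    w = np.ones(n)                       # first pass = plain l1 heuristic
    for _ in range(iters):
        x = weighted_l1_min(A, b, w)
        w = 1.0 / (eps + np.abs(x))      # upweight the small entries
    return x

# illustrative polyhedron, constructed to be nonempty
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 10))
b = A @ rng.standard_normal(10)          # a feasible point exists by construction
x = iterated_weighted_l1(A, b)
card = int(np.sum(np.abs(x) > 1e-6))
```

As on the slides, the first pass is the plain ℓ1 heuristic; the subsequent reweighted passes often drive more entries of x exactly to zero.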
Detecting changes in time series model

• AR(2) scalar time-series model

    y(t + 2) = a(t)y(t + 1) + b(t)y(t) + v(t),   v(t) IID N(0, 0.5²)

• assumption: a(t) and b(t) are piecewise constant and change infrequently
• given y(t), t = 1, . . . , T, estimate a(t), b(t), t = 1, . . . , T − 2
• heuristic: minimize over the variables a(t), b(t), t = 1, . . . , T − 2,

    Σ_{t=1}^{T−2} (y(t + 2) − a(t)y(t + 1) − b(t)y(t))²
      + γ Σ_{t=1}^{T−3} (|a(t + 1) − a(t)| + |b(t + 1) − b(t)|)

• vary γ to trade off fit versus the number of changes in a, b


Time series and true coefficients

[figure: left, y(t) versus t; right, true coefficients a(t) and b(t) versus t]


TV heuristic and iterated TV heuristic

left: TV with γ = 10; right: iterated TV, 5 iterations, ǫ = 0.005

[figure: estimated a(t) and b(t) versus t, for each method]


Extension to matrices

• Rank is the natural analog of card for matrices
• convex-rank problem: convex, except for Rank appearing in the objective or constraints
• the rank problem reduces to the card problem when the matrices are diagonal: Rank(diag(x)) = card(x)
• analog of the ℓ1 heuristic: use the nuclear norm, ‖X‖∗ = Σ_i σ_i(X)
  (the sum of the singular values; the dual of the spectral norm)
• for X ⪰ 0, ‖X‖∗ reduces to Tr X (just as, for x ⪰ 0, ‖x‖1 reduces to 1ᵀx)


Factor modeling

• given a matrix Σ ∈ S^n_+, find an approximation of the form Σ̂ = FFᵀ + D, where F ∈ R^{n×r} and D is diagonal and nonnegative
• gives the underlying factor model (with r factors)

    x = Fz + v,   v ∼ N(0, D),   z ∼ N(0, I)

• model with the fewest factors:

    minimize   Rank X
    subject to X ⪰ 0, D ⪰ 0 diagonal
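The two facts relating the nuclear norm to card and trace can be checked numerically. A small sketch with numpy (the matrix sizes and data below are illustrative):

```python
import numpy as np

def nuclear_norm(X):
    """Nuclear norm ||X||_* : the sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()

# Rank(diag(x)) = card(x): the rank of a diagonal matrix counts nonzero entries
x = np.array([1.5, 0.0, -2.0, 0.0, 0.3])
rank_diag = np.linalg.matrix_rank(np.diag(x))
card_x = int(np.sum(x != 0))

# for X PSD, the nuclear norm reduces to Tr X
# (singular values of a PSD matrix are its eigenvalues, which sum to the trace)
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
X_psd = B.T @ B          # PSD by construction
```

This is why, in the PSD case, the nuclear-norm heuristic becomes the linear (hence convex) objective Tr X.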