Subgradients

• subgradients
• strong and weak subgradient calculus
• optimality conditions via subgradients
• directional derivatives

Prof. S. Boyd, EE364b, Stanford University


Basic inequality

recall the basic inequality for convex differentiable f:

    f(y) ≥ f(x) + ∇f(x)ᵀ(y − x)

• the first-order approximation of f at x is a global underestimator
• (∇f(x), −1) supports epi f at (x, f(x))

what if f is not differentiable?


Subgradient of a function

g is a subgradient of f (not necessarily convex) at x if

    f(y) ≥ f(x) + gᵀ(y − x) for all y

[figure: f with affine underestimators f(x1) + g1ᵀ(x − x1), f(x2) + g2ᵀ(x − x2), f(x2) + g3ᵀ(x − x2); g2, g3 are subgradients at x2; g1 is a subgradient at x1]

• g is a subgradient of f at x iff (g, −1) supports epi f at (x, f(x))
• g is a subgradient iff f(x) + gᵀ(y − x) is a global (affine) underestimator of f
• if f is convex and differentiable, ∇f(x) is a subgradient of f at x

subgradients come up in several contexts:

• algorithms for nondifferentiable convex optimization
• convex analysis, e.g., optimality conditions and duality for nondifferentiable problems

(if f(y) ≤ f(x) + gᵀ(y − x) for all y, then g is a supergradient)


Example

f = max{f1, f2}, with f1, f2 convex and differentiable

[figure: f1, f2, and f = max{f1, f2}, which is nondifferentiable at the crossing point x0]

• f1(x0) > f2(x0): unique subgradient g = ∇f1(x0)
• f2(x0) > f1(x0): unique subgradient g = ∇f2(x0)
• f1(x0) = f2(x0): subgradients form a line segment [∇f1(x0), ∇f2(x0)]


Subdifferential

• the set of all subgradients of f at x is called the subdifferential of f at x, denoted ∂f(x)
• ∂f(x) is a closed convex set (can be empty)

if f is convex,

• ∂f(x) is nonempty, for x ∈ relint dom f
• ∂f(x) = {∇f(x)}, if f is differentiable at x
• if ∂f(x) = {g}, then f is differentiable at x and g = ∇f(x)


Example

f(x) = |x|:

    ∂f(x) = {−1} for x < 0,   [−1, 1] for x = 0,   {1} for x > 0

[figure: lefthand plot shows f(x) = |x|; righthand plot shows the set {(x, g) | x ∈ R, g ∈ ∂f(x)}]


Subgradient calculus

• weak subgradient calculus: formulas for finding one subgradient g ∈ ∂f(x)
• strong subgradient calculus: formulas for finding the whole subdifferential ∂f(x), i.e., all subgradients of f at x
• many algorithms for nondifferentiable convex optimization require only one subgradient at each step, so weak calculus suffices
• some algorithms, optimality conditions, etc., need the whole subdifferential
• roughly speaking: if you can compute f(x), you can usually compute a g ∈ ∂f(x)
• we'll assume that f is convex, and x ∈ relint dom f


Some basic rules

• ∂f(x) = {∇f(x)} if f is differentiable at x
• scaling: ∂(αf) = α∂f (if α > 0)
• addition: ∂(f1 + f2) = ∂f1 + ∂f2 (RHS is addition of point-to-set mappings)
• affine transformation of variables: if g(x) = f(Ax + b), then ∂g(x) = Aᵀ∂f(Ax + b)
• finite pointwise maximum: if f = max{f1, . . . , fm}, then

      ∂f(x) = Co ∪ {∂fi(x) | fi(x) = f(x)},

  i.e., the convex hull of the union of the subdifferentials of the 'active' functions at x

for f(x) = max{f1(x), . . . , fm(x)}, with f1, . . . , fm differentiable,

    ∂f(x) = Co{∇fi(x) | fi(x) = f(x)}

example: f(x) = ‖x‖₁ = max{sᵀx | sᵢ ∈ {−1, 1}} (a numerical sketch follows below)

[figure: ∂f(x) is the square [−1, 1] × [−1, 1] at x = (0, 0); the segment {1} × [−1, 1] at x = (1, 0); the single point {(1, 1)} at x = (1, 1)]
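To make the weak rule for pointwise maxima concrete, here is a minimal NumPy sketch (our own illustration, not part of the slides): max_subgradient implements the generic rule — evaluate the fi, pick any active index, return that function's gradient — and l1_subgradient specializes it to f(x) = ‖x‖₁, where np.sign gives a valid componentwise choice (0 ∈ [−1, 1] at zeros).

    import numpy as np

    def max_subgradient(fs, grads, x):
        """Weak rule for f = max_i f_i (f_i convex, differentiable):
        the gradient of any active f_i is a subgradient of f at x."""
        vals = np.array([f(x) for f in fs])
        i = int(np.argmax(vals))        # index of one active function
        return grads[i](x)

    def l1_subgradient(x):
        """One subgradient of f(x) = ||x||_1: sign(x_i) where x_i != 0,
        and 0 (a valid choice from [-1, 1]) where x_i == 0."""
        return np.sign(x)

    print(l1_subgradient(np.array([1.0, 0.0])))   # [1. 0.] lies in {1} x [-1, 1]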
Pointwise supremum

if f = sup_{α∈A} fα, then

    cl Co ∪ {∂fβ(x) | fβ(x) = f(x)} ⊆ ∂f(x)

(we usually get equality, but this requires some technical conditions to hold, e.g., A compact and fα continuous in x and α)

roughly speaking, ∂f(x) is the closure of the convex hull of the union of the subdifferentials of the active functions


Weak rule for pointwise supremum

f = sup_{α∈A} fα

• find any β for which fβ(x) = f(x) (assuming the supremum is achieved)
• choose any g ∈ ∂fβ(x)
• then, g ∈ ∂f(x)

example:

    f(x) = λmax(A(x)) = sup_{‖y‖₂=1} yᵀA(x)y

where A(x) = A0 + x1A1 + · · · + xnAn, with Ai ∈ Sᵏ

• f is the pointwise supremum of gy(x) = yᵀA(x)y over ‖y‖₂ = 1
• gy is affine in x, with ∇gy(x) = (yᵀA1y, . . . , yᵀAny)
• hence, ∂f(x) ⊇ Co {∇gy | A(x)y = λmax(A(x))y, ‖y‖₂ = 1}
  (in fact equality holds here)

to find one subgradient at x, we can choose any unit eigenvector y associated with λmax(A(x)); then

    (yᵀA1y, . . . , yᵀAny) ∈ ∂f(x)
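A minimal NumPy sketch of this recipe (our own, not part of the slides): form A(x), take a unit eigenvector for the largest eigenvalue via np.linalg.eigh, and read off the subgradient components yᵀAᵢy.

    import numpy as np

    def lambda_max_subgradient(A0, As, x):
        """f(x) = lambda_max(A0 + sum_i x_i A_i); returns f(x) and one
        subgradient, via a unit eigenvector for the largest eigenvalue."""
        A = A0 + sum(xi * Ai for xi, Ai in zip(x, As))
        w, V = np.linalg.eigh(A)              # eigenvalues in ascending order
        y = V[:, -1]                          # unit eigenvector for lambda_max
        g = np.array([y @ Ai @ y for Ai in As])
        return w[-1], g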
Expectation

• f(x) = E f(x, ω), with f convex in x for each ω, and ω a random variable
• for each ω, choose any gω ∈ ∂f(x, ω) (so ω ↦ gω is a function)
• then, g = E gω ∈ ∂f(x)

Monte Carlo method for (approximately) computing f(x) and a g ∈ ∂f(x):

• generate independent samples ω1, . . . , ωK from the distribution of ω
• f(x) ≈ (1/K) Σ_{i=1}^K f(x, ωᵢ)
• for each i choose gᵢ ∈ ∂x f(x, ωᵢ)
• g = (1/K) Σ_{i=1}^K gᵢ is an (approximate) subgradient

(more on this later)
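As a concrete instance of the Monte Carlo recipe (our own illustration; the choice f(x, ω) = ‖x − ω‖₁ is hypothetical, not from the slides), note that ‖x − ω‖₁ is convex in x with sign(x − ω) ∈ ∂x f(x, ω):

    import numpy as np

    def mc_value_and_subgradient(x, omegas):
        """Monte Carlo estimates of f(x) = E ||x - omega||_1 and of a
        subgradient, by averaging per-sample values and subgradients."""
        fvals = [np.abs(x - w).sum() for w in omegas]   # f(x, omega_i)
        gvals = [np.sign(x - w) for w in omegas]        # g_i in d_x f(x, omega_i)
        return np.mean(fvals), np.mean(gvals, axis=0)

    rng = np.random.default_rng(0)
    omegas = rng.standard_normal((1000, 3))             # K = 1000 samples in R^3
    fhat, ghat = mc_value_and_subgradient(np.zeros(3), omegas)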
Minimization

define g(y) as the optimal value of

    minimize    f0(x)
    subject to  fi(x) ≤ yᵢ,  i = 1, . . . , m

(fi convex; variable x)

with λ⋆ an optimal dual variable, we have

    g(z) ≥ g(y) − Σ_{i=1}^m λ⋆ᵢ(zᵢ − yᵢ)

i.e., −λ⋆ is a subgradient of g at y


Composition

• f(x) = h(f1(x), . . . , fk(x)), with h convex nondecreasing and fi convex
• find q ∈ ∂h(f1(x), . . . , fk(x)) and gᵢ ∈ ∂fi(x)
• then, g = q1g1 + · · · + qkgk ∈ ∂f(x) (a numerical sketch appears at the end of this section)
• reduces to the standard chain rule for differentiable h, fi

proof:

    f(y) = h(f1(y), . . . , fk(y))
         ≥ h(f1(x) + g1ᵀ(y − x), . . . , fk(x) + gkᵀ(y − x))
         ≥ h(f1(x), . . . , fk(x)) + qᵀ(g1ᵀ(y − x), . . . , gkᵀ(y − x))
         = f(x) + gᵀ(y − x)

(the first inequality uses gᵢ ∈ ∂fi(x) and h nondecreasing; the second uses q ∈ ∂h)


Subgradients and sublevel sets

g is a subgradient at x means f(y) ≥ f(x) + gᵀ(y − x)

hence f(y) ≤ f(x) =⇒ gᵀ(y − x) ≤ 0

[figure: the sublevel set {x | f(x) ≤ f(x0)}, with g ∈ ∂f(x0) defining a supporting hyperplane at x0 and ∇f(x1) normal to the boundary at x1]

• f differentiable at x0: ∇f(x0) is normal to the sublevel set {x | f(x) ≤ f(x0)}
• f nondifferentiable at x0: a subgradient defines a supporting hyperplane to the sublevel set through x0


Quasigradients

g ≠ 0 is a quasigradient of f at x if

    gᵀ(y − x) ≥ 0 =⇒ f(y) ≥ f(x)

holds for all y

[figure: the sublevel set {y | f(y) ≤ f(x)} and a quasigradient g at x]

quasigradients at x form a cone
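Returning to the composition rule above, here is a minimal NumPy sketch (our own illustration, not part of the slides). The instance is hypothetical: f(x) = max(‖x‖₁, aᵀx), where h = max is convex and nondecreasing; at a point where ‖x‖₁ is the active argument, q = (1, 0) ∈ ∂h and sign(x) ∈ ∂‖x‖₁.

    import numpy as np

    def composition_subgradient(q, gs):
        """Weak composition rule: given q in dh(f1(x), ..., fk(x)) and
        g_i in df_i(x), return g = q_1 g_1 + ... + q_k g_k in df(x)."""
        return sum(qi * gi for qi, gi in zip(q, gs))

    # Hypothetical instance: f(x) = max(||x||_1, a^T x).
    a = np.array([0.1, 0.2])
    x = np.array([1.0, -2.0])          # here ||x||_1 = 3 > a^T x, so f1 is active
    q = np.array([1.0, 0.0])           # q in dh for h = max, first argument active
    g = composition_subgradient(q, [np.sign(x), a])
    print(g)                           # [ 1. -1.] is a subgradient of f at x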