Stanford EE 364B - Stochastic Model Predictive Control

Stochastic Model Predictive Control

• stochastic finite horizon control
• stochastic dynamic programming
• certainty equivalent model predictive control

Prof. S. Boyd, EE364b, Stanford University


Causal state-feedback control

• linear dynamical system, over a finite time horizon:

      x_{t+1} = A x_t + B u_t + w_t,   t = 0, ..., T-1

  – x_t ∈ R^n is the state, u_t ∈ R^m is the input at time t
  – w_t is the process noise (or exogenous input) at time t
• X_t = (x_0, ..., x_t) is the state history up to time t
• causal state-feedback control:

      u_t = φ_t(X_t) = ψ_t(x_0, w_0, ..., w_{t-1}),   t = 0, ..., T-1

• φ_t : R^{(t+1)n} → R^m is called the control policy at time t


Stochastic finite horizon control

• (x_0, w_0, ..., w_{T-1}) is a random variable
• objective: J = E[ sum_{t=0}^{T-1} ℓ_t(x_t, u_t) + ℓ_T(x_T) ]
  – convex stage cost functions ℓ_t : R^n × R^m → R, t = 0, ..., T-1
  – convex terminal cost function ℓ_T : R^n → R
• J depends on the control policies φ_0, ..., φ_{T-1}
• constraints: u_t ∈ U_t, t = 0, ..., T-1
  – convex input constraint sets U_0, ..., U_{T-1}
• stochastic control problem: choose control policies φ_0, ..., φ_{T-1} to
  minimize J, subject to the constraints


Stochastic finite horizon control

• an infinite dimensional problem: the variables are the functions φ_0, ..., φ_{T-1}
  – can restrict the policies to a finite dimensional subspace, e.g., φ_t all affine
• key idea: we have recourse (a.k.a. feedback, closed-loop control)
  – we can change u_t based on the observed state history x_0, ..., x_t
  – cf. the standard ('open loop') optimal control problem, where we commit
    to u_0, ..., u_{T-1} ahead of time
• in the general case, we need to evaluate J (for given control policies) via
  Monte Carlo simulation


'Solution' via dynamic programming

• let V_t(X_t) be the optimal value of the objective, from t on, starting from
  the state history X_t
• V_T(X_T) = ℓ_T(x_T);   J* = E V_0(x_0)
• V_t can be found by backward recursion: for t = T-1, ..., 0,

      V_t(X_t) = inf_{v ∈ U_t} { ℓ_t(x_t, v) + E( V_{t+1}((X_t, A x_t + B v + w_t)) | X_t ) }

• V_t, t = 0, ..., T, are convex functions
• the optimal policy is causal state feedback:

      φ*_t(X_t) = argmin_{v ∈ U_t} { ℓ_t(x_t, v) + E( V_{t+1}((X_t, A x_t + B v + w_t)) | X_t ) }


Independent process noise

• assume x_0, w_0, ..., w_{T-1} are independent
• then V_t depends only on the current state x_t (and not the state history X_t)
• Bellman equations: V_T(x_T) = ℓ_T(x_T); for t = T-1, ..., 0,

      V_t(x_t) = inf_{v ∈ U_t} { ℓ_t(x_t, v) + E V_{t+1}(A x_t + B v + w_t) }

• the optimal policy is a function of the current state x_t:

      φ*_t(x_t) = argmin_{v ∈ U_t} { ℓ_t(x_t, v) + E V_{t+1}(A x_t + B v + w_t) }


Linear quadratic stochastic control

• special case of linear stochastic control
• U_t = R^m
• x_0, w_0, ..., w_{T-1} are independent, with

      E x_0 = 0,   E w_t = 0,   E x_0 x_0^T = Σ,   E w_t w_t^T = W_t

• ℓ_t(x_t, u_t) = x_t^T Q_t x_t + u_t^T R_t u_t, with Q_t ⪰ 0, R_t ≻ 0
• ℓ_T(x_T) = x_T^T Q_T x_T, with Q_T ⪰ 0

• can show the value functions are quadratic, i.e.,

      V_t(x_t) = x_t^T P_t x_t + q_t,   t = 0, ..., T

• Bellman recursion: P_T = Q_T, q_T = 0; for t = T-1, ..., 0,

      V_t(z) = inf_v { z^T Q_t z + v^T R_t v
                       + E( (A z + B v + w_t)^T P_{t+1} (A z + B v + w_t) + q_{t+1} ) }

• works out to

      P_t = A^T P_{t+1} A − A^T P_{t+1} B (B^T P_{t+1} B + R_t)^{-1} B^T P_{t+1} A + Q_t
      q_t = q_{t+1} + Tr(W_t P_{t+1})

• the optimal policy is linear state feedback: φ*_t(x_t) = K_t x_t, with

      K_t = −(B^T P_{t+1} B + R_t)^{-1} B^T P_{t+1} A

  (which, strangely, does not depend on Σ, W_0, ..., W_{T-1})
• optimal cost:

      J* = E V_0(x_0) = Tr(Σ P_0) + q_0 = Tr(Σ P_0) + sum_{t=0}^{T-1} Tr(W_t P_{t+1})
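To make the backward recursion above concrete, here is a minimal NumPy sketch (not from the slides) that computes P_t, q_t, the gains K_t, and the optimal cost J* = Tr(Σ P_0) + sum_t Tr(W_t P_{t+1}); the function name lq_stochastic_control and the argument layout are assumptions for illustration.

```python
import numpy as np

def lq_stochastic_control(A, B, Q, R, QT, W, Sigma, T):
    """Backward Riccati recursion for LQ stochastic control (illustrative sketch).

    Q, R, W are length-T lists of the stage cost and noise covariance
    matrices Q_t, R_t, W_t; QT is the terminal cost matrix; Sigma = E x_0 x_0^T.
    Returns the feedback gains K_t and the optimal cost
    J* = Tr(Sigma P_0) + sum_t Tr(W_t P_{t+1}).
    """
    P, q = QT, 0.0                                       # P_T = Q_T, q_T = 0
    K = [None] * T
    for t in range(T - 1, -1, -1):
        S = B.T @ P @ B + R[t]
        K[t] = -np.linalg.solve(S, B.T @ P @ A)          # K_t = -(B'PB + R_t)^{-1} B'PA
        q += np.trace(W[t] @ P)                          # q_t = q_{t+1} + Tr(W_t P_{t+1})
        P = A.T @ P @ A + Q[t] + A.T @ P @ B @ K[t]      # Riccati update giving P_t
    J_star = np.trace(Sigma @ P) + q                     # Tr(Sigma P_0) + q_0
    return K, J_star
```

Applying u_t = K_t x_t gives the optimal policy for the unconstrained (LQ) problem; the slides use exactly this relaxation below to compute the lower bound J_relax.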
Certainty equivalent model predictive control

• at every time t we solve the certainty equivalent problem

      minimize     sum_{τ=t}^{T-1} ℓ_τ(x_τ, u_τ) + ℓ_T(x_T)
      subject to   u_τ ∈ U_τ,   τ = t, ..., T-1
                   x_{τ+1} = A x_τ + B u_τ + ŵ_{τ|t},   τ = t, ..., T-1

  with variables x_{t+1}, ..., x_T, u_t, ..., u_{T-1} and data x_t, ŵ_{t|t}, ..., ŵ_{T-1|t}
• ŵ_{t|t}, ..., ŵ_{T-1|t} are predicted values of w_t, ..., w_{T-1} based on X_t
  (e.g., conditional expectations)
• call the solution x̃_{t+1}, ..., x̃_T, ũ_t, ..., ũ_{T-1}
• we take φ^mpc(X_t) = ũ_t
  – φ^mpc is a function of X_t, since ŵ_{t|t}, ..., ŵ_{T-1|t} are functions of X_t


Certainty equivalent model predictive control

• widely used, e.g., in 'revenue management'
• based on (bad) approximations:
  – future values of the disturbance are exactly as predicted; there is no
    future uncertainty
  – in the future, no recourse is available
• yet, it often works very well


Example

• system with n = 3 states, m = 2 inputs; horizon T = 50
• A, B chosen randomly
• quadratic stage cost: ℓ_t(x, u) = ||x||_2^2 + ||u||_2^2
• quadratic final cost: ℓ_T(x) = ||x||_2^2
• constraint set: U = { u | ||u||_∞ ≤ 0.5 }
• x_0, w_0, ..., w_{T-1} iid N(0, 0.25 I)


Stochastic MPC: sample trajectory

[Figure: sample traces of x_1(t) and u_1(t) versus t, for t = 0, ..., 50.]


Cost histogram

[Figure: histograms of the realized cost under MPC (J_mpc) and under saturated
linear quadratic control (J_sat), each compared with the lower bound J_relax.]


Simple lower bound for quadratic stochastic control

• x_0, w_0, ..., w_{T-1} independent
• quadratic stage and final cost
• relaxation:
  – ignore the input constraints U_t; this yields a linear quadratic stochastic
    control problem
  – solve the relaxed problem exactly; its optimal cost is J_relax
• J* ≥ J_relax
• for our numerical example,
  – J_mpc = 224.7 (via Monte Carlo)
  – J_sat = 271.5 (linear quadratic stochastic control with saturation)
  – J_relax = 141.3
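As a rough illustration of the certainty equivalent problem solved at each step, here is a CVXPY sketch using the quadratic costs and box input constraint of the numerical example above; the function name ce_mpc_step and the way the predictions w_hat are passed in are assumptions, not part of the slides.

```python
import cvxpy as cp
import numpy as np

def ce_mpc_step(A, B, x_t, w_hat, u_max=0.5):
    """One certainty-equivalent MPC step (illustrative sketch).

    Solves the deterministic problem over the remaining horizon, using the
    predicted disturbances w_hat (array of shape (T_rem, n); e.g. zeros for
    zero-mean noise), with stage cost ||x||_2^2 + ||u||_2^2, terminal cost
    ||x_T||_2^2, and the box constraint ||u||_inf <= u_max from the example.
    Returns only the first input, which is applied to the real system.
    """
    T_rem, n = w_hat.shape
    m = B.shape[1]
    x = cp.Variable((T_rem + 1, n))
    u = cp.Variable((T_rem, m))
    cost = cp.sum_squares(x[T_rem])                  # terminal cost
    constr = [x[0] == x_t]
    for tau in range(T_rem):
        cost += cp.sum_squares(x[tau]) + cp.sum_squares(u[tau])
        constr += [x[tau + 1] == A @ x[tau] + B @ u[tau] + w_hat[tau],
                   cp.norm(u[tau], "inf") <= u_max]
    cp.Problem(cp.Minimize(cost), constr).solve()
    return u.value[0]                                # phi_mpc(X_t) = u_tilde_t
```

In closed loop one would, at each t, measure x_t, form ŵ_{t|t}, ..., ŵ_{T-1|t} (e.g., their conditional means, which are zero here), call this routine, apply the returned input, and repeat; averaging the realized cost over many simulated noise sequences gives the Monte Carlo estimate J_mpc quoted above.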

