Stanford EE 363 - Linear Quadratic Stochastic Control

EE363, Winter 2008-09
Lecture 5: Linear Quadratic Stochastic Control

• linear-quadratic stochastic control problem
• solution via dynamic programming

Linear stochastic system

• linear dynamical system, over a finite time horizon:

    x_{t+1} = A x_t + B u_t + w_t,   t = 0, ..., N-1

• w_t is the process noise or disturbance at time t
• the w_t are IID with E w_t = 0, E w_t w_t^T = W
• x_0 is independent of w_t, with E x_0 = 0, E x_0 x_0^T = X

Control policies

• state-feedback control: u_t = φ_t(x_t), t = 0, ..., N-1
• φ_t : R^n -> R^m is called the control policy at time t
• roughly speaking: we choose the input after knowing the current state, but before knowing the disturbance
• the closed-loop system is

    x_{t+1} = A x_t + B φ_t(x_t) + w_t,   t = 0, ..., N-1

• x_0, ..., x_N and u_0, ..., u_{N-1} are random

Stochastic control problem

• objective:

    J = E( Σ_{t=0}^{N-1} (x_t^T Q x_t + u_t^T R u_t) + x_N^T Q_f x_N )

  with Q, Q_f ≥ 0 and R > 0
• J depends (in a complex way) on the control policies φ_0, ..., φ_{N-1}
• linear-quadratic stochastic control problem: choose the control policies φ_0, ..., φ_{N-1} to minimize J
  ('linear' refers to the state dynamics; 'quadratic' to the objective)
• this is an infinite-dimensional problem: the variables are the functions φ_0, ..., φ_{N-1}

Solution via dynamic programming

• let V_t(z) be the optimal value of the objective, from t on, starting at x_t = z:

    V_t(z) = min over φ_t, ..., φ_{N-1} of  E( Σ_{τ=t}^{N-1} (x_τ^T Q x_τ + u_τ^T R u_τ) + x_N^T Q_f x_N )
    subject to x_{τ+1} = A x_τ + B u_τ + w_τ,  u_τ = φ_τ(x_τ)

• we have
  – V_N(z) = z^T Q_f z
  – J* = E V_0(x_0) (expectation over x_0)
• V_t can be found by backward recursion: for t = N-1, ..., 0,

    V_t(z) = z^T Q z + min_v ( v^T R v + E V_{t+1}(A z + B v + w_t) )

  – the expectation is over w_t
  – we do not know where we will land when we take u_t = v
• the optimal policies have the form

    φ*_t(x_t) = argmin_v ( v^T R v + E V_{t+1}(A x_t + B v + w_t) )

Explicit form

• let's show (via the recursion) that the value functions are quadratic, of the form

    V_t(x_t) = x_t^T P_t x_t + q_t,   t = 0, ..., N,   with P_t ≥ 0

• P_N = Q_f, q_N = 0
• now assume that V_{t+1}(z) = z^T P_{t+1} z + q_{t+1}
• the Bellman recursion is

    V_t(z) = z^T Q z + min_v { v^T R v + E( (A z + B v + w_t)^T P_{t+1} (A z + B v + w_t) + q_{t+1} ) }
           = z^T Q z + Tr(W P_{t+1}) + q_{t+1} + min_v { v^T R v + (A z + B v)^T P_{t+1} (A z + B v) }

  – we use E( w_t^T P_{t+1} w_t ) = Tr(W P_{t+1})
  – this is the same recursion as deterministic LQR, with an added constant
• the optimal policy is linear state feedback: φ*_t(x_t) = K_t x_t, with

    K_t = -(B^T P_{t+1} B + R)^{-1} B^T P_{t+1} A

  (the same form as in deterministic LQR)
• plugging in the optimal v gives V_t(z) = z^T P_t z + q_t, with

    P_t = A^T P_{t+1} A - A^T P_{t+1} B (B^T P_{t+1} B + R)^{-1} B^T P_{t+1} A + Q
    q_t = q_{t+1} + Tr(W P_{t+1})

  – the first recursion is the same as for deterministic LQR
  – the second is just a running sum
• we conclude that
  – P_t and K_t are the same as in deterministic LQR
  – strangely, the optimal policy is the same as in LQR, and is independent of X and W
• the optimal cost is

    J* = E V_0(x_0) = Tr(X P_0) + q_0 = Tr(X P_0) + Σ_{t=1}^{N} Tr(W P_t)

• interpretation:
  – x_0^T P_0 x_0 is the optimal cost of deterministic LQR, with w_0 = ... = w_{N-1} = 0
  – Tr(X P_0) is the average optimal LQR cost, with w_0 = ... = w_{N-1} = 0
  – Tr(W P_t) is the average optimal LQR cost from time t on, for E x_t = 0, E x_t x_t^T = W, w_t = ... = w_{N-1} = 0
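The backward recursion above translates directly into a few lines of code. The following is a minimal NumPy sketch, not part of the original notes; the function name lqs_backward and all variable names are illustrative. It computes K_t, P_t, q_t and the optimal cost J* = Tr(X P_0) + Σ_{t=1}^{N} Tr(W P_t).

    # Minimal sketch (not from the notes): backward Riccati recursion for the
    # finite-horizon LQ stochastic control problem described above.
    import numpy as np

    def lqs_backward(A, B, Q, R, Qf, W, X, N):
        """Return gains K_t, value matrices P_t, offsets q_t, and the optimal cost J*."""
        P = [None] * (N + 1)          # P_t, t = 0, ..., N
        q = [0.0] * (N + 1)           # q_t, t = 0, ..., N
        K = [None] * N                # K_t, t = 0, ..., N-1
        P[N] = Qf                     # V_N(z) = z^T Q_f z, q_N = 0
        for t in range(N - 1, -1, -1):
            Pn = P[t + 1]
            # K_t = -(B^T P_{t+1} B + R)^{-1} B^T P_{t+1} A
            K[t] = -np.linalg.solve(B.T @ Pn @ B + R, B.T @ Pn @ A)
            # Riccati recursion (same as deterministic LQR), written using K_t
            P[t] = Q + A.T @ Pn @ (A + B @ K[t])
            # running sum of the noise contributions
            q[t] = q[t + 1] + np.trace(W @ Pn)
        Jstar = np.trace(X @ P[0]) + q[0]   # = Tr(X P_0) + sum_{t=1}^N Tr(W P_t)
        return K, P, q, Jstar

Iterating the same P_t recursion until it stops changing also gives the steady-state matrix P_ss that appears in the infinite-horizon problem below.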
Infinite horizon

• choose the policies to minimize the average stage cost

    J = lim_{N -> ∞} (1/N) E Σ_{t=0}^{N-1} (x_t^T Q x_t + u_t^T R u_t)

• the optimal average stage cost is

    J* = Tr(W P_ss)

  where P_ss satisfies the ARE

    P_ss = Q + A^T P_ss A - A^T P_ss B (R + B^T P_ss B)^{-1} B^T P_ss A

  – the optimal average stage cost doesn't depend on X
• (an) optimal policy is the constant linear state feedback u_t = K_ss x_t, where

    K_ss = -(R + B^T P_ss B)^{-1} B^T P_ss A

  – K_ss is the steady-state LQR feedback gain
  – it doesn't depend on X or W

Example

• system with n = 5 states, m = 2 inputs, horizon N = 30
• A, B chosen randomly; A scaled so max_i |λ_i(A)| = 1
• Q = I, Q_f = 10I, R = I
• x_0 ~ N(0, X), X = 10I
• w_t ~ N(0, W), W = 0.5I

Sample trajectories

[figure: sample traces of (x_t)_1 and (u_t)_1 versus t; blue: optimal stochastic control, red: no control (u_0 = ... = u_{N-1} = 0)]

Cost histogram

[figure: histograms of the realized cost J over 1000 simulations, one panel each for the optimal (J*), prescient (J_pre), open-loop (J_ol), and no-control (J_nc) policies]

Comparisons

we compared optimal stochastic control (J* = 224.2) with:

• 'prescient' control
  – the input sequence is decided with full knowledge of the future disturbances
  – u_0, ..., u_{N-1} computed assuming all w_t are known
  – J_pre = 137.6
• 'open-loop' control
  – u_0, ..., u_{N-1} depend only on x_0
  – u_0, ..., u_{N-1} computed assuming w_0 = ... = w_{N-1} = 0
  – J_ol = 423.7
• no control
  – u_0 = ... = u_{N-1} = 0
  – J_nc = 442.0
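To make the example concrete, here is an illustrative Monte Carlo sketch, not the original course code; it reuses the hypothetical lqs_backward from the earlier sketch and compares only the optimal and no-control policies. Since A and B are drawn fresh here, the resulting costs will differ from the J* = 224.2 and J_nc = 442.0 reported above, but the ordering should be the same.

    # Illustrative Monte Carlo sketch for the example above (not the original
    # course code); assumes the hypothetical lqs_backward from the earlier sketch.
    import numpy as np

    rng = np.random.default_rng(0)
    n, m, N = 5, 2, 30
    A = rng.standard_normal((n, n))
    A /= np.max(np.abs(np.linalg.eigvals(A)))      # scale so max_i |lambda_i(A)| = 1
    B = rng.standard_normal((n, m))
    Q, Qf, R = np.eye(n), 10 * np.eye(n), np.eye(m)
    X, W = 10 * np.eye(n), 0.5 * np.eye(n)

    K, P, q, Jstar = lqs_backward(A, B, Q, R, Qf, W, X, N)

    def avg_cost(policy, nsim=1000):
        """Average realized cost over nsim closed-loop simulations."""
        total = 0.0
        for _ in range(nsim):
            x = rng.multivariate_normal(np.zeros(n), X)
            c = 0.0
            for t in range(N):
                u = policy(t, x)
                c += x @ Q @ x + u @ R @ u
                x = A @ x + B @ u + rng.multivariate_normal(np.zeros(n), W)
            total += c + x @ Qf @ x
        return total / nsim

    J_opt = avg_cost(lambda t, x: K[t] @ x)        # optimal stochastic control
    J_nc  = avg_cost(lambda t, x: np.zeros(m))     # no control
    print(Jstar, J_opt, J_nc)                      # J_opt should be close to Jstar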

