Stanford EE 363 - Lecture 3 - Infinite horizon linear quadratic regulator

EE363 Winter 2008-09

Lecture 3
Infinite horizon linear quadratic regulator

• infinite horizon LQR problem
• dynamic programming solution
• receding horizon LQR control
• closed-loop system

Infinite horizon LQR problem

discrete-time system $x_{t+1} = A x_t + B u_t$, $x_0 = x^{\mathrm{init}}$

problem: choose $u_0, u_1, \ldots$ to minimize

$$J = \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

with given constant state and input weight matrices $Q = Q^T \geq 0$, $R = R^T > 0$

. . . an infinite dimensional problem

problem: it’s possible that $J = \infty$ for all input sequences $u_0, \ldots$

$$x_{t+1} = 2 x_t + 0 u_t, \qquad x^{\mathrm{init}} = 1$$

let’s assume $(A, B)$ is controllable

then for any $x^{\mathrm{init}}$ there’s an input sequence $u_0, \ldots, u_{n-1}, 0, 0, \ldots$ that steers $x$ to zero at $t = n$, and keeps it there

for this $u$, $J < \infty$

and therefore, $\min_u J < \infty$ for any $x^{\mathrm{init}}$

Dynamic programming solution

define value function $V : \mathbf{R}^n \to \mathbf{R}$

$$V(z) = \min_{u_0, \ldots} \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

subject to $x_0 = z$, $x_{\tau+1} = A x_\tau + B u_\tau$

• $V(z)$ is the minimum LQR cost-to-go, starting from state $z$
• doesn’t depend on time-to-go, which is always $\infty$; infinite horizon problem is shift invariant

Hamilton-Jacobi equation

fact: $V$ is quadratic, i.e., $V(z) = z^T P z$, where $P = P^T \geq 0$
(can be argued directly from first principles)

HJ equation:

$$V(z) = \min_w \left( z^T Q z + w^T R w + V(Az + Bw) \right)$$

or

$$z^T P z = \min_w \left( z^T Q z + w^T R w + (Az + Bw)^T P (Az + Bw) \right)$$

minimizing $w$ is $w^* = -(R + B^T P B)^{-1} B^T P A z$

so HJ equation is

$$z^T P z = z^T Q z + w^{*T} R w^* + (Az + Bw^*)^T P (Az + Bw^*)$$
$$\phantom{z^T P z} = z^T \left( Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A \right) z$$

this must hold for all $z$, so we conclude that $P$ satisfies the ARE

$$P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A$$

and the optimal input is constant state feedback $u_t = K x_t$,

$$K = -(R + B^T P B)^{-1} B^T P A$$

compared to finite-horizon LQR problem,
• value function and optimal state feedback gains are time-invariant
• we don’t have a recursion to compute $P$; we only have the ARE

fact: the ARE has only one positive semidefinite solution $P$

i.e., ARE plus $P = P^T \geq 0$ uniquely characterizes value function

consequence: the Riccati recursion

$$P_{k+1} = Q + A^T P_k A - A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A, \qquad P_1 = Q$$

converges to the unique PSD solution of the ARE (when $(A, B)$ controllable)

(later we’ll see direct methods to solve ARE)

thus, infinite-horizon LQR optimal control is same as steady-state finite horizon optimal control

Receding-horizon LQR control

consider cost function

$$J_t(u_t, \ldots, u_{t+T-1}) = \sum_{\tau=t}^{t+T} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

• $T$ is called horizon
• same as infinite horizon LQR cost, truncated after $T$ steps into future

if $(u_t^*, \ldots, u_{t+T-1}^*)$ minimizes $J_t$, $u_t^*$ is called ($T$-step ahead) optimal receding horizon control

in words:
• at time $t$, find input sequence that minimizes $T$-step-ahead LQR cost, starting at current time
• then use only the first input

example: 1-step ahead receding horizon control

find $u_t$, $u_{t+1}$ that minimize

$$J_t = x_t^T Q x_t + x_{t+1}^T Q x_{t+1} + u_t^T R u_t + u_{t+1}^T R u_{t+1}$$

first term doesn’t matter; optimal choice for $u_{t+1}$ is 0; optimal $u_t$ minimizes

$$x_{t+1}^T Q x_{t+1} + u_t^T R u_t = (A x_t + B u_t)^T Q (A x_t + B u_t) + u_t^T R u_t$$

thus, 1-step ahead receding horizon optimal input is

$$u_t = -(R + B^T Q B)^{-1} B^T Q A x_t$$

. . . a constant state feedback

in general, optimal $T$-step ahead LQR control is

$$u_t = K_T x_t, \qquad K_T = -(R + B^T P_T B)^{-1} B^T P_T A$$

where

$$P_1 = Q, \qquad P_{i+1} = Q + A^T P_i A - A^T P_i B (R + B^T P_i B)^{-1} B^T P_i A$$

i.e.: same as the optimal finite horizon LQR control, $T - 1$ steps before the horizon $N$

• a constant state feedback
• state feedback gain converges to infinite horizon optimal as horizon becomes long (assuming controllability)

Closed-loop system

suppose $K$ is LQR-optimal state feedback gain

$$x_{t+1} = A x_t + B u_t = (A + BK) x_t$$

is called closed-loop system

($x_{t+1} = A x_t$ is called open-loop system)

is closed-loop system stable? consider

$$x_{t+1} = 2 x_t + u_t, \qquad Q = 0, \quad R = 1$$

optimal control is $u_t = 0 x_t$, i.e., closed-loop system is unstable

fact: if $(Q, A)$ observable and $(A, B)$ controllable, then closed-loop system is stable
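
The Riccati recursion, the gain formula, and the closed-loop stability question can be tried numerically. The following is a minimal NumPy sketch, not code from the lecture: the function names (`riccati_step`, `riccati_P`, `solve_are_by_iteration`, `lqr_gain`, `spectral_radius`) are illustrative, the stopping tolerance is an arbitrary choice, and the ARE is approached by plain fixed-point iteration from $P_1 = Q$ rather than by a direct method. It uses the scalar system $x_{t+1} = 2x_t + u_t$ with $R = 1$, once with $Q = 1$ (an assumed weight, chosen so that $(Q, A)$ is observable) and once with $Q = 0$ as in the example above.

```python
import numpy as np


def riccati_step(P, A, B, Q, R):
    """One step of the Riccati recursion: maps P_k to P_{k+1}."""
    G = R + B.T @ P @ B  # invertible because R > 0 and P >= 0
    return Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(G, B.T @ P @ A)


def riccati_P(A, B, Q, R, T):
    """Return P_T, starting from P_1 = Q (so T - 1 recursion steps)."""
    P = Q.copy()
    for _ in range(T - 1):
        P = riccati_step(P, A, B, Q, R)
    return P


def solve_are_by_iteration(A, B, Q, R, max_iters=10_000, tol=1e-12):
    """Run the recursion from P_1 = Q until successive iterates agree to tol."""
    P = Q.copy()
    for _ in range(max_iters):
        P_next = riccati_step(P, A, B, Q, R)
        if np.max(np.abs(P_next - P)) <= tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati recursion did not converge")


def lqr_gain(A, B, R, P):
    """State feedback gain K = -(R + B'PB)^{-1} B'PA, so that u_t = K x_t."""
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def spectral_radius(M):
    """Largest eigenvalue magnitude; the closed loop is stable when this is < 1."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


# Scalar system x_{t+1} = 2 x_t + u_t with R = 1, written as 1x1 matrices.
A = np.array([[2.0]])
B = np.array([[1.0]])
R = np.array([[1.0]])

for Q in (np.array([[1.0]]), np.array([[0.0]])):
    P = solve_are_by_iteration(A, B, Q, R)
    K = lqr_gain(A, B, R, P)
    rho = spectral_radius(A + B @ K)
    print(f"Q = {Q[0, 0]}: P = {P[0, 0]:.4f}, K = {K[0, 0]:.4f}, "
          f"closed-loop spectral radius = {rho:.4f}")
```

With $Q = 1$ the scalar ARE reduces to $p^2 - 4p - 1 = 0$, so the iterates approach $P = 2 + \sqrt{5}$ and the closed-loop coefficient $A + BK$ has magnitude below 1. With $Q = 0$ the recursion stays at $P = 0$, the gain is $K = 0$, and the closed loop keeps the unstable coefficient 2. Calling `lqr_gain(A, B, R, riccati_P(A, B, Q, R, T))` gives the $T$-step ahead receding horizon gain $K_T$; for $T = 1$ this is $-(R + B^T Q B)^{-1} B^T Q A$.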

