Stanford EE 363 - Lecture 3 - Infinite horizon linear quadratic regulator - D704358

School name Stanford University

Course EE 363 - Linear Dynamical Systems

Pages 11

EE363 Winter 2008-09

Lecture 3
Infinite horizon linear quadratic regulator

• infinite horizon LQR problem
• dynamic programming solution
• receding horizon LQR control
• closed-loop system

Infinite horizon LQR problem

discrete-time system $x_{t+1} = A x_t + B u_t$, $x_0 = x^{\mathrm{init}}$

problem: choose $u_0, u_1, \ldots$ to minimize

$$J = \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

with given constant state and input weight matrices $Q = Q^T \geq 0$, $R = R^T > 0$

. . . an infinite dimensional problem

problem: it's possible that $J = \infty$ for all input sequences $u_0, \ldots$

$$x_{t+1} = 2 x_t + 0 u_t, \qquad x^{\mathrm{init}} = 1$$

let's assume $(A, B)$ is controllable

then for any $x^{\mathrm{init}}$ there's an input sequence $u_0, \ldots, u_{n-1}, 0, 0, \ldots$ that steers $x$ to zero at $t = n$, and keeps it there

for this $u$, $J < \infty$

and therefore, $\min_u J < \infty$ for any $x^{\mathrm{init}}$

Dynamic programming solution

define value function $V : \mathbf{R}^n \to \mathbf{R}$

$$V(z) = \min_{u_0, \ldots} \sum_{\tau=0}^{\infty} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

subject to $x_0 = z$, $x_{\tau+1} = A x_\tau + B u_\tau$

• $V(z)$ is the minimum LQR cost-to-go, starting from state $z$
• doesn't depend on time-to-go, which is always $\infty$; infinite horizon problem is shift invariant

Hamilton-Jacobi equation

fact: $V$ is quadratic, i.e., $V(z) = z^T P z$, where $P = P^T \geq 0$ (can be argued directly from first principles)

HJ equation:

$$V(z) = \min_w \left( z^T Q z + w^T R w + V(Az + Bw) \right)$$

or

$$z^T P z = \min_w \left( z^T Q z + w^T R w + (Az + Bw)^T P (Az + Bw) \right)$$

minimizing $w$ is $w^* = -(R + B^T P B)^{-1} B^T P A z$

so HJ equation is

$$z^T P z = z^T Q z + w^{*T} R w^* + (Az + Bw^*)^T P (Az + Bw^*) = z^T \left( Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A \right) z$$

this must hold for all $z$, so we conclude that $P$ satisfies the ARE

$$P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A$$

and the optimal input is constant state feedback $u_t = K x_t$,

$$K = -(R + B^T P B)^{-1} B^T P A$$

compared to finite-horizon LQR problem,
• value function and optimal state feedback gains are time-invariant
• we don't have a recursion to compute $P$; we only have the ARE

fact: the ARE has only one positive semidefinite solution $P$

i.e., ARE plus $P = P^T \geq 0$ uniquely characterizes value function

consequence: the Riccati recursion

$$P_{k+1} = Q + A^T P_k A - A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A, \qquad P_1 = Q$$

converges to the unique PSD solution of the ARE (when $(A, B)$ controllable)

(later we'll see direct methods to solve ARE)

thus, infinite-horizon LQR optimal control is same as steady-state finite horizon optimal control

Receding-horizon LQR control

consider cost function

$$J_t(u_t, \ldots, u_{t+T-1}) = \sum_{\tau=t}^{t+T} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right)$$

• $T$ is called horizon
• same as infinite horizon LQR cost, truncated after $T$ steps into future

if $(u_t^*, \ldots, u_{t+T-1}^*)$ minimizes $J_t$, $u_t^*$ is called ($T$-step ahead) optimal receding horizon control

in words:
• at time $t$, find input sequence that minimizes $T$-step-ahead LQR cost, starting at current time
• then use only the first input

example: 1-step ahead receding horizon control

find $u_t$, $u_{t+1}$ that minimize

$$J_t = x_t^T Q x_t + x_{t+1}^T Q x_{t+1} + u_t^T R u_t + u_{t+1}^T R u_{t+1}$$

first term doesn't matter; optimal choice for $u_{t+1}$ is 0; optimal $u_t$ minimizes

$$x_{t+1}^T Q x_{t+1} + u_t^T R u_t = (A x_t + B u_t)^T Q (A x_t + B u_t) + u_t^T R u_t$$

thus, 1-step ahead receding horizon optimal input is

$$u_t = -(R + B^T Q B)^{-1} B^T Q A x_t$$

. . . a constant state feedback

in general, optimal $T$-step ahead LQR control is

$$u_t = K_T x_t, \qquad K_T = -(R + B^T P_T B)^{-1} B^T P_T A$$

where

$$P_1 = Q, \qquad P_{i+1} = Q + A^T P_i A - A^T P_i B (R + B^T P_i B)^{-1} B^T P_i A$$

i.e.: same as the optimal finite horizon LQR control, $T - 1$ steps before the horizon $N$
• a constant state feedback
• state feedback gain converges to infinite horizon optimal as horizon becomes long (assuming controllability)

Closed-loop system

suppose $K$ is LQR-optimal state feedback gain

$$x_{t+1} = A x_t + B u_t = (A + BK) x_t$$

is called closed-loop system

($x_{t+1} = A x_t$ is called open-loop system)

is closed-loop system stable? consider

$$x_{t+1} = 2 x_t + u_t, \qquad Q = 0, \quad R = 1$$

optimal control is $u_t = 0 x_t$, i.e., closed-loop system is unstable

fact: if $(Q, A)$ observable and $(A, B)$ controllable, then closed-loop system is stable
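
The Riccati recursion and the closed-loop stability question can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the lecture. The function names (`riccati_step`, `lqr_gain`, `solve_are_by_recursion`), the stopping tolerance, and the double-integrator matrices `A`, `B` with `Q = I`, `R = 1` are illustrative assumptions. The second example reuses the lecture's scalar system $x_{t+1} = 2x_t + u_t$, $Q = 0$, $R = 1$.

```python
import numpy as np


def riccati_step(P, A, B, Q, R):
    """One step of the Riccati recursion:
    P_next = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    BtP = B.T @ P
    # Solve (R + B'PB) X = B'PA instead of forming the inverse explicitly.
    X = np.linalg.solve(R + BtP @ B, BtP @ A)
    return Q + A.T @ P @ A - A.T @ P @ B @ X


def lqr_gain(P, A, B, R):
    """State feedback gain K = -(R + B'PB)^{-1} B'PA, so that u_t = K x_t."""
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def solve_are_by_recursion(A, B, Q, R, tol=1e-12, max_iter=10_000):
    """Iterate the Riccati recursion from P_1 = Q until successive
    iterates differ by less than tol (an assumed stopping rule)."""
    P = Q.copy()
    for _ in range(max_iter):
        P_next = riccati_step(P, A, B, Q, R)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati recursion did not converge")


def spectral_radius(M):
    """Largest eigenvalue magnitude; the closed loop is stable if it is < 1."""
    return np.max(np.abs(np.linalg.eigvals(M)))


# Assumed example: discrete-time double integrator, (A, B) controllable,
# (Q, A) observable, so the closed loop should come out stable.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_are_by_recursion(A, B, Q, R)
K = lqr_gain(P, A, B, R)
rho = spectral_radius(A + B @ K)
print("P =\n", P)
print("K =", K)
print("closed-loop spectral radius =", rho)

# The lecture's scalar example: x_{t+1} = 2 x_t + u_t, Q = 0, R = 1.
# (Q, A) is not observable here, and the optimal gain leaves the loop unstable.
A_bad = np.array([[2.0]])
B_bad = np.array([[1.0]])
Q_bad = np.array([[0.0]])
R_bad = np.array([[1.0]])

P_bad = solve_are_by_recursion(A_bad, B_bad, Q_bad, R_bad)
K_bad = lqr_gain(P_bad, A_bad, B_bad, R_bad)
rho_bad = spectral_radius(A_bad + B_bad @ K_bad)
print("scalar example: K =", K_bad, "closed-loop spectral radius =", rho_bad)
```

Substituting the converged `P` back into `riccati_step` and comparing with `P` gives a quick residual check of the ARE. Stopping the loop after a fixed number of steps instead of at convergence yields the receding-horizon gain $K_T$ for that horizon.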
