
EE363 Winter 2008-09

Lecture 1
Linear quadratic regulator: Discrete-time finite horizon

• LQR cost function
• multi-objective interpretation
• LQR via least-squares
• dynamic programming solution
• steady-state LQR control
• extensions: time-varying systems, tracking problems


LQR problem: background

discrete-time system x_{t+1} = A x_t + B u_t, x_0 = x_init

problem: choose u_0, u_1, ... so that

• x_0, x_1, ... is 'small', i.e., we get good regulation or control
• u_0, u_1, ... is 'small', i.e., we use small input effort or actuator authority
• we'll define 'small' soon
• these are usually competing objectives, e.g., a large u can drive x to zero fast

linear quadratic regulator (LQR) theory addresses this question


LQR cost function

we define the quadratic cost function

    J(U) = Σ_{τ=0}^{N−1} ( x_τ^T Q x_τ + u_τ^T R u_τ ) + x_N^T Q_f x_N

where U = (u_0, ..., u_{N−1}) and

    Q = Q^T ≥ 0,   Q_f = Q_f^T ≥ 0,   R = R^T > 0

are given state cost, final state cost, and input cost matrices

• N is called the time horizon (we'll consider N = ∞ later)
• first term measures state deviation
• second term measures input size or actuator authority
• last term measures final state deviation
• Q, R set relative weights of state deviation and input usage
• R > 0 means any (nonzero) input adds to cost J

LQR problem: find u_0^lqr, ..., u_{N−1}^lqr that minimizes J(U)


Comparison to least-norm input

c.f. the least-norm input that steers x to x_N = 0:

• no cost attached to x_0, ..., x_{N−1}
• x_N must be exactly zero

we can approximate the least-norm input by taking

    R = I,   Q = 0,   Q_f large, e.g., Q_f = 10^8 I


Multi-objective interpretation

common form for Q and R:

    R = ρI,   Q = Q_f = C^T C

where C ∈ R^{p×n} and ρ ∈ R, ρ > 0

the cost is then

    J(U) = Σ_{τ=0}^{N} ||y_τ||^2 + ρ Σ_{τ=0}^{N−1} ||u_τ||^2

where y = Cx

here √ρ gives the relative weighting of output norm and input norm


Input and output objectives

fix x_0 = x_init and horizon N; for any input U = (u_0, ..., u_{N−1}) define

• input cost J_in(U) = Σ_{τ=0}^{N−1} ||u_τ||^2
• output cost J_out(U) = Σ_{τ=0}^{N} ||y_τ||^2

these are (competing) objectives; we want both small

the LQR quadratic cost is J_out + ρ J_in

plot (J_in, J_out) for all possible U:

[figure: (J_in, J_out) plane with sample inputs U_1, U_2, U_3; shaded region shows achievable pairs]

• shaded area shows (J_in, J_out) achieved by some U
• clear area shows (J_in, J_out) not achieved by any U

three sample inputs U_1, U_2, and U_3 are shown

• U_3 is worse than U_2 on both counts (J_in and J_out)
• U_1 is better than U_2 in J_in, but worse in J_out

interpretation of LQR quadratic cost:

    J = J_out + ρ J_in = constant

corresponds to a line with slope −ρ on the (J_in, J_out) plot

[figure: same (J_in, J_out) plot with the line J = J_out + ρ J_in = constant just touching the achievable region]

• the LQR optimal input is at the boundary of the shaded region, just touching the line of smallest possible J
• U_2 is LQR optimal for the ρ shown
• by varying ρ from 0 to +∞, we can sweep out the optimal tradeoff curve


LQR via least-squares

LQR can be formulated (and solved) as a least-squares problem

X = (x_0, ..., x_N) is a linear function of x_0 and U = (u_0, ..., u_{N−1}):

    [ x_0 ]   [ 0                            ] [ u_0     ]   [ I   ]
    [ x_1 ]   [ B         0                  ] [ u_1     ]   [ A   ]
    [ x_2 ] = [ AB        B         0        ] [ ⋮       ] + [ A^2 ] x_0
    [ ⋮   ]   [ ⋮                            ] [ u_{N−1} ]   [ ⋮   ]
    [ x_N ]   [ A^{N−1}B  A^{N−2}B  ···  B   ]               [ A^N ]

express as X = G U + H x_0, where G ∈ R^{(N+1)n×Nm}, H ∈ R^{(N+1)n×n}

express the LQR cost as

    J(U) = ||diag(Q^{1/2}, ..., Q^{1/2}, Q_f^{1/2}) (G U + H x_0)||^2 + ||diag(R^{1/2}, ..., R^{1/2}) U||^2

this is just a (big) least-squares problem

this solution method requires forming and solving a least-squares problem of size roughly N(n + m) × Nm

using a naive method (e.g., QR factorization), the cost is O(N^3 n m^2)


Dynamic programming solution

• gives an efficient, recursive method to solve the LQR least-squares problem; cost is O(N n^3)
• (but in fact, a less naive approach to solving the LQR least-squares problem will have the same complexity)
• useful and important idea on its own
• the same ideas can be used for many other problems


Value function

for t = 0, ..., N define the value function V_t : R^n → R by

    V_t(z) = min_{u_t,...,u_{N−1}} Σ_{τ=t}^{N−1} ( x_τ^T Q x_τ + u_τ^T R u_τ ) + x_N^T Q_f x_N

subject to x_t = z, x_{τ+1} = A x_τ + B u_τ, τ = t, ..., N − 1

• V_t(z) gives the minimum LQR cost-to-go, starting from state z at time t
• V_0(x_0) is the minimum LQR cost (from state x_0 at time 0)

we will find that

• V_t is quadratic, i.e., V_t(z) = z^T P_t z, where P_t = P_t^T ≥ 0
• P_t can be found recursively, working backward from t = N
• the LQR optimal u is easily expressed in terms of P_t

cost-to-go with no time left is just the final state cost:

    V_N(z) = z^T Q_f z

thus we have P_N = Q_f


Dynamic programming principle

• now suppose we know V_{t+1}(z)
• what is the optimal choice for u_t?
• the choice of u_t affects
  – the current cost incurred (through u_t^T R u_t)
  – where we land, x_{t+1} (hence, the min cost-to-go from x_{t+1})
• dynamic programming (DP) principle:

      V_t(z) = min_w ( z^T Q z + w^T R w + V_{t+1}(Az + Bw) )

  – z^T Q z + w^T R w is the cost incurred at time t if u_t = w
  – V_{t+1}(Az + Bw) is the min cost-to-go from where you land at t + 1

• follows from the fact that we can minimize in any order:

      min_{w_1,...,w_k} f(w_1, ..., w_k) = min_{w_1} ( min_{w_2,...,w_k} f(w_1, ..., w_k) )

  where the inner minimum is a function of w_1

in words: min cost-to-go from where you are = min over (current cost incurred + min cost-to-go from where you land)


Example: path optimization

• edges show possible flights; each has some cost
• we want to find the min cost route or path from SF to NY

[figure: flight graph with airports SF, Seattle, Los Angeles, Denver, Chicago, Atlanta, NY]

dynamic programming (DP):

• V(i) is the min cost from airport i to NY, over all possible paths
• to find the min cost from city i to NY: minimize the sum of the flight cost plus the min cost to NY from where you land, over all flights out of city i

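The backward value-function recursion can also be carried out directly in code: starting from P_N = Q_f, minimizing the quadratic z^T Q z + w^T R w + (Az + Bw)^T P_{t+1} (Az + Bw) over w gives a recursion for P_t and a linear feedback u_t = K_t x_t. A minimal sketch (the helper name `lqr_dp` is made up; NumPy assumed):

```python
import numpy as np

def lqr_dp(A, B, Q, Qf, R, N):
    """Backward dynamic-programming (Riccati) recursion for finite-horizon LQR.

    Returns P_0, ..., P_N with V_t(z) = z^T P_t z, and gains K_0, ..., K_{N-1}
    with the optimal input u_t = K_t x_t.  Each step minimizes
    z'Qz + w'Rw + (Az+Bw)' P_{t+1} (Az+Bw) over w.
    """
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = Qf                                    # V_N(z) = z' Qf z
    for t in range(N - 1, -1, -1):
        S = R + B.T @ P[t + 1] @ B               # > 0 since R > 0
        K[t] = -np.linalg.solve(S, B.T @ P[t + 1] @ A)
        Pt = Q + A.T @ P[t + 1] @ A + A.T @ P[t + 1] @ B @ K[t]
        P[t] = (Pt + Pt.T) / 2                   # symmetrize against roundoff
    return P, K
```

The cost of one step is O(n^3), so the whole recursion costs O(N n^3), matching the complexity claimed on the slide.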

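The flight-graph example can be sketched as a memoized backward DP. The edge costs below are invented for illustration (the slide's figure gives no numbers), and the route network is assumed to be acyclic (all flights head east):

```python
# Backward DP on the flight graph from the slide.  Edge costs are made up.
flights = {
    "SF":          [("Seattle", 1), ("Los Angeles", 1), ("Denver", 2)],
    "Seattle":     [("Chicago", 4), ("Denver", 2)],
    "Los Angeles": [("Denver", 2), ("Atlanta", 5)],
    "Denver":      [("Chicago", 2), ("Atlanta", 3)],
    "Chicago":     [("NY", 2)],
    "Atlanta":     [("NY", 2)],
}

def min_cost_to_ny(flights, dest="NY"):
    """flights: dict city -> list of (next_city, cost); assumed acyclic.

    Returns V with V[i] = min cost from airport i to dest over all paths,
    using the DP recursion V[i] = min over flights i -> j of (cost + V[j]).
    """
    V = {dest: 0.0}                  # no cost once you have arrived
    def solve(city):
        if city not in V:
            V[city] = min(c + solve(nxt) for nxt, c in flights[city])
        return V[city]
    for city in flights:
        solve(city)
    return V

V = min_cost_to_ny(flights)
```

Each airport's value is computed once and reused, which is exactly the "min cost-to-go from where you land" idea from the DP principle.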
Stanford EE 363 - Lecture Notes