Pontryagin's maximum principle

Emo Todorov
Applied Mathematics and Computer Science & Engineering
University of Washington
AMATH/CSE 579, Winter 2012, Lecture 5


Pontryagin's maximum principle

For deterministic dynamics $\dot{x} = f(x, u)$ we can compute extremal open-loop trajectories (i.e. local minima) by solving a boundary-value ODE problem with given $x(0)$ and $\lambda(T) = \frac{\partial}{\partial x} q_T(x(T))$, where $\lambda(t)$ is the gradient of the optimal cost-to-go function (called the costate).

Definition (deterministic Hamiltonian):
\[ H(x, u, \lambda) \triangleq \ell(x, u) + f(x, u)^T \lambda \]

Theorem (continuous-time maximum principle): If $x(t), u(t)$, $0 \le t \le T$, is the optimal state-control trajectory starting at $x(0)$, then there exists a costate trajectory $\lambda(t)$ with $\lambda(T) = \frac{\partial}{\partial x} q_T(x(T))$ satisfying
\begin{align*}
\dot{x} &= H_\lambda(x, u, \lambda) = f(x, u) \\
-\dot{\lambda} &= H_x(x, u, \lambda) = \ell_x(x, u) + f_x(x, u)^T \lambda \\
u &= \arg\min_{\tilde{u}} H(x, \tilde{u}, \lambda)
\end{align*}


Derivation from the HJB equation (continuous time)

For deterministic dynamics $\dot{x} = f(x, u)$ the optimal cost-to-go in the finite-horizon setting satisfies the HJB equation
\[ -v_t(x, t) = \min_u \left\{ \ell(x, u) + f(x, u)^T v_x(x, t) \right\} = \min_u H(x, u, v_x(x, t)) \]

If the optimal control law is $\pi(x, t)$, we can set $u = \pi$ and drop the min:
\[ 0 = v_t(x, t) + \ell(x, \pi(x, t)) + f(x, \pi(x, t))^T v_x(x, t) \]

Now differentiate w.r.t. $x$ and suppress the dependences for clarity:
\[ 0 = v_{tx} + \ell_x + \pi_x^T \ell_u + \left( f_x^T + \pi_x^T f_u^T \right) v_x + v_{xx} f \]

Using the identity $\dot{v}_x = v_{tx} + v_{xx} f$ and regrouping yields
\[ 0 = \dot{v}_x + \ell_x + f_x^T v_x + \pi_x^T \left( \ell_u + f_u^T v_x \right) = \dot{v}_x + H_x + \pi_x^T H_u \]

Since $u$ is optimal we have $H_u = 0$, thus
\[ -\dot{\lambda} = H_x(x, \pi, \lambda), \quad \text{where } \lambda = v_x \]


Derivation via Lagrange multipliers (discrete time)

Optimize the total cost subject to the dynamics constraints $x_{k+1} = f(x_k, u_k)$. Define the Lagrangian $L(x, u, \lambda)$ as
\begin{align*}
L &= q_T(x_N) + \sum_{k=0}^{N-1} \left[ \ell(x_k, u_k) + \left( f(x_k, u_k) - x_{k+1} \right)^T \lambda_{k+1} \right] \\
  &= q_T(x_N) - x_N^T \lambda_N + x_0^T \lambda_0 + \sum_{k=0}^{N-1} \left[ H(x_k, u_k, \lambda_{k+1}) - x_k^T \lambda_k \right]
\end{align*}

Setting $L_x = L_\lambda = 0$ and explicitly minimizing w.r.t. $u$ yields

Theorem (discrete-time maximum principle): If $x_k, u_k$, $0 \le k \le N$, is the optimal state-control trajectory starting at $x_0$, then there exists a costate trajectory $\lambda_k$ with $\lambda_N = \frac{\partial}{\partial x} q_T(x_N)$ satisfying
\begin{align*}
x_{k+1} &= H_\lambda(x_k, u_k, \lambda_{k+1}) = f(x_k, u_k) \\
\lambda_k &= H_x(x_k, u_k, \lambda_{k+1}) = \ell_x(x_k, u_k) + f_x(x_k, u_k)^T \lambda_{k+1} \\
u_k &= \arg\min_{\tilde{u}} H(x_k, \tilde{u}, \lambda_{k+1})
\end{align*}


Gradient of the total cost

The maximum principle provides an efficient way to evaluate the gradient of the total cost w.r.t. $u$, and thereby optimize the controls numerically.

Theorem (gradient): For a given control trajectory $u_k$, let $x_k, \lambda_k$ be such that
\begin{align*}
x_{k+1} &= f(x_k, u_k) \\
\lambda_k &= \ell_x(x_k, u_k) + f_x(x_k, u_k)^T \lambda_{k+1}
\end{align*}
with $x_0$ given and $\lambda_N = \frac{\partial}{\partial x} q_T(x_N)$. Let $J(x_\cdot, u_\cdot)$ be the total cost. Then
\[ \frac{\partial J}{\partial u_k} = H_u(x_k, u_k, \lambda_{k+1}) = \ell_u(x_k, u_k) + f_u(x_k, u_k)^T \lambda_{k+1} \]

Note that $x_k$ can be found in a forward pass (since it does not depend on $\lambda$), and then $\lambda_k$ can be found in a backward pass.


Proof by induction

The cost accumulated from time $k$ until the end can be written recursively as
\[ J_k(x_{k \cdots N}, u_{k \cdots N-1}) = \ell(x_k, u_k) + J_{k+1}(x_{k+1 \cdots N}, u_{k+1 \cdots N-1}) \]

Costs incurred before time $k$ do not depend on $u_k$, so $\partial J / \partial u_k = \partial J_k / \partial u_k$. Noting that $u_k$ affects future costs only through $x_{k+1} = f(x_k, u_k)$, we have
\[ \frac{\partial J_k}{\partial u_k} = \ell_u(x_k, u_k) + f_u(x_k, u_k)^T \frac{\partial J_{k+1}}{\partial x_{k+1}} \]

We need to show that $\lambda_k = \frac{\partial J_k}{\partial x_k}$. For $k = N$ this holds because $J_N = q_T$. For $k < N$ we have
\[ \frac{\partial J_k}{\partial x_k} = \ell_x(x_k, u_k) + f_x(x_k, u_k)^T \frac{\partial J_{k+1}}{\partial x_{k+1}} \]
which is identical to $\lambda_k = \ell_x(x_k, u_k) + f_x(x_k, u_k)^T \lambda_{k+1}$.


Enforcing terminal states

The final state $x(T)$ is usually different from the minimum of the final cost $q_T$, because it reflects a trade-off between final and running cost.

We can enforce $x(T) = x^*$ as a boundary condition, and remove the boundary condition on $\lambda(T)$. Once the solution is found, we can construct a function $q_T$ such that $\lambda(T) = \frac{\partial}{\partial x} q_T(x(T))$. However, if $\lambda(T) \neq 0$ then $x(T)$ is not the minimum of this $q_T$.

We can also define the problem as infinite-horizon average-cost, in which case it is usually suboptimal to have an asymptotic state different from the minimum of the state cost function. The maximum principle does not apply to infinite-horizon problems, so one has to use the HJB equations.


More tractable problems

When the dynamics and cost are in the restricted form
\begin{align*}
\dot{x} &= a(x) + B u \\
\ell(x, u) &= q(x) + \tfrac{1}{2} u^T R u
\end{align*}
the Hamiltonian can be minimized analytically, which yields the ODE
\begin{align*}
\dot{x} &= a(x) - B R^{-1} B^T \lambda \\
-\dot{\lambda} &= q_x(x) + a_x(x)^T \lambda
\end{align*}
with boundary conditions $x(0)$ and $\lambda(T) = \frac{\partial}{\partial x} q_T(x(T))$. If $B, R$ depend on $x$, the second equation has additional terms involving the derivatives of $B, R$.

We have $H_u = R(x) u + B(x)^T \lambda$ and $H_{uu} = R(x) \succ 0$. Thus the maximum principle here is both a necessary and a sufficient condition for a local minimum.


Pendulum example

Passive dynamics:
\[ a(x) = \begin{bmatrix} x_2 \\ k \sin(x_1) \end{bmatrix}, \qquad a_x(x) = \begin{bmatrix} 0 & 1 \\ k \cos(x_1) & 0 \end{bmatrix} \]

Optimal control:
\[ u = -r^{-1} \lambda_2 \]

ODE (with $q = 0$):
\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = k \sin(x_1) - r^{-1} \lambda_2, \qquad \ldots \]
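The forward/backward computation in the gradient theorem can be sketched on the pendulum. The block below is a minimal illustration, not the lecture's own code. It assumes an Euler discretization with step `H_STEP`, a scalar control entering the second state (consistent with $u = -r^{-1} \lambda_2$), the running cost $\ell = h \, \tfrac{r}{2} u^2$ (so $q = 0$), and a hypothetical final cost $q_T(x) = \tfrac{1}{2} \|x\|^2$. The values of `K`, `R`, `H_STEP` and all function names are made up for the example. The costate gradient is compared against central finite differences as a sanity check.

```python
import math

# Assumed constants for this sketch: K and R play the roles of k and r in the
# pendulum example; H_STEP (Euler step) and the final cost are hypothetical.
K, R, H_STEP = 1.0, 0.5, 0.05


def f(x, u):
    """Euler-discretized dynamics: x_{k+1} = x_k + h * (a(x_k) + B u_k), B = [0, 1]^T."""
    x1, x2 = x
    return (x1 + H_STEP * x2, x2 + H_STEP * (K * math.sin(x1) + u))


def running_cost(x, u):
    """l(x, u) = h * (r / 2) * u^2, i.e. state cost q = 0."""
    return H_STEP * 0.5 * R * u * u


def final_cost(x):
    """Assumed final cost q_T(x) = 0.5 * ||x||^2."""
    return 0.5 * (x[0] ** 2 + x[1] ** 2)


def rollout(x0, us):
    """Forward pass: states x_0 .. x_N for controls u_0 .. u_{N-1}."""
    xs = [tuple(x0)]
    for u in us:
        xs.append(f(xs[-1], u))
    return xs


def total_cost(x0, us):
    xs = rollout(x0, us)
    return sum(running_cost(x, u) for x, u in zip(xs, us)) + final_cost(xs[-1])


def cost_gradient(x0, us):
    """Backward pass: dJ/du_k = l_u + f_u^T lambda_{k+1}."""
    xs = rollout(x0, us)
    lam = (xs[-1][0], xs[-1][1])  # lambda_N = dq_T/dx = x_N for the assumed q_T
    grad = [0.0] * len(us)
    for k in reversed(range(len(us))):
        x1 = xs[k][0]
        l1, l2 = lam  # this is lambda_{k+1}
        # l_u = h * r * u_k and f_u = h * [0, 1]^T
        grad[k] = H_STEP * R * us[k] + H_STEP * l2
        # lambda_k = l_x + f_x^T lambda_{k+1}, with l_x = 0 and
        # f_x = I + h * [[0, 1], [K cos(x1), 0]]
        lam = (l1 + H_STEP * K * math.cos(x1) * l2, l2 + H_STEP * l1)
    return grad


def finite_difference_gradient(x0, us, eps=1e-6):
    """Central differences, used only to check the costate gradient."""
    grad = []
    for k in range(len(us)):
        up, dn = list(us), list(us)
        up[k] += eps
        dn[k] -= eps
        grad.append((total_cost(x0, up) - total_cost(x0, dn)) / (2 * eps))
    return grad


x0 = (math.pi / 2, 0.0)
us = [0.1 * math.sin(0.3 * k) for k in range(40)]
g_adj = cost_gradient(x0, us)
g_fd = finite_difference_gradient(x0, us)
max_err = max(abs(a - b) for a, b in zip(g_adj, g_fd))
print("largest disagreement with finite differences:", max_err)
```

The backward pass costs about as much as one rollout, whereas the finite-difference check needs two rollouts per control, which is why the costate recursion is the practical choice for numerical optimization of the controls.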



