Kalman Filters; Switching Kalman Filter
Graphical Models – 10-708
Carlos Guestrin, Carnegie Mellon University
November 20th, 2006
Readings: K&F 4.5, 12.2, 12.3, 12.4

Adventures of our BN hero
- Compact representation for probability distributions
- Fast inference
- Fast learning
- Approximate inference
- But... who are the most popular kids?
  1. Naive Bayes
  2 and 3. Hidden Markov models (HMMs) and Kalman filters

The Kalman Filter
- An HMM with Gaussian distributions
- Has been around for at least 50 years
- Possibly the most used graphical model ever: it's what runs your cruise control, tracks missiles, controls robots, ...
- And it's so simple, which possibly explains why it's so widely used
- Many interesting models build on it
- An example of a Gaussian BN (more on this later)

Example of KF – SLAT: Simultaneous Localization and Tracking
[Funiak, Guestrin, Paskin, Sukthankar '06]
- Place some cameras around an environment; you don't know where they are
- Could measure all the locations by hand, but that requires lots of grad student (Stano) time
- Intuition: a person walks around; if camera 1 sees the person and then camera 2 sees the person, we learn about the relative positions of the cameras

Multivariate Gaussian
- Mean vector µ, covariance matrix Σ:
  p(x) = (2π)^(-n/2) |Σ|^(-1/2) exp( -(1/2) (x - µ)^T Σ^(-1) (x - µ) )

Conditioning a Gaussian
- Joint Gaussian: p(X,Y) ~ N(µ; Σ)
- Conditional linear Gaussian: p(Y|X) ~ N(µ_{Y|X}; σ²_{Y|X})

Gaussian is a "Linear Model"
- Conditional linear Gaussian: p(Y|X=x) ~ N(β₀ + βx; σ²)

Conditioning a Gaussian – multivariate case
- Joint Gaussian: p(X,Y) ~ N(µ; Σ)
- Conditional: p(Y|X=x) ~ N(µ_{Y|X}; Σ_{YY|X}), where
  µ_{Y|X} = µ_Y + Σ_{YX} Σ_{XX}^(-1) (x - µ_X)
  Σ_{YY|X} = Σ_{YY} - Σ_{YX} Σ_{XX}^(-1) Σ_{XY}

Conditional Linear Gaussian (CLG) – general case
- Conditional linear Gaussian: p(Y|X=x) ~ N(β₀ + Bx; Σ_{YY|X})

Understanding a linear Gaussian – the 2d case
- Variance increases over time (motion noise adds up)
- The object doesn't necessarily move in a straight line

Tracking with a Gaussian 1
- p(X_0) ~ N(µ_0, Σ_0)
- p(X_{i+1}|X_i) ~ N(B X_i + β; Σ_{X_{i+1}|X_i})

Tracking with Gaussians 2 – Making observations
- We have p(X_i); a detector observes O_i = o_i
- Want to compute
p(X_i | O_i = o_i)
- Use Bayes rule: p(X_i | o_i) ∝ p(o_i | X_i) p(X_i)
- Requires a CLG observation model: p(O_i | X_i) ~ N(W X_i + v; Σ_{O_i|X_i})

Operations in Kalman filter
- Compute the belief state p(X_t | o_1, ..., o_t)
- Start with the prior p(X_0)
- At each time step t:
  - Condition on the observation
  - Prediction (multiply in the transition model)
  - Roll-up (marginalize the previous time step)
- I'll describe one implementation of the KF; there are others (e.g., the information filter)
(Figure: DBN over X_1, ..., X_5 with observations O_1 = o_1, ..., O_5 = o_5)

Exponential family representation of Gaussian: canonical form
- N(µ, Σ) can be written in canonical form C(K, h):
  C(x; K, h) ∝ exp( -(1/2) x^T K x + h^T x )

Canonical form
- Standard and canonical forms are related: K = Σ^(-1), h = Σ^(-1) µ
- Conditioning is easy in canonical form
- Marginalization is easy in standard form

Conditioning in canonical form
- First multiply the factors
- Then condition on the value B = y

Operations in Kalman filter (recap)
- At each time step t: condition on observation; prediction (multiply transition model); roll-up (marginalize previous time step)

Prediction & roll-up in canonical form
- First multiply in the transition model
- Then marginalize X_t

Announcements
- Lectures for the rest of the semester:
  - Special time: Monday Nov 27, 5:30-7pm, Wean 4615A: Dynamic BNs
  - Wed. 11/30, regular class time: Causality (Richard Scheines)
  - Friday 12/1, regular class time: finish Dynamic BNs & overview of advanced topics
- Deadlines & presentations:
  - Project poster presentations: Dec. 1st, 3-6pm (NSH Atrium); popular vote for best poster
  - Project write-up: Dec. 8th by 2pm, by email; 8 pages, limit strictly enforced
  - Final: out Dec. 1st, due Dec. 15th by 2pm (strict deadline)

What if observations are not CLG?
- Often observations are not CLG; CLG means O_i = B X_i + β₀ + ε
- Consider a motion detector: O_i = 1 if the person is likely to be in the region
- The posterior is then not Gaussian

Linearization: incorporating non-linear evidence
- p(O_i | X_i) is not CLG, but...
- Find a Gaussian approximation of p(X_i, O_i) = p(X_i) p(O_i | X_i)
- Instantiate the evidence O_i = o_i and obtain a Gaussian for p(X_i | O_i = o_i)
- Why do we hope this would be any good?
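Stepping back to the linear-Gaussian case: the condition / prediction / roll-up cycle described above reduces to simple matrix operations, as the slides promise. A minimal sketch in NumPy (one implementation among several; all matrices and numbers below are illustrative, not from the slides):

```python
import numpy as np

def condition(mu, Sigma, W, v, R, o):
    """Condition on observation o, under the CLG observation model
    p(O_i | X_i) = N(W x + v, R)."""
    S = W @ Sigma @ W.T + R             # innovation covariance
    K = Sigma @ W.T @ np.linalg.inv(S)  # Kalman gain
    mu_new = mu + K @ (o - (W @ mu + v))
    Sigma_new = Sigma - K @ W @ Sigma
    return mu_new, Sigma_new

def predict_rollup(mu, Sigma, B, beta, Q):
    """Prediction + roll-up: multiply in the transition model
    p(X_{i+1} | X_i) = N(B x + beta, Q), then marginalize X_i."""
    return B @ mu + beta, B @ Sigma @ B.T + Q

# 1-d example: prior N(0, 1), identity observation with unit noise, o = 1.
mu, Sigma = np.array([0.0]), np.array([[1.0]])
mu, Sigma = condition(mu, Sigma, np.eye(1), np.zeros(1),
                      np.eye(1), np.array([1.0]))
# posterior is N(0.5, 0.5)
mu, Sigma = predict_rollup(mu, Sigma, np.eye(1), np.zeros(1),
                           0.1 * np.eye(1))
# prediction is N(0.5, 0.6): the motion noise adds up, as in the 2d-case slide
```

Note how conditioning can only shrink the covariance, while prediction adds the motion noise Q back in, matching the "variance increases over time" picture.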
- Locally, a Gaussian may be OK

Linearization as integration
- Gaussian approximation of p(X_i, O_i) = p(X_i) p(O_i | X_i)
- Need to compute the moments E[O_i], E[O_i²], E[O_i X_i]
- Note: each integral is the product of a Gaussian with an arbitrary function

Linearization as numerical integration
- Product of a Gaussian with an arbitrary function
- Effective numerical integration with the Gaussian quadrature method
- Approximate the integral as a weighted sum over integration points; Gaussian quadrature defines the locations of the points and the weights
- Exact if the arbitrary function is a polynomial of bounded degree
- The number of integration points is exponential in the number of dimensions d
- Exactness only for monomials requires exponentially fewer points
- For 2d+1 points, this method is equivalent to the unscented Kalman filter, and it generalizes to many more points

Operations in non-linear Kalman filter
- At each time step t:
  - Condition on the observation (use numerical integration)
  - Prediction (multiply in the transition model; use numerical integration)
  - Roll-up (marginalize the previous time step)

What you need to know about Kalman filters
- Kalman filter:
  - Probably the most used BN
  - Assumes Gaussian distributions
  - Equivalent to a linear system
  - Simple matrix operations for computations
- Non-linear Kalman filter:
  - Usually the observation or motion model is not CLG
  - Use numerical integration to find a Gaussian approximation

What if the person chooses different motion models?
- With probability θ, move more or less straight
- With probability 1-θ, do the "moonwalk"

The moonwalk
(video slide)

Switching Kalman filter
- At each time step, choose one of k motion models; you never know which one!
- p(X_{i+1} | X_i, Z_{i+1}) is a CLG indexed by Z_{i+1}:
  p(X_{i+1} | X_i, Z_{i+1} = j) ~ N(β_{j0} + B_j X_i; Σ_j)

Inference in switching KF – one step
- Suppose p(X_0) is Gaussian, Z_1 takes one of two values, and p(X_1 | X_0, Z_1) is CLG
- Marginalize X_0
- Marginalize Z_1
- Obtain p(X_1): a mixture of Gaussians, one component per value of Z_1
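The one-step switching-KF computation can be sketched by running one Kalman prediction per motion model and keeping the mixture weights. A minimal sketch (the two motion models and all numbers are made-up assumptions, loosely echoing the "straight vs. moonwalk" example):

```python
import numpy as np

def predict_rollup(mu, Sigma, B, beta, Q):
    """Marginalize X_0 under one CLG motion model N(B x + beta, Q)."""
    return B @ mu + beta, B @ Sigma @ B.T + Q

theta = 0.7  # p(Z_1 = "straight"); 1 - theta is the "moonwalk"
motion_models = [  # (weight, B_j, beta_j, Sigma_j), illustrative values
    (theta,     np.eye(1), np.array([0.5]),  0.1 * np.eye(1)),
    (1 - theta, np.eye(1), np.array([-0.5]), 0.4 * np.eye(1)),
]
mu0, Sigma0 = np.array([0.0]), np.array([[1.0]])

# Marginalizing X_0 gives one Gaussian per value of Z_1; marginalizing Z_1
# then yields a *mixture* of Gaussians, not a single Gaussian. After t steps
# with k models the exact posterior has k^t components, which is why
# practical switching-KF algorithms must collapse or prune the mixture.
mixture = [(w,) + predict_rollup(mu0, Sigma0, B, beta, Q)
           for (w, B, beta, Q) in motion_models]
```

Here `mixture` holds (weight, mean, covariance) triples, e.g. the "straight" component is N(0.5, 1.1) with weight 0.7.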