CMU CS 10708 - Dynamic Bayesian Networks


Readings: K&F 18.1, 18.2, 18.3, 18.4

Dynamic Bayesian Networks (Beyond 10708)
Graphical Models 10708
Carlos Guestrin, Carnegie Mellon University
December 1st, 2006

Dynamic Bayesian network (DBN)
- An HMM is defined by:
  - Transition model P(X^(t+1) | X^(t))
  - Observation model P(O^(t) | X^(t))
  - Starting state distribution P(X^(0))
- DBN: use a Bayes net to represent each of these compactly
  - The starting state distribution P(X^(0)) is a BN
  - (Silly) example: a performance-in-grad-school DBN
    - Variables: Happiness, Productivity, Hirability, Fame
    - Observations: Papers, Schmooze

Unrolled DBN
- Start with P(X^(0))
- For each time step, add variables as defined by the 2-TBN (two-time-slice Bayes net)

Sparse DBN and fast inference?
[Figure: a sparse DBN over variables A-F, unrolled across time steps t, t+1, t+2, t+3.]

Even after one time step!
- What happens when we marginalize out time t? Marginalization entangles the time-(t+1) variables, so the exact belief state is no longer sparse.
[Figure: the same network after marginalizing out time t.]

Sparse DBN and fast inference? Almost!
- A structured representation of the belief state often yields a good approximation.
[Figure: an almost-sparse DBN over variables A-F, time steps t through t+3.]

BK Algorithm for approximate DBN inference [Boyen, Koller '98]
- Assumed density filtering:
  - Choose a factored representation P-hat for the belief state
  - Every time step, the belief is not representable with P-hat: project it into the representation

A simple example of BK: fully factorized distribution
- Assumed density: fully factorized
- Compare the true P(X^(t+1)) with the assumed density P-hat(X^(t+1))

Computing the fully factorized distribution at time t+1
- Assumed density: fully factorized
- Compute P-hat(X_i^(t+1)) for each variable X_i

General case for BK: a junction tree represents the distribution
- Assumed density: a distribution represented by a junction tree over clusters of variables (the fully factorized case corresponds to singleton clusters)

Computing the factored belief state in the next time step
- Introduce the observations in the current time step
- Use the junction tree to calibrate the time-t beliefs
- Compute the t+1 belief and project it into the approximate belief state:
  - Marginalize into the desired factors
  - This corresponds to a KL projection
- Equivalent to computing the marginals over the factors directly:
  - For each factor in the t+1-step belief, use variable elimination
- (A runnable sketch of one BK step follows.)
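To make the projection step concrete, here is a minimal sketch of one BK / assumed-density-filtering step on a toy two-variable DBN. The network, its CPDs, and all numbers are invented for illustration (this is not the lecture's grad-school example); the observation update is omitted, and the exact propagation is done by brute force rather than variable elimination because the state space is tiny.

```python
# A minimal sketch of one Boyen-Koller step on a hypothetical 2-TBN
# with two binary state variables A and B. The belief state is kept
# fully factorized: P_hat(A, B) = P_hat(A) * P_hat(B).
import numpy as np

# Hypothetical 2-TBN transition CPDs: each next-step variable depends
# on BOTH current variables, so the exact one-step joint is correlated.
# p_a_next[a, b] = P(A'=1 | A=a, B=b); likewise for B'.
p_a_next = np.array([[0.1, 0.4],
                     [0.6, 0.9]])
p_b_next = np.array([[0.2, 0.7],
                     [0.3, 0.8]])

def bk_step(qa, qb):
    """One BK step: exact propagation through the 2-TBN, then KL
    projection onto the fully factorized family, which for discrete
    variables is just taking the marginals."""
    joint = np.outer(qa, qb)              # factored belief at time t
    joint_next = np.zeros((2, 2))         # exact joint over (A', B')
    for a in range(2):
        for b in range(2):
            pa1, pb1 = p_a_next[a, b], p_b_next[a, b]
            for a2 in range(2):
                for b2 in range(2):
                    pa = pa1 if a2 == 1 else 1.0 - pa1
                    pb = pb1 if b2 == 1 else 1.0 - pb1
                    joint_next[a2, b2] += joint[a, b] * pa * pb
    # Projection: marginalize into the desired factors.
    return joint_next.sum(axis=1), joint_next.sum(axis=0)

qa, qb = np.array([0.5, 0.5]), np.array([0.5, 0.5])
for t in range(3):
    qa, qb = bk_step(qa, qb)
    print(f"t={t+1}: P_hat(A=1)={qa[1]:.3f}, P_hat(B=1)={qb[1]:.3f}")
```

Running it shows the point of the projection: after the transition the exact joint over (A', B') is correlated, but keeping only the marginals returns the belief to the assumed fully factorized family at every step.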
Error accumulation
- Each time step, the projection introduces error
- Will the error add up, causing unbounded approximation error as t grows?

Contraction in Markov process
[Figure: illustration of a contracting Markov process.]

BK Theorem: the error does not grow unboundedly
- Theorem: if the Markov chain contracts at a rate of gamma (usually very small), and the assumed-density projection at each time step has error bounded by epsilon (usually large), then the expected error at every iteration is bounded by epsilon/gamma.

Example: BAT network [Forbes et al.]

BK results [Boyen, Koller '98]

Thin Junction Tree Filters [Paskin '03]
- BK assumes fixed approximation clusters
- TJTF adapts the clusters over time, attempting to minimize projection error

Hybrid DBN (many continuous and discrete variables)
- In a DBN with a large number of discrete and continuous variables, the number of mixture-of-Gaussian components blows up in one time step
- Need many smart tricks... (e.g., see Lerner's thesis)
- Example: Reverse Water Gas Shift system (RWGS) [Lerner et al. '02]

DBN summary
- DBNs: factored representation of HMMs / Kalman filters
  - The sparse representation does not lead to efficient inference
- Assumed density filtering
  - BK: the factored belief-state representation is the assumed density
  - Contraction guarantees that the error does not blow up, but it could still be large (see the recursion sketched below)
- Thin junction tree filter adapts the assumed density over time
- Extensions for hybrid DBNs
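The "does not blow up" claim follows from a one-line recursion. This is a sketch of the standard assumed-density-filtering argument, with gamma the contraction rate and epsilon the per-step projection error from the BK theorem above, assuming the initial belief is exact; the precise constants in Boyen & Koller '98 differ slightly.

```latex
% delta_t = expected error of the approximate belief state at time t.
% Each step, the chain contracts old error by (1 - gamma) and the
% projection adds at most epsilon; with delta_0 = 0:
\[
  \delta_{t+1} \;\le\; (1-\gamma)\,\delta_t + \varepsilon
  \quad\Longrightarrow\quad
  \delta_t \;\le\; \varepsilon \sum_{k=0}^{t-1} (1-\gamma)^k
           \;\le\; \frac{\varepsilon}{\gamma}.
\]
```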
This semester...
- Bayesian networks, Markov networks, factor graphs, decomposable models, junction trees
- Parameter learning, structure learning
- Semantics
- Exact inference: variable elimination, context-specific independence
- Approximate inference: sampling (importance sampling, MCMC, Gibbs), variational inference, loopy belief propagation, generalized belief propagation (Kikuchi)
- Bayesian learning; missing data: EM; Chow-Liu, IPF, GIS
- Gaussian and hybrid models (discrete and continuous variables)
- Temporal and template models: Kalman filter, linearization, switching Kalman filter, assumed density filtering, DBNs, BK
- Causality
- ... and this is just the beginning!

Quick overview of some hot topics
- Conditional Random Fields
- Maximum Margin Markov Networks
- Relational Probabilistic Models (e.g., the parameter-sharing model you learned for a recommender system in HW1)
- Hierarchical Bayesian Models (e.g., Khalid's presentation on Dirichlet Processes)
- Influence Diagrams

Generative vs. discriminative models: intuition
- Want to learn h: X -> Y
  - X: features; Y: set of variables
- Generative classifier (e.g., Naive Bayes, Markov networks):
  - Assume some functional form for P(X|Y), P(Y)
  - Estimate the parameters of P(X|Y), P(Y) directly from training data
  - Use Bayes rule to calculate P(Y | X = x)
  - This is a generative model: computation of P(Y|X) is indirect, through Bayes rule, but it can generate a sample of the data, since P(X) = sum_y P(y) P(X|y)
- Discriminative classifiers (e.g., Logistic Regression, Conditional Random Fields):
  - Assume some functional form for P(Y|X)
  - Estimate the parameters of P(Y|X) directly from training data
  - This is the discriminative model: it learns P(Y|X) directly, which can have lower sample complexity, but it cannot produce a sample of the data, because P(X) is not available

Conditional Random Fields [Lafferty et al. '01]
- Define a Markov network using a log-linear model for P(Y|X)
- Features, e.g., for a pairwise CRF (a sketch follows)
- Learning: maximize the conditional log-likelihood
  - A sum of log-likelihoods you know and love
  - The learning algorithm is based on gradient descent, very similar to learning MNs
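As a sketch of what such a log-linear model looks like for a pairwise CRF (the symbols f_k and w_k are generic feature functions and weights, not necessarily the lecture's notation):

```latex
% Pairwise CRF: a log-linear model for P(Y | X), with features f_k
% defined on pairs of labels (y_i, y_j) and the observed input x.
\[
  P(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\Big( \sum_{(i,j)} \sum_k w_k\, f_k(y_i, y_j, \mathbf{x}) \Big),
  \qquad
  Z(\mathbf{x}) = \sum_{\mathbf{y}'}
    \exp\Big( \sum_{(i,j)} \sum_k w_k\, f_k(y'_i, y'_j, \mathbf{x}) \Big).
\]
% Gradient of the conditional log-likelihood for one example:
% observed feature counts minus expected counts under the model.
\[
  \frac{\partial}{\partial w_k} \log P(\mathbf{y} \mid \mathbf{x})
  = \sum_{(i,j)} f_k(y_i, y_j, \mathbf{x})
  - \mathbb{E}_{P(\mathbf{Y} \mid \mathbf{x})}
    \Big[ \sum_{(i,j)} f_k(Y_i, Y_j, \mathbf{x}) \Big].
\]
```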
Max (conditional) likelihood
- Data: D = { (x_1, t(x_1)), ..., (x_m, t(x_m)) }
- Estimation; classification via f(x, y)
- We don't need to learn the entire distribution!

OCR example
- We want: argmax over words of w^T f(x, word) to equal "brace" for the input image x
- Equivalently: w^T f(x, brace) > w^T f(x, aaaaa), w^T f(x, brace) > w^T f(x, aaaab), ..., w^T f(x, brace) > w^T f(x, zzzzz) -- a lot of constraints!

Max margin estimation
- Goal: find w such that for all x in D and all y != t(x):
  w^T f(x, t(x)) > w^T f(x, y), i.e., w^T [f(x, t(x)) - f(x, y)] > 0
- Maximize the margin
  - The gain over y grows with the number of mistakes in y, Delta-t_x(y):
    w^T Delta-f_x(craze) >= 2 gamma, since Delta-t_x(craze) = 2;
    w^T Delta-f_x(zzzzz) >= 5 gamma, since Delta-t_x(zzzzz) = 5
  - (See the margin-scaling constraints sketched below.)

M3Ns: Maximum Margin Markov Networks [Taskar et al. '03]
- Data: D = { (x_1, t(x_1)), ..., (x_m, t(x_m)) }
- Estimation; classification via f(x, y)
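A sketch of the margin-scaling idea with the slide's OCR numbers, writing Delta-f_x(y) = f(x, t(x)) - f(x, y) and Delta-t_x(y) for the number of per-character mistakes in y:

```latex
% Require the true labeling to beat each alternative y by a margin
% that grows with the number of mistakes in y:
\[
  \forall \mathbf{x} \in D,\ \forall \mathbf{y} \ne t(\mathbf{x}):\qquad
  \mathbf{w}^{\top} \Delta f_{\mathbf{x}}(\mathbf{y})
  \;\ge\; \gamma\, \Delta t_{\mathbf{x}}(\mathbf{y}).
\]
% For t(x) = "brace": "craze" differs in 2 characters and "zzzzz" in 5,
% so w^T df_x(craze) >= 2*gamma and w^T df_x(zzzzz) >= 5*gamma.
```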
Propositional models and generalization
- Suppose you learn a model of social networks at CMU from Facebook data, to predict movie preferences. How would you apply it when new people ...
