Dynamic Bayesian Networks: Beyond 10708
Graphical Models – 10708
Carlos Guestrin
Carnegie Mellon University
December 1st, 2006
Readings: K&F: 13.1, 13.2, 13.3

Dynamic Bayesian network (DBN)
- An HMM is defined by:
  - Transition model P(X(t+1) | X(t))
  - Observation model P(O(t) | X(t))
  - Starting state distribution P(X(0))
- DBN: use a Bayes net to represent each of these compactly
  - Starting state distribution P(X(0)) is a BN
  - (Silly) example: performance in grad school
    - DBN variables: Happiness, Productivity, Hirability, Fame
    - Observations: Paper, Schmooze

Unrolled DBN
- Start with P(X(0))
- For each time step, add variables as defined by the 2-TBN

"Sparse" DBN and fast inference
- Does a "sparse" DBN give fast inference?
[Figure: DBN over variables A–F unrolled across time steps t, t+1, t+2, t+3]

Even after one time step!!
- What happens when we marginalize out time t?
[Figure: DBN over variables A–F at times t and t+1]

"Sparse" DBN and fast inference 2
- Does a "sparse" DBN give fast inference? Almost!
- A structured representation of the belief often yields a good approximation
[Figure: DBN over variables A–F unrolled across time steps t through t+3]

BK Algorithm for approximate DBN inference [Boyen, Koller ’98]
- Assumed density filtering:
  - Choose a factored representation P̂ for the belief state
  - At every time step, when the belief is not representable with P̂, project it into the representation

A simple example of BK: Fully-Factorized Distribution
- Assumed density: fully factorized, P̂(X(t+1)) = ∏_i P̂(X_i(t+1))
- Compare the true P(X(t+1)) with this assumed density

Computing Fully-Factorized Distribution at time t+1
- Assumed density for P̂(X(t)): fully factorized
- Compute each marginal P̂(X_i(t+1)) from the factored time t belief and the 2-TBN

General case for BK: Junction Tree Represents Distribution
- Assumed density: factors over the clusters of a junction tree
- Compare the true P(X(t+1)) with this assumed density

Computing factored belief state in the next time step
- Introduce observations in the current time step
- Use the junction tree to calibrate the time t beliefs
- Compute the t+1 belief and project it into the approximate belief state
  - Marginalize into the desired factors; this corresponds to a KL projection
- Equivalent to computing the marginals over the factors directly
  - For each factor in the t+1 step belief, use variable elimination

Error accumulation
- Each time step, the projection introduces error
- Will the error add up, causing unbounded approximation error as
t → ∞?

Contraction in Markov process

BK Theorem
- The error does not grow unboundedly!
- Theorem: if the Markov chain contracts at a rate of γ (usually very small), and the assumed density projection at each time step has error bounded by ε (usually large), then the expected error at every iteration is bounded by ε/γ.

Example – BAT network [Forbes et al.]

BK results [Boyen, Koller ’98]

Thin Junction Tree Filters [Paskin ’03]
- BK assumes fixed approximation clusters
- TJTF adapts the clusters over time, attempting to minimize projection error

Hybrid DBN (many continuous and discrete variables)
- DBN with a large number of discrete and continuous variables
- The number of mixture-of-Gaussians components blows up in one time step!
- Many smart tricks are needed; e.g., see the Lerner thesis
- Reverse Water Gas Shift System (RWGS) [Lerner et al. ’02]

DBN summary
- DBNs
  - Factored representation of HMMs / Kalman filters
  - A sparse representation does not lead to efficient inference
- Assumed density filtering
  - BK: a factored belief state representation is the assumed density
  - Contraction guarantees that the error does not blow up (but it could still be large)
  - Thin junction tree filters adapt the assumed density over time
  - Extensions exist for hybrid DBNs

Final
- Out: later today
- Due: December 10th at NOON (STRICT DEADLINE)
- Start early!!!

And the winners are…
- Popular Vote: Learning and prediction of emotion components in a conversation using dynamic Bayesian networks (Ekaterina Spriggs)
- Instructors’ Choice:
  - Temporal model for Enron email dataset (Leman Akoglu and Seungil Huh)
  - Learning low-treewidth CRFs via graph cuts (Dafna Shahaf)

This semester…
Bayesian networks, Markov networks, factor graphs, decomposable models, junction trees, parameter learning, structure learning, semantics, exact inference, variable elimination, context-specific independence, approximate inference, sampling, importance sampling, MCMC, Gibbs, variational inference, loopy belief propagation, generalized belief propagation, Kikuchi, Bayesian learning, missing data, EM, Chow-Liu, IPF, Gaussian and hybrid models, discrete
and continuous variables, temporal and template models, Kalman filter, linearization, conditional random fields, assumed density filtering, DBNs, BK, causality, …
Just the beginning…

Quick overview of some hot topics...
- Maximum Margin Markov Networks
- Relational Probabilistic Models
- Influence Diagrams

Max (Conditional) Likelihood
- Data D: (x_1, t(x_1)), …, (x_m, t(x_m)), with features f(x, y)
- Estimation: maximize the conditional likelihood of the labels t(x_j) given the inputs x_j
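For a log-linear model P(y | x) ∝ exp(w · f(x, y)), maximizing the conditional likelihood by gradient ascent can be sketched as below. The gradient is the classic "empirical minus expected features" form; the feature map and two-point toy dataset here are made-up illustrations, not the slides' OCR example.

```python
import numpy as np

def features(x, y, num_classes):
    """Toy joint feature map f(x, y): copy x's features into class y's slot."""
    f = np.zeros(len(x) * num_classes)
    f[y * len(x):(y + 1) * len(x)] = x
    return f

def cond_log_lik_grad(w, data, num_classes):
    """Gradient of sum_j log P(y_j | x_j): empirical minus expected features."""
    grad = np.zeros_like(w)
    for x, y in data:
        scores = np.array([w @ features(x, c, num_classes) for c in range(num_classes)])
        p = np.exp(scores - scores.max())   # softmax, stabilized
        p /= p.sum()
        grad += features(x, y, num_classes)
        grad -= sum(p[c] * features(x, c, num_classes) for c in range(num_classes))
    return grad

# Hypothetical training set: two linearly separable classes.
data = [(np.array([1.0, 0.0]), 0), (np.array([0.0, 1.0]), 1)]
w = np.zeros(4)
for _ in range(200):
    w += 0.5 * cond_log_lik_grad(w, data, num_classes=2)

def predict(x):
    return int(np.argmax([w @ features(x, c, 2) for c in range(2)]))
```

After training, `predict` labels both toy points correctly; replacing the gradient objective with a margin-based one is the step the max-margin slides take next.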
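Looking back at the DBN part of the lecture, the BK loop (propagate exactly, then project onto the fully factorized family by keeping marginals) can be sketched on a toy two-variable DBN. The transition tables below are random stand-ins, not the BAT network or any model from the slides.

```python
import numpy as np

# Toy 2-TBN: binary X1, X2; joint state index = 2*x1 + x2.
# Each Xi' depends on both parents, so the exact belief de-factorizes.
rng = np.random.default_rng(0)
T1 = rng.dirichlet(np.ones(2), size=4)   # P(X1' | x1, x2), one row per joint state
T2 = rng.dirichlet(np.ones(2), size=4)   # P(X2' | x1, x2)

def exact_step(joint):
    """Propagate the exact joint belief P(X1, X2) one step through the 2-TBN."""
    new = np.zeros(4)
    for s in range(4):
        for x1 in range(2):
            for x2 in range(2):
                new[2 * x1 + x2] += joint[s] * T1[s, x1] * T2[s, x2]
    return new

def bk_step(m1, m2):
    """One BK step with a fully factorized assumed density: form the factored
    joint, propagate exactly, then keep only the marginals (for this family,
    marginalization is the KL projection)."""
    new = exact_step(np.outer(m1, m2).ravel())
    return (np.array([new[0] + new[1], new[2] + new[3]]),   # P-hat(X1')
            np.array([new[0] + new[2], new[1] + new[3]]))   # P-hat(X2')

# Run exact filtering and BK side by side from the same uniform prior.
exact = np.full(4, 0.25)
m1 = m2 = np.array([0.5, 0.5])
for _ in range(10):
    exact = exact_step(exact)
    m1, m2 = bk_step(m1, m2)

l1_error = np.abs(exact - np.outer(m1, m2).ravel()).sum()
```

Each BK projection introduces some error, but the chain's contraction keeps `l1_error` from accumulating without bound, which is the content of the BK theorem above.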