BN Semantics 3 – Now it’s personal!
Graphical Models – 10708
Carlos Guestrin
Carnegie Mellon University
September 22nd, 2008
Readings: K&F: 3.3, 3.4
10-708 – Carlos Guestrin 2006-2008

Independencies encoded in BN
- We said: all you need is the local Markov assumption: (Xi ⊥ NonDescendants_Xi | Pa_Xi)
- But then we talked about other (in)dependencies, e.g., explaining away
- What are the independencies encoded by a BN?
  - The only assumption is local Markov
  - But many others can be derived using the algebra of conditional independencies!

Understanding independencies in BNs – BNs with 3 nodes
- Local Markov Assumption: a variable X is independent of its non-descendants given its parents, and only its parents
- Indirect causal effect: X → Z → Y
- Indirect evidential effect: X ← Z ← Y
- Common cause: X ← Z → Y
- Common effect: X → Z ← Y

Understanding independencies in BNs – Some examples
(figure: an example BN over variables A, B, C, D, E, F, G, H, I, J, K)

Understanding independencies in BNs – Some more examples
(figure: the same example BN over A–K)

An active trail – Example
(figure: a BN over A, B, C, D, E, F, F′, F′′, G, H)
When are A and H independent?

Active trails formalized
A trail X1 – X2 – ··· – Xk is an active trail when variables O ⊆ {X1, …, Xn} are observed if, for each consecutive triplet in the trail, one of the following holds:
- Xi-1 → Xi → Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 ← Xi ← Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 ← Xi → Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 → Xi ← Xi+1, and Xi is observed (Xi ∈ O), or one of its descendants is

Active trails and independence?
- Theorem: variables Xi and Xj are independent given Z ⊆ {X1, …, Xn} if there is no active trail between Xi and Xj when the variables in Z are observed
(figure: the example BN over A–K again)

More generally: Soundness of d-separation
- Given BN structure G, the set of independence assertions obtained by d-separation is: I(G) = {(X ⊥ Y | Z) : d-sep_G(X; Y | Z)}
- Theorem (soundness of d-separation): if P factorizes over G, then I(G) ⊆ I(P)
- Interpretation: d-separation only captures true independencies
- Proof discussed when we talk about undirected models

Existence of dependency when not d-separated
- Theorem: if X and Y are not d-separated given Z, then X and Y are dependent given Z under some P that factorizes over G
- Proof sketch:
  - Choose an active trail between X and Y given Z
  - Make this trail dependent
  - Make all else uniform (independent) to avoid “canceling out” the influence

More generally: Completeness of d-separation
- Theorem (completeness of d-separation): for “almost all” distributions P that factorize over G, we have I(G) = I(P)
- “Almost all” distributions: except for a set of measure zero of parameterizations of the CPTs (assuming no finite set of parameterizations has positive measure)
- This means that if X and Y are not d-separated given Z, then ¬(X ⊥ Y | Z)
- Proof sketch given in lecture for a very simple case

Interpretation of completeness
- Theorem (completeness of d-separation): for “almost all” distributions P that factorize over G, we have I(G) = I(P)
- The BN graph is usually sufficient to capture all independence properties of the distribution!
- But only for complete independence: P ⊨ (X=x ⊥ Y=y | Z=z), ∀ x ∈ Val(X), y ∈ Val(Y), z ∈ Val(Z)
- Often we have context-specific independence (CSI): ∃ x ∈ Val(X), y ∈ Val(Y), z ∈ Val(Z): P ⊨ (X=x ⊥ Y=y | Z=z)
- Many factors may affect your grade; but if you are a frequentist, all other factors are irrelevant

Algorithm for d-separation
- How do I check if X and Y are d-separated given Z?
- There can be exponentially many trails between X and Y
- A two-pass linear-time algorithm finds all d-separations for X:
  1. Upward pass: mark Z and the ancestors of Z
  2. Breadth-first traversal from X; stop the traversal at a node if the trail is “blocked”
- (Some tricky details apply – see the reading)

What you need to know
- d-separation and independence
- A sound procedure for finding independencies
- Existence of distributions with these independencies
- (Almost) all independencies can be read directly from the graph, without looking at the CPTs

Announcements
- Homework 1:
  - Due next Wednesday – beginning of class!
  - It’s hard – start early, ask questions
- Audit policy: no sitting in, official auditors only – see the course website

Building BNs from independence properties
- From d-separation we learned:
  - Start from the local Markov assumptions and obtain all independence assumptions encoded by the graph
  - For most P’s that factorize over G, I(G) = I(P)
  - All of this discussion was for a given G that is an I-map for P
- Now, give me a P – how can I get a G?
  - i.e., give me the independence assumptions entailed by P
  - Many G’s are “equivalent” – how do I represent this?
- Most of this discussion is not about practical algorithms, but about useful concepts that will be used by practical algorithms
  - Practical algorithms next time

Minimal I-maps
- One option:
  - G is an I-map for P
  - G is as simple as possible
- G is a minimal I-map for P if deleting any edge from G makes it no longer an I-map
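The two-pass test from the “Algorithm for d-separation” slide can be sketched in code. This is a rough illustration of the reachability idea (an upward pass marking Z and its ancestors, then a direction-aware breadth-first traversal from X), not the exact algorithm from the reading; the `parents`-dict graph representation and the function name are my own assumptions.

```python
from collections import deque

def d_separated(parents, X, Y, Z):
    """Return True iff X and Y are d-separated given observed set Z.

    parents: dict mapping each node to the list of its parents (a DAG).
    """
    children = {n: [] for n in parents}
    for n, ps in parents.items():
        for p in ps:
            children[p].append(n)

    # Pass 1 ("upward pass"): mark Z and all ancestors of Z, so the test
    # "Xi or one of its descendants is observed" becomes an O(1) lookup.
    anc_of_Z = set()
    stack = list(Z)
    while stack:
        n = stack.pop()
        if n not in anc_of_Z:
            anc_of_Z.add(n)
            stack.extend(parents[n])

    # Pass 2: BFS over (node, direction) pairs starting from X.
    # 'up' = arrived from a child; 'down' = arrived from a parent.
    visited = set()
    queue = deque([(X, 'up')])
    while queue:
        n, d = queue.popleft()
        if (n, d) in visited:
            continue
        visited.add((n, d))
        if n not in Z and n == Y:
            return False  # an active trail reaches Y
        if d == 'up' and n not in Z:
            # Chain or fork through an unobserved node: go both ways.
            for p in parents[n]:
                queue.append((p, 'up'))
            for c in children[n]:
                queue.append((c, 'down'))
        elif d == 'down':
            if n not in Z:  # chain through an unobserved node
                for c in children[n]:
                    queue.append((c, 'down'))
            if n in anc_of_Z:  # v-structure activated by an observed descendant
                for p in parents[n]:
                    queue.append((p, 'up'))
    return True  # no active trail: d-separated
```

The (node, direction) visited set is what keeps the traversal linear even when exponentially many trails exist between X and Y.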
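The explaining-away behavior behind the common-effect (v-structure) case can also be checked numerically. The toy model below is my own construction, not one from the lecture: A and B are independent coin flips and C is their logical OR, so A ⊥ B marginally, but observing C makes them dependent.

```python
import itertools

# Hypothetical CPTs for the v-structure A -> C <- B:
pA = {0: 0.5, 1: 0.5}
pB = {0: 0.5, 1: 0.5}

def pC_given(a, b, c):
    # C is deterministic: the OR of its parents.
    return 1.0 if c == (a | b) else 0.0

def joint(a, b, c):
    # BN factorization: P(A) P(B) P(C | A, B)
    return pA[a] * pB[b] * pC_given(a, b, c)

def cond(pred, given):
    """P(pred | given), by brute-force summation over the joint."""
    states = list(itertools.product([0, 1], repeat=3))
    num = sum(joint(a, b, c) for a, b, c in states if pred(a, b, c) and given(a, b, c))
    den = sum(joint(a, b, c) for a, b, c in states if given(a, b, c))
    return num / den

# Marginally, B tells us nothing about A:
print(cond(lambda a, b, c: a == 1, lambda a, b, c: True))    # 0.5
print(cond(lambda a, b, c: a == 1, lambda a, b, c: b == 1))  # 0.5
# But once the common effect C is observed, B "explains away" A:
print(cond(lambda a, b, c: a == 1, lambda a, b, c: c == 1))            # 2/3
print(cond(lambda a, b, c: a == 1, lambda a, b, c: c == 1 and b == 1))  # 0.5
```

P(A=1 | C=1) = 2/3 but P(A=1 | C=1, B=1) = 1/2: learning B=1 lowers the probability of A=1, which is exactly the active v-structure (Xi-1 → Xi ← Xi+1 with Xi observed) from the active-trail definition.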