Clique Trees 3
Let's get BP right
Undirected Graphical Models
Here the couples get to swing!

Graphical Models – 10708
Carlos Guestrin
Carnegie Mellon University
October 25th, 2006

Readings:
K&F: 9.1, 9.2, 9.3, 9.4
K&F: 5.1, 5.2, 5.3, 5.4, 5.5, 5.6

10-708 – Carlos Guestrin 2006

Factor division

Let X and Y be disjoint sets of variables.
Consider two factors: φ1(X,Y) and φ2(Y).
Factor ψ(X,Y) = φ1(X,Y) / φ2(Y), with the convention 0/0 = 0.

Introducing message passing with division

Variable elimination (message passing with multiplication):
message: δ(i→j) = Σ_{Ci \ Sij} ψi · Π_{k ≠ j} δ(k→i)
belief: βi = ψi · Π_k δ(k→i)
Message passing with division:
message: δ(i→j) = ( Σ_{Ci \ Sij} βi ) / δ(j→i)
belief update: βj ← βj · δ(i→j)
(Clique tree figure: C1: CD, C2: SE, C3: GDS, C4: GJS)

Lauritzen-Spiegelhalter Algorithm (a.k.a. belief propagation)

Separator potentials μij: one per edge (the same in both directions); each holds the "last message"; initialized to 1.
Message i→j: what does i think the separator potential should be? σ(i→j) = Σ_{Ci \ Sij} βi.
Update belief for j: βj ← βj · σ(i→j) / μij (pushing j toward what i thinks about the separator).
Replace the separator potential: μij ← σ(i→j).
(Clique tree figure: C1: CD, C2: SE, C3: GDS, C4: GJS)

Clique tree invariant

Clique tree potential: the product of clique potentials divided by separator potentials:
π_T(X) = Π_i βi(Ci) / Π_{ij} μij(Sij)
Clique tree invariant: P(X) = π_T(X).

Belief propagation and clique tree invariant

Theorem: the invariant is maintained by the BP algorithm!
BP reparameterizes clique potentials and separator potentials.
At convergence, potentials and messages are marginal distributions.

Subtree correctness

A message from i to j is informed if all messages into i (other than the one from j) are informed.
This is a recursive definition (leaves always send informed messages).
Informed subtree: all incoming messages are informed.
Theorem: the potential of a connected informed subtree T' is the marginal over scope[T'].
Corollary: at convergence, the clique tree is calibrated:
βi = P(scope[Ci]),  μij = P(scope[Sij])

Clique trees versus VE

Clique tree advantages:
multi-query settings
incremental updates
pre-computation makes complexity explicit
Clique tree disadvantages:
space requirements – no factors are "deleted"
slower for a single query
local structure in factors may be lost when they are multiplied together into the initial clique potential

Clique tree summary

Solve marginal queries for all variables at only twice the cost of a query for a single variable.
Cliques correspond to maximal cliques in the induced graph.
Two message passing approaches:
VE (the one that multiplies messages)
BP (the one that divides by the old message)
Clique tree invariant: the clique tree potential is always the same; we are only reparameterizing clique potentials.
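The factor-division and Lauritzen-Spiegelhalter updates above can be sketched in code. This is a minimal illustration with a hypothetical table-based factor representation (a `(scope, table)` pair over binary variables), not the course's implementation; the clique and separator names are made up for the example.

```python
# Minimal table-based factors: (scope, table), where scope is a tuple of
# variable names and table maps assignment tuples to non-negative numbers.
# This representation and the toy cliques below are illustrative only.

def divide(f1, f2):
    """psi = phi1 / phi2 with the 0/0 = 0 convention from the slides.
    Assumes scope(f2) is a subset of scope(f1)."""
    scope1, t1 = f1
    scope2, t2 = f2
    idx = [scope1.index(v) for v in scope2]
    table = {}
    for x, num in t1.items():
        den = t2[tuple(x[i] for i in idx)]
        table[x] = 0.0 if num == 0 and den == 0 else num / den
    return scope1, table

def marginalize(f, keep):
    """Sum out every variable not in keep (the sum over Ci minus Sij)."""
    scope, t = f
    idx = [scope.index(v) for v in keep]
    table = {}
    for x, val in t.items():
        key = tuple(x[i] for i in idx)
        table[key] = table.get(key, 0.0) + val
    return tuple(keep), table

def ls_update(beta_i, beta_j, mu_ij, sep):
    """One Lauritzen-Spiegelhalter step i -> j:
    sigma = marginal of beta_i onto sep; beta_j *= sigma / mu_ij; mu_ij = sigma."""
    sigma = marginalize(beta_i, sep)
    delta = divide(sigma, mu_ij)            # divide out the last message
    scope_j, t_j = beta_j
    idx = [scope_j.index(v) for v in sep]
    new_t = {x: v * delta[1][tuple(x[i] for i in idx)] for x, v in t_j.items()}
    return (scope_j, new_t), sigma          # updated beta_j and new mu_ij

# Toy cliques Ci = {A,B}, Cj = {B,C} with separator Sij = {B}.
beta_i = (("A", "B"), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0})
beta_j = (("B", "C"), {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0})
mu_ij = (("B",), {(0,): 1.0, (1,): 1.0})    # separator initialized to 1
beta_j, mu_ij = ls_update(beta_i, beta_j, mu_ij, ("B",))
```

Because μij starts at 1, the first message divides by 1 and simply multiplies σ into βj; a second pass along the same edge would divide out this "last message" before applying the new one, which is exactly why the clique tree potential is only reparameterized, never changed.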
Constructing a clique tree for a BN:
from an elimination order
from a triangulated (chordal) graph
Running time is (only) exponential in the size of the largest clique.
Can solve exactly problems with thousands (or millions, or more) of variables, and cliques with tens of nodes (or fewer).

Announcements

Recitation tomorrow, don't miss it!!!
Khalid on Undirected Models

Swinging Couples revisited

There is no perfect map in BNs.
But an undirected model will be a perfect map.

Potentials (or Factors) in Swinging Couples

Computing probabilities in Markov networks v. BNs

In a BN, we can compute the probability of an instantiation by multiplying CPTs.
In a Markov network, we can only compute a ratio of probabilities directly.

Normalization for computing probabilities

To compute actual probabilities, we must compute the normalization constant (also called the partition function):
Z = Σ_{x1,...,xn} Π_i φi(Di)
Computing the partition function is hard: we must sum over all possible assignments.

Factorization in Markov networks

Given an undirected graph H over variables X = {X1,...,Xn}:
A distribution P factorizes over H if there exist subsets of variables D1 ⊆ X, …, Dm ⊆ X, such that the Di are fully connected in H, and non-negative potentials (or factors) φ1(D1),…,φm(Dm), also known as clique potentials, such that
P(X1,...,Xn) = (1/Z) Π_i φi(Di)
P is also called a Markov random field over H, or a Gibbs distribution over H.

Global Markov assumption in Markov networks

A path X1 – … – Xk is active, when a set of variables Z is observed, if none of the Xi ∈ {X1,…,Xk} are observed (i.e., none are part of Z).
Variables X are separated from Y given Z in graph H, written sepH(X;Y|Z), if there is no active path between any X ∈ X and any Y ∈ Y given Z.
The global Markov assumption for a Markov network H is:
I(H) = { (X ⊥ Y | Z) : sepH(X;Y|Z) }

The BN Representation Theorem

If the conditional independencies in the BN are a subset of the conditional independencies in P, we obtain the joint probability distribution:
P(X1,...,Xn) = Π_i P(Xi | Pa_Xi)
Important because: independencies are sufficient to obtain the BN structure G.
If the joint probability distribution factorizes as P(X1,...,Xn) = Π_i P(Xi | Pa_Xi), then the conditional independencies in the BN are a subset of the conditional independencies in P.
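The normalization point can be made concrete. The sketch below uses a hypothetical pairwise Markov network over three binary variables (the potentials and variable names are invented for illustration): the partition function Z requires summing the product of potentials over every assignment, while a ratio of probabilities needs no Z at all because the constant cancels.

```python
from itertools import product

# Hypothetical pairwise Markov network over binary variables A - B - C,
# with one potential per edge (illustrative numbers, not from the lecture).
phi_AB = {(0, 0): 10.0, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 10.0}
phi_BC = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 5.0}

def unnormalized(a, b, c):
    """Product of clique potentials: P(a,b,c) up to the constant Z."""
    return phi_AB[(a, b)] * phi_BC[(b, c)]

# Partition function: a sum over ALL assignments, exponential in #variables.
Z = sum(unnormalized(a, b, c) for a, b, c in product([0, 1], repeat=3))

# An actual probability needs Z ...
p = unnormalized(0, 0, 0) / Z
# ... but a ratio of probabilities does not, because Z cancels.
ratio = unnormalized(0, 0, 0) / unnormalized(0, 1, 1)
```

With 3 binary variables the sum has only 8 terms; with n variables it has 2^n, which is why computing Z (unlike computing a ratio) is hard in general.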
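Since a path in a Markov network is active only if none of its variables are observed, checking sepH(X;Y|Z) reduces to plain graph reachability after deleting the observed nodes Z. A minimal sketch, assuming a toy adjacency-dict encoding of the graph (the chain A - B - C below is hypothetical):

```python
from collections import deque

def separated(adj, X, Y, Z):
    """sep_H(X; Y | Z): every path in H from X to Y is blocked by an observed
    node in Z, i.e. X and Y are disconnected once Z is deleted from the graph."""
    blocked = set(Z)
    frontier = deque(x for x in X if x not in blocked)
    seen = set(frontier)
    while frontier:
        u = frontier.popleft()
        if u in Y:
            return False                 # found an active path
        for v in adj[u]:
            if v not in seen and v not in blocked:
                seen.add(v)
                frontier.append(v)
    return True

# Chain A - B - C (a hypothetical toy graph): observing B separates A from C.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
```

Here `separated(adj, {"A"}, {"C"}, {"B"})` holds, matching the global Markov assumption's reading that observing B renders A and C independent, while with Z empty the path A - B - C is active.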