Loopy Belief Propagation
Generalized Belief Propagation
Unifying Variational and GBP
Learning Parameters of MNs

Graphical Models – 10708
Carlos Guestrin
Carnegie Mellon University
November 10th, 2006

Outline:
- More details on Loopy BP
- An example of running loopy BP
- (Non-)Convergence of Loopy BP
- Loopy BP in Factor graphs
- What you need to know about loopy BP
- Announcements
- Loopy BP v. Clique trees: Two ends of a spectrum
- Generalized cluster graph
- Running intersection property
- Examples of cluster graphs
- Two cluster graphs satisfying RIP with different edge sets
- Generalized BP on cluster graphs satisfying RIP
- Cluster graph for Loopy BP
- What if the cluster graph doesn’t satisfy RIP
- Region graphs to the rescue
- Revisiting Mean-Fields
- Interpretation of energy functional
- Entropy of a tree distribution
- Loopy BP & Bethe approximation
- GBP & Kikuchi approximation
- What you need to know about GBP
- Learning Parameters of a BN
- Log Likelihood for MN
- Log Likelihood doesn’t decompose for MNs
- Derivative of Log Likelihood for MNs
- Iterative Proportional Fitting (IPF)
- What you need to know about learning MN parameters?

Readings: K&F: 11.3, 11.5; Yedidia et al.
paper from the class website; Chapter 9 - Jordan
10-708 – Carlos Guestrin 2006

More details on Loopy BP
- Numerical problem:
  - messages < 1 get multiplied together as we go around the loops
  - numbers can go to zero
  - normalize messages to sum to one: the normalizer $Z_{i \to j}$ doesn’t depend on $X_j$, so it doesn’t change the answer
- Computing node “beliefs” (estimates of probabilities)
[Figure: student network over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence]

An example of running loopy BP

(Non-)Convergence of Loopy BP
- Loopy BP can oscillate!
  - oscillations can be small
  - oscillations can be really bad!
- Typically:
  - if factors are closer to uniform, loopy BP does well (converges)
  - if factors are closer to deterministic, loopy BP doesn’t behave well
- One approach to help: damping messages
  - new message is the average of the old message and the new one
  - often better convergence
  - but, when damping is required to get convergence, the result is often bad
- graphs from Murphy et al. ’99

Loopy BP in Factor graphs
- What if we don’t have pairwise Markov nets?
  1. Transform to a pairwise MN
  2. Use Loopy BP on a factor graph
- Message example:
  - from node to factor
  - from factor to node
[Figure: factor graph over variables A, B, C, D, E with factors ABC, ABD, BDE, CDE]

Loopy BP in Factor graphs (continued)
- From node i to factor j: F(i) = factors whose scope includes $X_i$
- From factor j to node i: Scope[$\phi_j$] = $Y \cup \{X_i\}$
[Figure: factor graph over variables A, B, C, D, E with factors ABC, ABD, BDE, CDE]

What you need to know about loopy BP
- Application of belief propagation in loopy graphs
- Doesn’t always converge
  - damping can help
  - good message schedules can help (see book)
- If it converges, often to incorrect, but useful results
- Generalizes from pairwise Markov networks by using factor graphs

Announcements
- Monday’s special recitation: Pradeep Ravikumar on exciting new approximate inference algorithms

Loopy BP v. Clique trees: Two ends of a spectrum
[Figure: student network (Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence) next to a clique tree with cliques DIG, GJSL, HGJ, CD, GSI]

Generalized cluster graph
- Generalized cluster graph: for a set of factors F
  - undirected graph
  - each node i is associated with a cluster $C_i$
  - family preserving: for each factor $f_j \in F$, $\exists$ node i such that Scope[$f_j$] $\subseteq C_i$
  - each edge i – j is associated with a set of variables $S_{ij} \subseteq C_i \cap C_j$

Running intersection property
- (Generalized) running intersection property (RIP)
  - A cluster graph satisfies RIP if whenever $X \in C_i$ and $X \in C_j$ then $\exists$ one and only one path from $C_i$ to $C_j$ where $X \in S_{uv}$ for every edge (u,v) in the path

Examples of cluster graphs

Two cluster graphs satisfying RIP with different edge sets

Generalized BP on cluster graphs satisfying RIP
- Initialization:
  - Assign each factor $\phi$ to a clique $\alpha(\phi)$, Scope[$\phi$] $\subseteq C_{\alpha(\phi)}$
  - Initialize cliques
  - Initialize messages
- While not converged, send messages
- Belief

Cluster graph for Loopy BP
[Figure: student network over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence]

What if the cluster graph doesn’t satisfy RIP

Region graphs to the rescue
- Can address generalized cluster graphs that don’t satisfy RIP using region graphs: Yedidia et al. from class website
- Example in your homework!
- Hint – from Yedidia et al.:
  - Section 7 – defines region graphs
  - Section 9 – message passing on region graphs
  - Section 10 – an example that will help you a lot!
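
The loopy BP slides describe two practical fixes: normalizing each message so that products around loops do not underflow, and damping by averaging the old and new message. Below is a minimal Python sketch of both on a pairwise Markov network. The function names (`loopy_bp`, `exact_marginals`), the dictionary layout for potentials, the synchronous schedule, and the example numbers are all assumptions made for illustration, not taken from the slides.

```python
import itertools
import math


def loopy_bp(node_pot, edge_pot, damping=0.5, max_iters=200, tol=1e-9):
    """Sum-product loopy BP on a pairwise Markov network (illustrative sketch).

    node_pot: {i: [phi_i(0), phi_i(1), ...]}
    edge_pot: {(i, j): table} with table[xi][xj] = phi_ij(xi, xj)
    damping:  weight kept on the old message (0.5 = plain average, 0.0 = none)
    Returns (beliefs, converged).
    """
    nbrs = {i: [] for i in node_pot}
    for i, j in edge_pot:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def phi(i, j, xi, xj):
        # Look the pairwise potential up in whichever direction it was stored.
        if (i, j) in edge_pot:
            return edge_pot[(i, j)][xi][xj]
        return edge_pot[(j, i)][xj][xi]

    # msgs[(i, j)][xj] plays the role of delta_{i->j}(xj); start uniform.
    msgs = {}
    for i, j in edge_pot:
        msgs[(i, j)] = [1.0 / len(node_pot[j])] * len(node_pot[j])
        msgs[(j, i)] = [1.0 / len(node_pot[i])] * len(node_pot[i])

    converged = False
    for _ in range(max_iters):
        new_msgs = {}
        for i, j in msgs:
            out = []
            for xj in range(len(node_pot[j])):
                total = 0.0
                for xi in range(len(node_pot[i])):
                    prod = node_pot[i][xi] * phi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:
                            prod *= msgs[(k, i)][xi]
                    total += prod
                out.append(total)
            z = sum(out)  # Z_{i->j}: the same for every xj, so it only rescales
            out = [v / z for v in out]
            # Damping: convex combination of the old message and the new one.
            new_msgs[(i, j)] = [damping * o + (1.0 - damping) * n
                                for o, n in zip(msgs[(i, j)], out)]
        delta = max(abs(a - b) for key in msgs
                    for a, b in zip(msgs[key], new_msgs[key]))
        msgs = new_msgs
        if delta < tol:
            converged = True
            break

    # Node beliefs: local potential times all incoming messages, normalized.
    beliefs = {}
    for i in node_pot:
        b = [node_pot[i][xi] * math.prod(msgs[(k, i)][xi] for k in nbrs[i])
             for xi in range(len(node_pot[i]))]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs, converged


def exact_marginals(node_pot, edge_pot):
    """Brute-force node marginals, only feasible for tiny networks."""
    nodes = sorted(node_pot)
    marg = {i: [0.0] * len(node_pot[i]) for i in nodes}
    for values in itertools.product(*(range(len(node_pot[i])) for i in nodes)):
        x = dict(zip(nodes, values))
        w = math.prod(node_pot[i][x[i]] for i in nodes)
        w *= math.prod(t[x[i]][x[j]] for (i, j), t in edge_pot.items())
        for i in nodes:
            marg[i][x[i]] += w
    z = sum(marg[nodes[0]])
    return {i: [v / z for v in marg[i]] for i in nodes}


# Hypothetical three-node example: a chain (tree) and the same chain plus one loop edge.
node_pot = {0: [1.0, 2.0], 1: [1.0, 1.0], 2: [3.0, 1.0]}
chain = {(0, 1): [[2.0, 1.0], [1.0, 2.0]], (1, 2): [[2.0, 1.0], [1.0, 2.0]]}
triangle = dict(chain)
triangle[(0, 2)] = [[1.2, 1.0], [1.0, 1.2]]
print(loopy_bp(node_pot, chain))
print(loopy_bp(node_pot, triangle))
```

On the chain, which has no loops, the beliefs should agree with `exact_marginals` up to the convergence tolerance; on the triangle they are only approximations, which matches the slide's remark that loopy BP often converges to incorrect but useful results.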
Revisiting Mean-Fields
- Choice of Q
- Optimization problem

Interpretation of energy functional
- Energy functional
- Exact if P=Q
- View problem as an approximation of the entropy term

Entropy of a tree distribution
- Entropy term
- Joint distribution
- Decomposing entropy term
- More generally: $d_i$ = number of neighbors of $X_i$
[Figure: student network over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence]

Loopy BP & Bethe approximation
- Energy functional
- Bethe approximation of free energy: use the entropy for trees, but on loopy graphs
- Theorem: If Loopy BP converges, the resulting pairwise (ij) and single-node (i) beliefs are a stationary point (usually a local maximum) of the Bethe free energy!
[Figure: student network over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence]

GBP & Kikuchi approximation
- Exact free energy: Junction Tree
- Bethe free energy
- Kikuchi approximation: Generalized cluster graph
  - spectrum from Bethe to exact
  - entropy terms weighted by counting numbers
  - see Yedidia et al.
- Theorem: If GBP converges, the resulting beliefs over the clusters $C_i$ are a stationary point (usually a local maximum) of the Kikuchi free energy!
[Figure: student network next to a clique tree with cliques DIG, GJSL, HGJ, CD, GSI]

What you need to know about GBP
- Spectrum between Loopy BP & Junction Trees: more computation, but typically better answers
- If the cluster graph satisfies RIP, the equations are very simple
- General setting, slightly trickier
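
The tree-entropy and Bethe slides rely on a decomposition whose formula did not survive extraction. Assuming it is the standard form H = sum over edges (i,j) of H(X_i, X_j) minus sum over nodes i of (d_i - 1) H(X_i), with d_i the number of neighbors of X_i, the following Python sketch checks it numerically. The helper names and the conditional probability tables are hypothetical; the check uses the true marginals of a small chain rather than BP beliefs.

```python
import itertools
import math


def entropy(dist):
    """Shannon entropy (in nats) of a {outcome: probability} table."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def marginal(joint, keep):
    """Marginalize a joint {assignment tuple: prob} onto the indices in keep."""
    out = {}
    for x, p in joint.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def bethe_entropy(joint, edges, n_vars):
    """sum_edges H(Xi, Xj) - sum_i (d_i - 1) H(Xi), using the true marginals."""
    degree = [0] * n_vars
    total = 0.0
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        total += entropy(marginal(joint, (i, j)))
    for i in range(n_vars):
        total -= (degree[i] - 1) * entropy(marginal(joint, (i,)))
    return total


def chain_joint():
    """Hypothetical tree-structured joint P(x0) P(x1|x0) P(x2|x1), binary variables."""
    p0 = [0.3, 0.7]
    p1_given_0 = [[0.9, 0.1], [0.2, 0.8]]
    p2_given_1 = [[0.6, 0.4], [0.25, 0.75]]
    return {
        (a, b, c): p0[a] * p1_given_0[a][b] * p2_given_1[b][c]
        for a, b, c in itertools.product(range(2), repeat=3)
    }


joint = chain_joint()
exact = entropy(joint)
# Edge set matching the true tree structure: the decomposition is exact.
on_tree = bethe_entropy(joint, edges=[(0, 1), (1, 2)], n_vars=3)
# Same formula with an extra edge (a loop): only an approximation.
on_loop = bethe_entropy(joint, edges=[(0, 1), (1, 2), (0, 2)], n_vars=3)
print(exact, on_tree, on_loop)
```

With the tree edge set the two entropies agree to floating-point precision, while adding the loop edge changes the value. That gap is the sense in which the Bethe approximation applies the tree entropy to a loopy graph.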