Machine learning: lecture 23Tommi S. JaakkolaMIT [email protected]• Course evaluations ... on-going• Project submission (Friday Dec 3):– only electronic submissions (pdf or ps); see the coursewebsite– if you need an extension (and have a reason), you need toask. Late submissions are not possible otherwise.• Final exam, in class (Wed Dec 8):– a part of the lecture on Monday Dec 6 will be review– comprehensive (covers all the course material) but theemphasis will be on the material since the midterm– as promised, EM and HMMs will be on the exam– open book, laptops fine if not connectedTommi Jaakkola, MIT CSAIL 2Exact inference• All exact inference algorithms for Bayesian networks performessentially the same calculations but operate on differentreprese ntations• The junction tree algorithm is a simple message passingalgorithm over clusters of variablesPreliminary steps:1. transform the Baye sian network into an undirected modelvia moralization (“marry parents”)2. triangulate the resulting undirected graph (add edges)3. identify the cliques (clusters) of the resulting triangulatedgraph4. construct the junction tree from the cliquesTommi Jaakkola, MIT CSAIL 3Exact inference: preliminary steps• Moralizationx4x1x2x3x5original graphTommi Jaakkola, MIT CSAIL 4Exact inference: preliminary steps• Moralizationx4x1x2x3x5x4x1x2x3x5original graph “marry” parentsTommi Jaakkola, MIT CSAIL 5Exact inference: preliminary steps• Moralizationx4x1x2x3x5x4x1x2x3x5x4x1x2x3x5original graph “marry” parents moral graphTommi Jaakkola, MIT CSAIL 6Exact inference: preliminary steps• Moralizationx4x1x2x3x5x4x1x2x3x5x4x1x2x3x5original graph “marry” parents moral graph• Triangulation (add edges so that any cycle of four or morenode s has a “chord”)x4x1x2x3x5already triangulatedTommi Jaakkola, MIT CSAIL 7Exact inference: preliminary steps• Moralizationx4x1x2x3x5x4x1x2x3x5x4x1x2x3x5original graph “marry” parents moral graph• Triangulation (add edges so that any cycle of four or morenode s has a “chord”)x4x1x2x3x5x4x1x2x3x5already triangulated not triangulatedTommi Jaakkola, MIT CSAIL 8Exact inference: preliminary steps• Moralizationx4x1x2x3x5x4x1x2x3x5x4x1x2x3x5original graph “marry” parents moral graph• Triangulation (add edges so that any cycle of four or morenode s has a “chord”)x4x1x2x3x5x4x1x2x3x5x5x2x4x1x3already triangulated not triangulated not triangulatedTommi Jaakkola, MIT CSAIL 9Exact inference: preliminary steps cont’d• Find the maximal cliques of the triangulated graphc1x1x2x3x5x4c1= {x1, x2, x3}Tommi Jaakkola, MIT CSAIL 10Exact inference: preliminary steps cont’d• Find the maximal cliques of the triangulated graphc1x1x2x3x5x4c2x1x2x3x5x4c1c1= {x1, x2, x3} c2= {x2, x3, x4}Tommi Jaakkola, MIT CSAIL 11Exact inference: preliminary steps cont’d• Find the maximal cliques of the triangulated graphc1x1x2x3x5x4c2x1x2x3x5x4c1c3x1x2x3x5x4c1c2c1= {x1, x2, x3} c2= {x2, x3, x4} c3= {x3, x4, x5}Tommi Jaakkola, MIT CSAIL 12Exact inference: preliminary steps cont’d• Find the maximal cliques of the triangulated graphc1x1x2x3x5x4c2x1x2x3x5x4c1c3x1x2x3x5x4c1c2c1= {x1, x2, x3} c2= {x2, x3, x4} c3= {x3, x4, x5}• Clique trees and junction treesx3, x4, x5c1c2c3x2, x3, x4x1, x2, x3clique treeTommi Jaakkola, MIT CSAIL 13Exact inference: preliminary steps cont’d• Find the maximal cliques of the triangulated graphc1x1x2x3x5x4c2x1x2x3x5x4c1c3x1x2x3x5x4c1c2c1= {x1, x2, x3} c2= {x2, x3, x4} c3= {x3, x4, x5}• Clique trees and junction treesx3, x4, x5c1c2c3x2, x3, x4x1, x2, x3x3, x4x2, x3, x4c1c3x1, x2, x3x3, x4, x5c2s12s23x2, x3clique tree junction tree (with se parators)Tommi Jaakkola, MIT CSAIL 14Exact inference: potent ials• Associating graphs and potentialsP (x3|x1, x2)x1x2x5x4P (x4|x2)P (x5|x3, x4)x3P (x1)P (x2)x3, x4x2, x3, x4c1c3x1, x2, x3x3, x4, x5c2s12s23x2, x3original graph w/ probs junction tre eTommi Jaakkola, MIT CSAIL 15Exact inference: potent ials• Associating graphs and potentialsP (x3|x1, x2)x1x2x5x4P (x4|x2)P (x5|x3, x4)x3P (x1)P (x2)P (x4|x2)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s121original graph w/ probabilities junction tre e w/ probsψc1(x1, x2, x3) = P (x1)P (x2)P (x3|x1, x2)ψc2(x2, x3, x4) = P (x4|x2)ψc3(x3, x4, x5) = P (x5|x3, x4)ψs12(x2, x3) = 1 (separator)ψs23(x3, x4) = 1 (separator)Tommi Jaakkola, MIT CSAIL 16Exact inference: message passing• Select a root clique• Collect evidencem1→2(x2, x3)x2, x3, x4c1c3x1, x2, x3x3, x4, x5c2s12s23x2, x3x3, x4rootm3→2(x3, x4)• Distribute evidencem2→3(x3, x4)x2, x3, x4c1c3x1, x2, x3x3, x4, x5c2s12s23x2, x3x3, x4rootm2→1(x2, x3)Tommi Jaakkola, MIT CSAIL 17Exact inference: message passing• Collect evidenceP (x4|x2)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s121Tommi Jaakkola, MIT CSAIL 18Exact inference: message passing• Collect evidenceP (x4|x2)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s121Evaluate new separators:ψ0s12(x2, x3) =Xx1ψc1(x1, x2, x3) = P (x2, x3)ψ0s23(x3, x4) =Xx5ψc3(x3, x4, x5) = 1Tommi Jaakkola, MIT CSAIL 19Exact inference: message passing• Collect evidenceP (x4|x2)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s121Messages (not explicitly used in the algorithm):m1→2(x2, x3) =ψ0s12(x2, x3)ψs12(x2, x3)=P (x2, x3)1m3→2(x3, x4) =ψ0s23(x3, x4)ψs23(x3, x4)=11Tommi Jaakkola, MIT CSAIL 20Exact inference: message passing• Collect evidenceP (x4|x2)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s121Update clique potentials (based on messages):ψc2(x2, x3, x4) ←ψ0s12(x2, x3)ψs12(x2, x3)| {z }m1→2(x2,x3)·ψ0s23(x3, x4)ψs23(x3, x4)·| {z }m3→2(x3,x4)ψc2(x2, x3, x4)= P (x2, x3) · 1 · P (x4|x2) = P (x2, x3, x4)followed by ψs12← ψ0s12and ψs23← ψ0s23Tommi Jaakkola, MIT CSAIL 21Exact inference: message passing• Distribute evidenceP (x2, x3, x4)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s12P (x2, x3)Tommi Jaakkola, MIT CSAIL 22Exact inference: message passing• Distribute evidenceP (x2, x3, x4)x2, x3, x4c3x1, x2, x3x3, x4, x5c2s23x2, x3x3, x4c1P (x1)P (x2)P (x3|x1, x2)P (x5|x3, x4)1s12P (x2, x3)Evaluate new separators:ψ0s12(x2, x3) =Xx4ψc2(x2, x3, x4) = P (x2, x3)ψ0s23(x3, x4) =Xx2ψc2(x2, x3, x4) = P (x3,
View Full Document