MIT 6.867 - Exact inference

Machine learning: lecture 23
Tommi S. Jaakkola, MIT CSAIL
[email protected]

• Exact inference (quickly)
  – message passing in junction trees
• Approximate inference
  – belief propagation
  – sampling
• Review for the final
  – what is important, what is not

Exact inference: key steps

• Bayesian network, moralization, triangulation
  [Figure: the five-variable network on x1, ..., x5 shown as the original graph, its moral graph, and the triangulated graph (here the moral graph is already triangulated)]
• Cliques, clique graph, and junction tree
  [Figure: cliques c1 = {x1, x2, x3}, c2 = {x2, x3, x4}, c3 = {x3, x4, x5}; the clique tree; and the junction tree with separators s12 = {x2, x3} and s23 = {x3, x4}]

Exact inference: potentials

• Associating graphs and potentials
  [Figure: the original graph annotated with P(x1), P(x2), P(x3|x1, x2), P(x4|x2), P(x5|x3, x4); the junction tree with clique potentials ψ_{c1} = P(x1) P(x2) P(x3|x1, x2), ψ_{c2} = P(x4|x2), ψ_{c3} = P(x5|x3, x4), and separator potentials ψ_{s12} = ψ_{s23} = 1]

  P(x1, ..., x5) = ψ_{c1}(x1, x2, x3) ψ_{c2}(x2, x3, x4) ψ_{c3}(x3, x4, x5) / [ψ_{s12}(x2, x3) ψ_{s23}(x3, x4)]

Exact inference: message passing

• Select a root clique (here c2)
• Collect evidence: messages m_{1→2}(x2, x3) and m_{3→2}(x3, x4) flow from c1 and c3 toward the root
• Distribute evidence: messages m_{2→1}(x2, x3) and m_{2→3}(x3, x4) flow from the root back to c1 and c3

Collect evidence. Evaluate new separators:

  ψ'_{s12}(x2, x3) = Σ_{x1} ψ_{c1}(x1, x2, x3) = P(x2, x3)
  ψ'_{s23}(x3, x4) = Σ_{x5} ψ_{c3}(x3, x4, x5) = 1

Messages (not explicitly used in the algorithm):

  m_{1→2}(x2, x3) = ψ'_{s12}(x2, x3) / ψ_{s12}(x2, x3) = P(x2, x3) / 1
  m_{3→2}(x3, x4) = ψ'_{s23}(x3, x4) / ψ_{s23}(x3, x4) = 1 / 1

Update clique potentials (based on the messages):

  ψ_{c2}(x2, x3, x4) ← m_{1→2}(x2, x3) · m_{3→2}(x3, x4) · ψ_{c2}(x2, x3, x4)
                     = P(x2, x3) · 1 · P(x4|x2) = P(x2, x3, x4)

followed by ψ_{s12} ← ψ'_{s12} and ψ_{s23} ← ψ'_{s23}.

Distribute evidence. Evaluate new separators:

  ψ'_{s12}(x2, x3) = Σ_{x4} ψ_{c2}(x2, x3, x4) = P(x2, x3)
  ψ'_{s23}(x3, x4) = Σ_{x2} ψ_{c2}(x2, x3, x4) = P(x3, x4)

Messages (not explicitly used in the algorithm):

  m_{2→1}(x2, x3) = ψ'_{s12}(x2, x3) / ψ_{s12}(x2, x3) = P(x2, x3) / P(x2, x3) = 1
  m_{2→3}(x3, x4) = ψ'_{s23}(x3, x4) / ψ_{s23}(x3, x4) = P(x3, x4) / 1

Update clique potentials (based on the messages):

  ψ_{c1}(x1, x2, x3) ← m_{2→1}(x2, x3) · ψ_{c1}(x1, x2, x3) = P(x1, x2, x3)
  ψ_{c3}(x3, x4, x5) ← m_{2→3}(x3, x4) · ψ_{c3}(x3, x4, x5) = P(x3, x4, x5)

followed by ψ_{s12} ← ψ'_{s12} and ψ_{s23} ← ψ'_{s23}.

Exact inference

• After the collect and distribute steps the marginal probabilities are stored locally at the clique potentials (and the separators):

  P(x1, ..., x5) = P(x1, x2, x3) P(x2, x3, x4) P(x3, x4, x5) / [P(x2, x3) P(x3, x4)]

More generally, the resulting potentials would be proportional to the posterior marginals, e.g., P(x1, x2, x3, data), which can be easily normalized.
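To make the collect/distribute arithmetic concrete, here is a minimal numerical sketch of the junction tree updates above for this particular three-clique chain. It is not code from the lecture: the binary variables, the randomly generated CPTs, and all names are assumptions for illustration. Potentials are stored as NumPy arrays, and a brute-force joint is used only to check the result.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_cpt(shape):
    """Random table for binary variables, normalized over its first axis."""
    t = rng.random(shape)
    return t / t.sum(axis=0, keepdims=True)

# Toy CPTs for binary x1..x5 (invented for illustration), indexed child-first.
p1 = random_cpt((2,))         # P(x1)
p2 = random_cpt((2,))         # P(x2)
p3 = random_cpt((2, 2, 2))    # P(x3 | x1, x2), indexed [x3, x1, x2]
p4 = random_cpt((2, 2))       # P(x4 | x2),     indexed [x4, x2]
p5 = random_cpt((2, 2, 2))    # P(x5 | x3, x4), indexed [x5, x3, x4]

# Initial clique potentials, axes in clique order:
# c1: (x1, x2, x3), c2: (x2, x3, x4), c3: (x3, x4, x5); separators start at 1.
psi_c1 = np.einsum('i,j,kij->ijk', p1, p2, p3)                # P(x1)P(x2)P(x3|x1,x2)
psi_c2 = np.broadcast_to(p4.T[:, None, :], (2, 2, 2)).copy()  # P(x4|x2)
psi_c3 = np.transpose(p5, (1, 2, 0)).copy()                   # P(x5|x3,x4)
psi_s12 = np.ones((2, 2))     # over (x2, x3)
psi_s23 = np.ones((2, 2))     # over (x3, x4)

# Collect evidence toward the root c2.
new_s12 = psi_c1.sum(axis=0)                 # sum out x1 -> P(x2, x3)
new_s23 = psi_c3.sum(axis=2)                 # sum out x5 -> all ones
psi_c2 *= (new_s12 / psi_s12)[:, :, None]    # message m_{1->2}(x2, x3)
psi_c2 *= (new_s23 / psi_s23)[None, :, :]    # message m_{3->2}(x3, x4)
psi_s12, psi_s23 = new_s12, new_s23

# Distribute evidence from the root c2.
new_s12 = psi_c2.sum(axis=2)                 # sum out x4 -> P(x2, x3)
new_s23 = psi_c2.sum(axis=0)                 # sum out x2 -> P(x3, x4)
psi_c1 *= (new_s12 / psi_s12)[None, :, :]    # message m_{2->1} = 1 here
psi_c3 *= (new_s23 / psi_s23)[:, :, None]    # message m_{2->3}(x3, x4)
psi_s12, psi_s23 = new_s12, new_s23

# Check against brute-force marginals of the full joint P(x1, ..., x5).
joint = np.einsum('i,j,kij,lj,mkl->ijklm', p1, p2, p3, p4, p5)
assert np.allclose(psi_c1, joint.sum(axis=(3, 4)))   # P(x1, x2, x3)
assert np.allclose(psi_c2, joint.sum(axis=(0, 4)))   # P(x2, x3, x4)
assert np.allclose(psi_c3, joint.sum(axis=(0, 1)))   # P(x3, x4, x5)
```

The final assertions confirm that after one collect and one distribute pass each clique potential equals the corresponding marginal, exactly as on the slides above.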
Outline

• Exact inference (quickly)
  – message passing in junction trees
• Approximate inference
  – belief propagation
  – sampling
• Review for the final
  – what is important, what is not

Approximate inference: motivation

• We cannot solve the medical diagnosis problem(s) with exact inference algorithms...
  [Figure: a two-layer diagnostic network, with a layer of disease variables d connected to a layer of finding variables f]
  – the largest clique has over 100 variables (the corresponding potential function or table would involve more than 2^100 elements)

Approximate inference: belief propagation

• The message passing algorithm is appropriate when the model is a (clique) tree
  – we need a unique path of influence between any (sets of) variables
• We can still apply the message passing algorithm even if the model is not a tree (message passing operations are defined locally)
  [Figure: a graph with a cycle over x1, x2, x3, x4, with messages m_{1→2}(x2) and m_{4→2}(x2) arriving at x2]
  – convergence?
  – accuracy?
• Properties of this "loopy" message passing:
  – a set of locally consistent messages (a fixed point of the algorithm) always exists
  – the accuracy of the resulting marginals is related to the length of the shortest cycle
  – stronger guarantees exist for finding most likely configurations of variables
• Works well in many large scale applications
  – decoding turbo (and other) codes, image processing, molecular networks, protein structure, etc.
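Since the lecture only poses the convergence and accuracy questions, here is a minimal sketch of loopy belief propagation on the smallest interesting case: a pairwise MRF whose four binary variables form a single cycle. The graph, the random pairwise potentials, and all names are assumptions for illustration; the messages follow the standard sum-product update m_{i→j}(x_j) = Σ_{x_i} ψ_{ij}(x_i, x_j) Π_{k∈N(i)\{j}} m_{k→i}(x_i).

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Pairwise MRF on a single cycle of binary variables: x1-x2-x3-x4-x1.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
psi = {e: rng.random((2, 2)) + 0.5 for e in edges}   # positive pairwise potentials

def pot(i, j):
    """Potential of edge {i, j}, with axes ordered (x_i, x_j)."""
    return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

neighbors = {i: [j for e in edges for j in e if i in e and j != i] for i in range(4)}

# Directed messages m[(i, j)] are functions of x_j; initialize uniformly.
m = {(i, j): np.ones(2) for i in range(4) for j in neighbors[i]}

for _ in range(100):                        # iterate to a (hoped-for) fixed point
    new = {}
    for (i, j) in m:
        incoming = np.ones(2)               # product over k in N(i) \ {j}
        for k in neighbors[i]:
            if k != j:
                incoming *= m[(k, i)]
        msg = pot(i, j).T @ incoming        # sum over x_i
        new[(i, j)] = msg / msg.sum()       # normalize for numerical stability
    if max(np.abs(new[key] - m[key]).max() for key in m) < 1e-10:
        m = new
        break
    m = new

# Beliefs b_i(x_i) ∝ Π_{k∈N(i)} m_{k→i}(x_i), compared with exact marginals.
for i in range(4):
    b = np.ones(2)
    for k in neighbors[i]:
        b *= m[(k, i)]
    b /= b.sum()
    p = np.zeros(2)                         # brute-force marginal
    for x in itertools.product([0, 1], repeat=4):
        p[x[i]] += np.prod([pot(a, c)[x[a], x[c]] for (a, c) in edges])
    p /= p.sum()
    print(i, b, p)
```

On a single cycle the iteration converges, and the beliefs are typically close to, but not exactly equal to, the true marginals, illustrating the accuracy caveat above.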
Approximate inference: sampling

• If we could draw samples x^t = {x1^t, x2^t, x3^t, x4^t} from P(x), we could easily and accurately evaluate any marginals:

  P(x1 = 0) ≈ (1/T) Σ_{t=1}^{T} δ(x1^t, 0)

where δ(x1^t, 0) = 1 whenever x1^t = 0 and zero otherwise.
• But it is hard to draw samples...

Simple remedy: importance sampling

• We can instead draw samples from a much simpler distribution Q(x) (e.g., where the variables may be independent) and evaluate marginals according to

  P(x1 = 0) = Σ_x P(x) δ(x1, 0)
            = Σ_x Q(x) [P(x)/Q(x)] δ(x1, 0)
            ≈ (1/T) Σ_{t=1}^{T} [P(x^t)/Q(x^t)] δ(x1^t, 0)

where the samples x^t are now drawn from Q(x).
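A minimal sketch of the importance sampling estimator just derived, under invented assumptions: the target P is an arbitrary normalized table over four binary variables, and the proposal Q is fully factored (uniform per variable).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy target P(x) over four binary variables (an arbitrary normalized table).
P = rng.random((2, 2, 2, 2))
P /= P.sum()

# Fully factored proposal Q(x) = prod_i q_i(x_i), here uniform per variable.
q = np.full((4, 2), 0.5)

T = 200_000
samples = np.stack([rng.choice(2, size=T, p=q[i]) for i in range(4)], axis=1)

# Importance weights w_t = P(x^t) / Q(x^t).
p_vals = P[samples[:, 0], samples[:, 1], samples[:, 2], samples[:, 3]]
q_vals = np.prod(q[np.arange(4), samples], axis=1)
w = p_vals / q_vals

# Estimate P(x1 = 0) as (1/T) * sum_t w_t * delta(x1^t, 0).
estimate = np.mean(w * (samples[:, 0] == 0))
exact = P[0].sum()
print(estimate, exact)   # the two should agree to a few decimal places
```

In practice P is often known only up to a normalization constant, in which case one divides by the sum of the weights instead of T (self-normalized importance sampling).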
