DOC PREVIEW
CMU CS 10708 - Mean Field and Variational Methods finishing off

This preview shows page 1-2-3-4-5 out of 14 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 14 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1 1 Mean Field and Variational Methods finishing off Graphical Models – 10708 Carlos Guestrin Carnegie Mellon University November 5th, 2008 Readings: K&F: 10.1, 10.5 10-708 – ©Carlos Guestrin 2006-2008 10-708 – ©Carlos Guestrin 2006-2008 22 10-708 – ©Carlos Guestrin 2006-2008 3 What you need to know so far  Goal:  Find an efficient distribution that is close to posterior  Distance:  measure distance in terms of KL divergence  Asymmetry of KL:  D(p||q) ≠ D(q||p)  Computing right KL is intractable, so we use the reverse KL 10-708 – ©Carlos Guestrin 2006-2008 4 Reverse KL & The Partition Function Back to the general case  Consider again the defn. of D(q||p):  p is Markov net PF  Theorem:  where energy functional: Difficulty SAT Grade Happy Job Coherence Letter Intelligence3 10-708 – ©Carlos Guestrin 2006-2008 5 Understanding Reverse KL, Energy Function & The Partition Function  Maximizing Energy Functional ⇔ Minimizing Reverse KL  Theorem: Energy Function is lower bound on partition function  Maximizing energy functional corresponds to search for tight lower bound on partition function 10-708 – ©Carlos Guestrin 2006-2008 6 Structured Variational Approximate Inference  Pick a family of distributions Q that allow for exact inference  e.g., fully factorized (mean field)  Find Q2Q that maximizes  For mean field4 10-708 – ©Carlos Guestrin 2006-2008 7 Optimization for mean field  Constrained optimization, solved via Lagrangian multiplier  9 λ, such that optimization equivalent to:  Take derivative, set to zero  Theorem: Q is a stationary point of mean field approximation iff for each i: 10-708 – ©Carlos Guestrin 2006-2008 8 Understanding fixed point equation Difficulty SAT Grade Happy Job Coherence Letter Intelligence5 10-708 – ©Carlos Guestrin 2006-2008 9 Simplifying fixed point equation Difficulty SAT Grade Happy Job Coherence Letter Intelligence 10-708 – ©Carlos Guestrin 2006-2008 10  Theorem: The fixed point: is equivalent to:  where the Scope[φj] = Uj [ {Xi} Qi only needs to consider factors that intersect Xi Difficulty SAT Grade Happy Job Coherence Letter Intelligence6 10-708 – ©Carlos Guestrin 2006-2008 11 There are many stationary points! 10-708 – ©Carlos Guestrin 2006-2008 12  Initialize Q (e.g., randomly or smartly)  Set all vars to unprocessed  Pick unprocessed var Xi  update Qi:  set var i as processed  if Qi changed  set neighbors of Xi to unprocessed  Guaranteed to converge Very simple approach for finding one stationary point Difficulty SAT Grade Happy Job Coherence Letter Intelligence7 10-708 – ©Carlos Guestrin 2006-2008 13 More general structured approximations  Mean field very naïve approximation  Consider more general form for Q  assumption: exact inference doable over Q  Theorem: stationary point of energy functional:  Very similar update rule Difficulty SAT Grade Happy Job Coherence Letter Intelligence 10-708 – ©Carlos Guestrin 2006-2008 14 Computing update rule for general case  Consider one φ: Difficulty SAT Grade Happy Job Coherence Letter Intelligence8 10-708 – ©Carlos Guestrin 2006-2008 15 Structured Variational update requires inference  Compute marginals wrt Q of cliques in original graph and cliques in new graph, for all cliques  What is a good way of computing all these marginals?  Potential updates:  sequential: compute marginals, update ψj, recompute marginals  parallel: compute marginals, update all ψ’s, recompute marginals 10-708 – ©Carlos Guestrin 2006-2008 16 What you need to know about variational methods  Structured Variational method:  select a form for approximate distribution  minimize reverse KL  Equivalent to maximizing energy functional  searching for a tight lower bound on the partition function  Many possible models for Q:  independent (mean field)  structured as a Markov net  cluster variational  Several subtleties outlined in the book9 17 Loopy Belief Propagation Graphical Models – 10708 Carlos Guestrin Carnegie Mellon University November 5th, 2008 Readings: K&F: 10.2, 10.3 10-708 – ©Carlos Guestrin 2006-2008 10-708 – ©Carlos Guestrin 2006-2008 18 Recall message passing over junction trees  Exact inference:  generate a junction tree  message passing over neighbors  inference exponential in size of clique Difficulty SAT Grade Happy Job Coherence Letter Intelligence DIG GJSL HGJ CD GSI10 10-708 – ©Carlos Guestrin 2006-2008 19 Belief Propagation on Tree Pairwise Markov Nets  Tree pairwise Markov net is a tree!!!   no need to create a junction tree  Message passing:  More general equation:  N(i) – neighbors of i in pairwise MN  Theorem: Converges to true probabilities: Difficulty SAT Grade Happy Job Coherence Letter Intelligence 10-708 – ©Carlos Guestrin 2006-2008 20 Loopy Belief Propagation on Pairwise Markov Nets  What if we apply BP in a graph with loops?  send messages between pairs of nodes in graph, and hope for the best  What happens?  evidence goes around the loops multiple times  may not converge  if it converges, usually overconfident about probability values  But often gives you reasonable, or at least useful answers  especially if you just care about the MPE rather than the actual probabilities Difficulty SAT Grade Happy Job Coherence Letter Intelligence11 10-708 – ©Carlos Guestrin 2006-2008 21 More details on Loopy BP  Numerical problem:  messages < 1 get multiplied together as we go around the loops  numbers can go to zero  normalize messages to one:  Zi!j doesn’t depend on Xj, so doesn’t change the answer  Computing node “beliefs” (estimates of probs.): Difficulty SAT Grade Happy Job Coherence Letter Intelligence 10-708 – ©Carlos Guestrin 2006-2008 22 An example of running loopy BP12 10-708 – ©Carlos Guestrin 2006-2008 23 Convergence  If you tried to send all messages, and beliefs haven’t changed (by much) ! converged Difficulty SAT Grade Happy Job Coherence Letter Intelligence 10-708 – ©Carlos Guestrin 2006-2008 24 (Non-)Convergence of Loopy BP  Loopy


View Full Document

CMU CS 10708 - Mean Field and Variational Methods finishing off

Documents in this Course
Lecture

Lecture

15 pages

Lecture

Lecture

25 pages

Lecture

Lecture

24 pages

causality

causality

53 pages

lecture11

lecture11

16 pages

Exam

Exam

15 pages

Notes

Notes

12 pages

lecture

lecture

18 pages

lecture

lecture

16 pages

Lecture

Lecture

17 pages

Lecture

Lecture

15 pages

Lecture

Lecture

17 pages

Lecture

Lecture

19 pages

Lecture

Lecture

42 pages

Lecture

Lecture

16 pages

r6

r6

22 pages

lecture

lecture

20 pages

lecture

lecture

35 pages

Lecture

Lecture

19 pages

Lecture

Lecture

21 pages

lecture

lecture

21 pages

lecture

lecture

13 pages

review

review

50 pages

Semantics

Semantics

30 pages

lecture21

lecture21

26 pages

MN-crf

MN-crf

20 pages

hw4

hw4

5 pages

lecture

lecture

12 pages

Lecture

Lecture

25 pages

Lecture

Lecture

25 pages

Lecture

Lecture

14 pages

Lecture

Lecture

15 pages

Load more
Download Mean Field and Variational Methods finishing off
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Mean Field and Variational Methods finishing off and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Mean Field and Variational Methods finishing off 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?