Variational Inference
Amr Ahmed
Nov. 6th 2008

Outline
• Approximate Inference
• Variational inference formulation
  – Mean Field
    • Examples
  – Structured VI
    • Examples

Approximate Inference
• Exact inference is exponential in clique size
• Posterior is highly peaked
  – Can be approximated by a simpler distribution
• Formulate inference as an optimization problem
  – Define an objective: how good is Q
  – Define a family of simpler distributions to search over
  – Find Q* that best approximates P
• Today we will cover variational inference
  – Just one possible way of setting up such a formulation
• There are many other ways
  – Variants of loopy BP (later in the semester)
[Figure: all distributions, P, Q*, tractable family]

What is Variational Inference?
[Figure: network P over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence; network Q over Difficulty, SAT, Grade, Happy, Coherence, Letter, Intelligence]
• P
  – Posterior "hard" to compute
  – We know the CPTs (factors)
• Q
  – Tractable posterior family
  – Set CPTs (factors)
  – Get distribution, Q
• $Q^* = \arg\min_{Q \in \mathcal{Q}} \mathrm{KL}(Q \,\|\, P)$
• Q*
  – Fully instantiated distribution
  – Enables exact inference
    • Has low tree-width
    • Clique tree, variable elimination, etc.

VI Questions
• Which family to choose
  – Basically we want to remove some edges
    • Mean field: fully factorized
    • Structured: keep tractable sub-graphs
• How to carry out the optimization
• Assume P is a Markov network
  – Product of factors (that is all)
• $Q^* = \arg\min_{Q \in \mathcal{Q}} \mathrm{KL}(Q \,\|\, P)$

Mean-field Variational Inference
• P: posterior "hard" to compute; we know the CPTs (factors)
• Q: tractable posterior family; set CPTs (factors), get distribution Q
• $Q^* = \arg\min_{Q \in \mathcal{Q}} \mathrm{KL}(Q \,\|\, P)$
• Q* is a fully instantiated distribution and enables exact inference

D(q||p) for mean field – KL the reverse direction: cross-entropy term
10-708 – Carlos Guestrin 2006-2008
• p: $P(X) = \frac{1}{Z} \prod_i \phi_i(C_i)$
• q: $Q(X) = \prod_j q_j(x_j)$

The Energy Functional
• Theorem: $\ln Z = F[P, Q] + D(Q \,\|\, P)$
• Where the energy functional is $F[P, Q] = \sum_i E_Q[\ln \phi_i(C_i)] + H_Q(X)$
• Since $D(Q \,\|\, P) \ge 0$, you get a lower bound on $\ln Z$: $F[P, Q] \le \ln Z$
  – Maximizing $F[P, Q]$ tightens the bound and gives better probability estimates
• Our problem now is to maximize $F[P, Q]$ rather than minimize $D(Q \,\|\, P)$:
  $Q^* = \arg\max_Q F[P, Q]$ subject to $\sum_{x_j} q_j(x_j) = 1$ for each $j$,
  where $Q(X) = \prod_j q_j(x_j)$ is tractable by construction
• Theorem: Q is a stationary point of the mean-field approximation iff, for each $j$:
  $q_j(x_j) \propto \exp\Big\{ \sum_{\phi : X_j \in \mathrm{scope}[\phi]} E_{(U_\phi - \{X_j\}) \sim Q}\big[\ln \phi(U_\phi, x_j)\big] \Big\}$, with $U_\phi = \mathrm{scope}[\phi]$

Example
[Figure: variables X1 ... X12, shown once as P and once as Q]
• P: pairwise MN, $P(X) \propto \prod_{(i,j) \in E} \phi(X_i, X_j)$
• Q: fully factorized MN, $Q(X) = \prod_i q_i(X_i)$
– Given the factors in P, we want to get the factors for Q.
– Iterative procedure: fix all $q_{-i}$ (every marginal except $q_i$) and compute $q_i$ via the fixed-point equation above.
– Iterate until you reach a fixed point.

Example
• Fixed-point equation:
  $q_i(x_i) \propto \exp\Big\{ \sum_{\phi : X_i \in \mathrm{scope}[\phi]} E_{(U_\phi - \{X_i\}) \sim Q}\big[\ln \phi(U_\phi, x_i)\big] \Big\}$
• For X1, which shares factors with X2 and X5:
  $q_1(x_1) \propto \exp\Big\{ E_{(\{X_1, X_2\} - \{X_1\}) \sim Q}[\ln \phi(x_1, x_2)] + E_{(\{X_1, X_5\} - \{X_1\}) \sim Q}[\ln \phi(x_1, x_5)] \Big\}$
  $= \exp\Big\{ E_{X_2 \sim q_2}[\ln \phi(x_1, x_2)] + E_{X_5 \sim q_5}[\ln \phi(x_1, x_5)] \Big\}$
  $= \exp\Big\{ \sum_{x_2} q_2(x_2) \ln \phi(x_1, x_2) + \sum_{x_5} q_5(x_5) \ln \phi(x_1, x_5) \Big\}$
• In your homework

Intuitively
– $q_6(X_6)$ gets tied to $q_i(X_i)$ for every $X_i$ that appears in a factor with $X_6$ in P
– i.e. the fixed-point equations are tied according to P
– What $q_6(x_6)$ receives is the expectation, under these $q_i(X_i)$, of what each shared factor looks like
– Can be interpreted as message passing as well (but we won't cover this)

Example
[Figure: in Q, X6 with its neighbors X2, X5, X7, X10]
• From X2: $E_{X_2 \sim q_2}[\ln \phi(x_2, x_6)]$
• From X5: $E_{X_5 \sim q_5}[\ln \phi(x_5, x_6)]$
• From X7: $E_{X_7 \sim q_7}[\ln \phi(x_6, x_7)]$
• From X10: $E_{X_{10} \sim q_{10}}[\ln \phi(x_6, x_{10})]$

Outline
• Approximate Inference
• Variational inference formulation
  – Mean Field
    • Examples
  – Structured VI
    • Examples
• Inference in LDA (time permits)

Structured Variational Inference
[Figure: network P over Difficulty, SAT, Grade, Happy, Job, Coherence, Letter, Intelligence; network Q over Difficulty, SAT, Grade, Happy, Coherence, Letter, Intelligence]
• P
  – Posterior "hard" to compute
  – We know the CPTs (factors)
• Q
  – Tractable posterior family
  – Set CPT
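
The mean-field iteration from the Example slides (fix all other marginals, recompute one marginal from its fixed-point equation, repeat until nothing changes) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the lecture's own code: the 3x4 grid layout for X1..X12, the binary state space, the shared symmetric factor table `PHI`, and all function names are hypothetical choices made for illustration. It also checks the energy-functional bound by computing ln Z by brute force, which is only feasible because the grid is tiny.

```python
import itertools
import math

# Assumed layout: X1..X12 on a 3x4 grid (0-indexed here), binary variables.
ROWS, COLS = 3, 4

# Assumed shared, symmetric pairwise factor phi(x_i, x_j); values are illustrative.
PHI = [[2.0, 1.0], [1.0, 1.0]]


def grid_edges(rows, cols):
    """Return the undirected edges (i, j) of a rows x cols grid."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))
            if r + 1 < rows:
                edges.append((i, i + cols))
    return edges


def mean_field(edges, n, phi, sweeps=200, tol=1e-10):
    """Coordinate-wise mean-field updates until the marginals stop changing."""
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    k = len(phi)
    q = [[1.0 / k] * k for _ in range(n)]  # start from uniform marginals
    for _ in range(sweeps):
        delta = 0.0
        for i in range(n):
            # ln q_i(x_i) = sum over neighbours j of E_{q_j}[ln phi(x_i, x_j)] + const
            scores = [
                sum(q[j][xj] * math.log(phi[xi][xj])
                    for j in neighbors[i] for xj in range(k))
                for xi in range(k)
            ]
            m = max(scores)  # subtract the max for numerical stability
            unnorm = [math.exp(s - m) for s in scores]
            z = sum(unnorm)
            new_qi = [u / z for u in unnorm]
            delta = max(delta, max(abs(a - b) for a, b in zip(new_qi, q[i])))
            q[i] = new_qi
        if delta < tol:
            break
    return q


def energy_functional(edges, q, phi):
    """F[P, Q] = sum over edges of E_Q[ln phi] plus the entropy of Q."""
    k = len(phi)
    expected = sum(
        q[i][a] * q[j][b] * math.log(phi[a][b])
        for i, j in edges for a in range(k) for b in range(k)
    )
    entropy = -sum(p * math.log(p) for qi in q for p in qi if p > 0)
    return expected + entropy


def log_partition(edges, n, phi):
    """Brute-force ln Z by enumerating every joint assignment."""
    k = len(phi)
    z = 0.0
    for x in itertools.product(range(k), repeat=n):
        z += math.prod(phi[x[i]][x[j]] for i, j in edges)
    return math.log(z)


edges = grid_edges(ROWS, COLS)
q = mean_field(edges, ROWS * COLS, PHI)
F = energy_functional(edges, q, PHI)
lnZ = log_partition(edges, ROWS * COLS, PHI)
print(f"F[P,Q] = {F:.4f}  <=  ln Z = {lnZ:.4f}")
```

Each update touches only the neighbors of X_i, which is the "tied according to P" structure described under Intuitively. The gap between ln Z and F[P, Q] is the KL divergence D(Q||P) at the fixed point that was reached; a different initialization may reach a different fixed point.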