Machine Learning — Srihari

Inference in Graphical Models
Sargur Srihari
[email protected]

Topics
1. Objective of inference algorithms
2. Analysis of complexity
3. Cases
   1. Bayes theorem inference
   2. Inference on a chain
   3. Trees
4. Factor graphs
5. Exact inference algorithms for tree graphs
   • Sum-product algorithm, max-sum algorithm
6. Exact inference in general graphs
7. Approximate inference: loopy belief propagation
8. Learning the graph structure

What is inference?
• Graphical models represent a joint probability distribution
• Types of inference tasks:
  1. Compute marginal probabilities
     – Conditional probabilities can be easily computed from the joint and the marginals
  2. Evaluate posterior distributions
     – Some of the nodes in the graph are clamped to observed values (X)
     – Compute the posterior distribution of one or more subsets of nodes (latent variables Z), i.e., p(Z|X)
  3. Compute maximum a posteriori (MAP) probabilities

Common BN Inference Problem
• Conditional probability query P(Y|E=e)
  – Y is a query variable, E are evidence variables:
        P(Y|E=e) = P(Y,e) / P(e)
  – With W = χ − Y − E (the remaining variables), the numerator is instantiated as
        P(y,e) = ∑_w P(y,e,w)        (1)
    Each term in the summation is an entry in the joint distribution
  – The denominator
        P(e) = ∑_y P(y,e)
    allows reusing the computation of (1)

Analysis of Complexity
• The approach of summing out the variables in the joint distribution is unsatisfactory
  – It returns us to exponential blow-up
  – The PGM was precisely designed to avoid this!
• The problem of inference in PGMs is NP-hard
  – It requires exponential time unless P = NP

Analysis of Exact Inference
• A BN is represented as tables, one per variable Xi, of size |Val({Xi} ∪ Pa_Xi)|
• Analysis of complexity usually applies to decision problems
• The natural decision version of the conditional probability problem:
  – BN-Pr-DP: Given a BN B over χ, a variable X ∈ χ, and a value x ∈ Val(X), decide whether P_B(X=x) > 0
  – This decision problem is NP-complete
• The original problem is
  – BN-Pr: Given a BN B over χ, a variable X ∈ χ, and a value x ∈ Val(X), compute P_B(X=x)
    • A weighted count of instantiations, with the weight being the probability
  – This problem is #P-complete

Proof involves 3-SAT
• SAT (satisfiability) decision problem
  – In propositional logic
  – Return true if the formula C1 ∧ C2 ∧ … ∧ Cm, where each clause Ci is a disjunction of literals over the n binary variables q1,…,qn, has a satisfying assignment
  – e.g., return true for the 3-SAT formula
        (q1 ∨ ¬q2 ∨ ¬q3) ∧ (¬q1 ∨ q2 ∨ ¬q3)
    since q1=q2=q3=true is a satisfying assignment
  – Since we need to check n binary variables, there are 2^n assignments

Complexity Classes
• A decision problem is
  – in P if there is an algorithm that decides it in polynomial time
  – in NP if a guessed solution can be verified in polynomial time
• In general P ⊆ NP; whether the converse holds is the biggest open problem in computational complexity
• The hardest problems in NP are called NP-complete
  – If a polynomial-time solution exists for any one of them, every problem in NP can be solved in polynomial time
• 3-SAT belongs to NP and is NP-hard

Proof that BN-Pr-DP is NP-complete
• Membership in NP:
  – Guess an assignment ξ to the network variables; check whether X=x in ξ and whether P(ξ) > 0
  – Some such guess succeeds iff P(X=x) > 0
  – The check is done in linear time
• NP-hardness:
  – An answer for instances of BN-Pr-DP can be used to answer an NP-hard problem
  – Shown by a reduction from the 3-SAT problem
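The linear-time check and the exponential summation above are easy to see concretely. Below is a minimal Python sketch (the toy network and all function names are illustrative, not from the slides): prob_of_assignment evaluates P(ξ) in time linear in the number of variables, and marginal answers P(Y|E=e) by the brute-force summation of (1) over all 2^n assignments — exactly the naive approach the complexity analysis warns about.

```python
from itertools import product

# A toy Bayesian network over binary variables:
# name -> (parent list, CPT mapping parent values -> P(var=1 | parents)).
# The network and the names are illustrative, not from the slides.
bn = {
    "A": ([], {(): 0.6}),
    "B": (["A"], {(0,): 0.1, (1,): 0.7}),
    "C": (["A", "B"], {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}),
}

def prob_of_assignment(bn, xi):
    """P(xi) for a full assignment xi: one CPT lookup per variable,
    so linear time in the number of variables (the NP membership check)."""
    p = 1.0
    for var, (parents, cpt) in bn.items():
        p1 = cpt[tuple(xi[pa] for pa in parents)]
        p *= p1 if xi[var] == 1 else 1.0 - p1
    return p

def marginal(bn, evidence):
    """Brute-force P(evidence): sum P(xi) over all 2^n assignments that
    agree with the evidence -- the exponential blow-up in the slides."""
    names = list(bn)
    total = 0.0
    for values in product([0, 1], repeat=len(names)):
        xi = dict(zip(names, values))
        if all(xi[v] == val for v, val in evidence.items()):
            total += prob_of_assignment(bn, xi)
    return total

# Conditional query P(Y | E=e) = P(Y, e) / P(e), as on the query slide:
print(marginal(bn, {"C": 1, "A": 1}) / marginal(bn, {"A": 1}))
```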
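The "a guess can be verified in polynomial time" property that puts 3-SAT in NP is equally direct in code. A minimal sketch, assuming a hypothetical clause-list encoding in which each literal is a (variable, negated) pair:

```python
# The example formula from the slides: (q1 v ~q2 v ~q3) ^ (~q1 v q2 v ~q3).
# Each clause is a list of (variable, negated) literals; this encoding is
# an illustrative choice, not from the slides.
formula = [
    [("q1", False), ("q2", True), ("q3", True)],
    [("q1", True), ("q2", False), ("q3", True)],
]

def satisfies(formula, assignment):
    """Check a guessed assignment in time linear in the formula size:
    every clause must contain at least one satisfied literal."""
    return all(
        any(assignment[var] != negated for var, negated in clause)
        for clause in formula
    )

# q1 = q2 = q3 = true is a satisfying assignment, as the slide notes.
print(satisfies(formula, {"q1": True, "q2": True, "q3": True}))  # True
```

Deciding satisfiability, by contrast, has no known shortcut past the 2^n candidate assignments, which is what the reduction below exploits.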
Reduction of 3-SAT to BN inference
• Given a 3-SAT formula φ, create a BN Bφ with a variable X such that φ is satisfiable iff P_Bφ(X=x1) > 0
• If the BN could be solved in polynomial time, we could also solve 3-SAT in polynomial time

[Figure: the network Bφ — root variables Q1, Q2, Q3, Q4, …, Qn feed clause variables C1, C2, C3, …, Cm, which are combined through intermediate variables A1, A2, …, Am−2 into the output variable X]

Inference Algorithms
• Worst case is exponential
• Two types of inference algorithms
  – Exact
    • Variable elimination
      – Exploits the fact that some sub-expressions depend on only a small number of variables
    • Clique trees
  – Approximate
    • Optimization-based
      – Propagation with approximate messages
      – Variational methods (analytical approximations)
    • Particle-based (sampling)

Three Simple Inference Cases
1. Bayes theorem as inference
2. Inference on a chain
3. Inference on a tree

1. Bayes Theorem as Inference
• Joint distribution p(x,y) over two variables x and y
  – Factors as p(x,y) = p(x)p(y|x)
  – Represented as the directed graph in (a)
• We are given the CPDs p(x) and p(y|x)
• If we observe the value of y, as in (b)
  – We can view the marginal p(x) as a prior over the latent variable x
• Analogy to a 2-class classifier
  – x = 0,1 and y is continuous
  – We wish to infer the posterior distribution p(x|y)

Inferring the Posterior Using Bayes
• Using the sum and product rules, we can evaluate the marginal
        p(y) = ∑_x' p(y|x') p(x')
  – This requires evaluating a summation
• The marginal is then used in Bayes' rule to calculate
        p(x|y) = p(y|x) p(x) / p(y)
• Observations
  – The joint is now expressed as p(x,y) = p(y)p(x|y)
    • Shown in (c)
  – Thus, knowing the value of y, we know the distribution of x

2. Inference on a Chain
• Graphs of this form are known as Markov chains
  – Example: N = 365 days and x is the weather (cloudy, rainy, snow, …)
• The analysis is more complex than in the previous case
• In this case the directed and undirected versions of the chain express exactly the same joint distribution
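The chain analysis (which continues beyond this excerpt) leads to a sequential message-passing computation of marginals. As a preview, here is a minimal sketch under assumptions not in the slides: a chain of N discrete variables with K states each, random pairwise potentials, and illustrative names throughout. The point is the cost — O(N·K²) rather than the O(K^N) of naive summation over the joint.

```python
import numpy as np

# Chain of N discrete variables, each with K states, joined by pairwise
# potentials psi[i] between x_i and x_{i+1}; psi[i][a, b] is the potential
# for (x_i = a, x_{i+1} = b). Random values purely for illustration.
N, K = 5, 3
rng = np.random.default_rng(0)
psi = [rng.random((K, K)) + 0.1 for _ in range(N - 1)]

def chain_marginal(psi, n):
    """Marginal p(x_n): pass a forward message from the left end and a
    backward message from the right end, multiply, and normalize."""
    k = psi[0].shape[0]
    alpha = np.ones(k)                        # forward message into node 0
    for i in range(n):                        # sums out x_0 .. x_{n-1}
        alpha = psi[i].T @ alpha
    beta = np.ones(k)                         # backward message into node N-1
    for i in range(len(psi) - 1, n - 1, -1):  # sums out x_{N-1} .. x_{n+1}
        beta = psi[i] @ beta
    p = alpha * beta                          # unnormalized marginal
    return p / p.sum()                        # normalization absorbs 1/Z

print(chain_marginal(psi, 2))
```

Each matrix-vector product sums out one variable locally, which is what keeps the cost linear in the chain length.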