1Variable EliminationGraphical Models – 10708Carlos GuestrinCarnegie Mellon UniversityOctober 15th, 2008Readings:K&F: 8.1, 8.2, 8.3, 8.4 10-708 – Carlos Guestrin 2006-200810-708 –Carlos Guestrin 2006-20082Inference in BNs hopeless? In general, yes! Even approximate! In practice Exploit structure Many effective approximation algorithms (some with guarantees) For now, we’ll talk about exact inference Approximate inference later this semester10-708 –Carlos Guestrin 2006-20083General probabilistic inference Query: Using def. of cond. prob.: Normalization:FluAllergySinusHeadacheNose10-708 –Carlos Guestrin 2006-20084Probabilistic inference exampleFluAllergySinusHeadacheNose=tInference seems exponential in number of variables!10-708 –Carlos Guestrin 2006-20085FluAllergySinusHeadacheNose=tFast probabilistic inference example – Variable elimination(Potential for) Exponential reduction in computation!10-708 –Carlos Guestrin 2006-20086Understanding variable elimination –Exploiting distributivityFlu Sinus Nose=t10-708 –Carlos Guestrin 2006-20087Understanding variable elimination –Order can make a HUGE differenceFluAllergySinusHeadacheNose=t10-708 –Carlos Guestrin 2006-20088Understanding variable elimination –Intermediate resultsFluAllergySinusHeadacheNose=tIntermediate results are probability distributions10-708 –Carlos Guestrin 2006-20089Understanding variable elimination –Another examplePharmacySinusHeadacheNose=t10-708 –Carlos Guestrin 2006-200810Pruning irrelevant variablesFluAllergySinusHeadacheNose=tPrune all non-ancestors of query variablesMore generally: Prune all nodes not on active trail between evidence and query vars10-708 –Carlos Guestrin 2006-200811Variable elimination algorithm Given a BN and a query P(X|e) ∝ P(X,e) Instantiate evidence e Prune non-active vars for {X,e} Choose an ordering on variables, e.g., X1, …, Xn Initial factors {f1,…,fn}: fi= P(Xi|PaXi) (CPT for Xi) For i = 1 to n, If Xi∉{X,E} Collect factors f1,…,fkthat include Xi Generate a new factor by eliminating Xifrom these factors Variable Xihas been eliminated! Normalize P(X,e) to obtain P(X|e)IMPORTANT!!!10-708 –Carlos Guestrin 2006-200812Operations on factorsFluAllergySinusHeadacheNose=tMultiplication:10-708 –Carlos Guestrin 2006-200813Operations on factorsFluAllergySinusHeadacheNose=tMarginalization:10-708 –Carlos Guestrin 2006-200814Complexity of VE – First analysis Number of multiplications: Number of additions:10-708 –Carlos Guestrin 2006-200815Complexity of variable elimination –(Poly)-tree graphsVariable elimination order: Start from “leaves” inwards: • Start from skeleton!• Choose a “root”, any node• Find topological order for root• Eliminate variables in reverse orderLinear in CPT sizes!!! (versus exponential)10-708 –Carlos Guestrin 2006-200816What you need to know about inference thus far Types of queries probabilistic inference most probable explanation (MPE) maximum a posteriori (MAP) MPE and MAP are truly different (don’t give the same answer) Hardness of inference Exact and approximate inference are NP-hard MPE is NP-complete MAP is much harder (NPPP-complete) Variable elimination algorithm Eliminate a variable: Combine factors that include this var into single factor Marginalize var from new factor Efficient algorithm (“only” exponential in induced-width, not number of variables) If you hear: “Exact inference only efficient in tree graphical models” You say: “No!!! Any graph with low induced width” And then you say: “And even some with very large induced-width” (next week with context-specific independence)Elimination order is important! NP-complete problem Many good heuristics10-708 –Carlos Guestrin 2006-200817Announcements Recitation tomorrow Be there!! Homework 3 out later todayWhat’s next Thus far: Variable elimination (Often) Efficient algorithm for inference in graphical models Next: Understanding complexity of variable elimination Will lead to cool junction tree algorithm later10-708 –Carlos Guestrin 2006-20081810-708 –Carlos Guestrin 2006-200819Complexity of variable elimination –Graphs with loopsConnect nodes that appear together in an initial factorDifficultySATGradeHappyJobCoherenceLetterIntelligenceMoralize graph:Connect parents into a clique and remove edge directions10-708 –Carlos Guestrin 2006-200820Eliminating a node – Fill edgesEliminate variableadd Fill Edges:Connect neighborsDifficultySATGradeHappyJobCoherenceLetterIntelligence10-708 –Carlos Guestrin 2006-200821Induced graphElimination order:{C,D,S,I,L,H,J,G}DifficultySATGradeHappyJobCoherenceLetterIntelligenceThe induced graph IFYfor elimination order Yhas an edge Xi–Xjif Xiand Xjappear togetherin a factor generated by VE for elimination order Yon factors F 10-708 –Carlos Guestrin 2006-200822Different elimination order can lead to different induced graphElimination order:{G,C,D,S,I,L,H,J}DifficultySATGradeHappyJobCoherenceLetterIntelligence10-708 –Carlos Guestrin 2006-200823Induced graph and complexity of VEDifficultySATGradeHappyJobCoherenceLetterIntelligence Structure of induced graph encodes complexity of VE!!! Theorem: Every factor generated by VE subset of a maximal clique in IFY For every maximal clique in IFYcorresponds to a factor generated by VE Induced width (or treewidth) Size of largest clique in IFYminus 1 Minimal induced width – induced width of best order ≺Read complexity from cliques in induced graphElimination order:{C,D,I,S,L,H,J,G}10-708 –Carlos Guestrin 2006-200824Example: Large induced-width with small number of parentsCompact representation ⇒ Easy inference /10-708 –Carlos Guestrin 2006-200825Finding optimal elimination orderDifficultySATGradeHappyJobCoherenceLetterIntelligence Theorem: Finding best elimination order is NP-complete: Decision problem: Given a graph, determine if there exists an elimination order that achieves induced width ≤ K Interpretation: Hardness of finding elimination order in addition to hardness of inference Actually, can find elimination order in time exponential in size of largest clique – same complexity as inferenceElimination order:{C,D,I,S,L,H,J,G}10-708 –Carlos Guestrin
View Full Document