Machine Learning 1010 701 15701 15 781 Fall 2006 Graphical Models Eric Xing Visit to Asia X1 Tuberculosis X3 XRay Result X7 Smoking Lung Cancer Tuberculosis or Cancer X2 Bronchitis X4 X5 Lecture 12 October 24 2006 X6 Dyspnea Reading Chap 8 C B book X8 Eric Xing 1 What is a graphical model example from medical diagnostics z A possible world for a patient with lung problem Visit to Asia X1 Tuberculosis X3 Smoking Lung Cancer Tuberculosis or Cancer XRay Result Eric Xing X7 X2 Bronchitis X4 X5 X6 Dyspnea X8 2 1 Recap of Basic Prob Concepts z Representation what is the joint probability dist on multiple variables P X 1 X 2 X 3 X 4 X 5 X 6 X 7 X 8 z z z How many state configurations in total 28 z Are they all needed to be represented z Do we get any scientific medical insight Learning where do we get all this probabilities z Maximal likelihood estimation but how many data do we need z Where do we put domain knowledge in terms of plausible relationships between variables and plausible values of the probabilities Inference If not all variables are observable how to compute the conditional distribution of latent variables given evidence Eric Xing 3 Dependencies among variables Visit to Asia Smoking X1 X2 Patient Information Tuberculosis X3 Lung Cancer Bronchitis X4 X5 Medical Difficulties Tuberculosis or Cancer XRay Result X7 X6 Dyspnea X8 Diagnostic Tests Eric Xing 4 2 Probabilistic Graphical Models z Represent dependency structure with a graph z Node random variable Edges encode dependencies z Directed and undirected versions z z z Why is this useful z z z z Absence of edge conditional independence A language for communication A language for computation A language for development Origins z z Wright 1920 s Independently developed by Spiegelhalter and Lauritzen in statistics and Pearl in computer science in the late 1980 s Eric Xing 5 Probabilistic Graphical Models con d If Xi s are conditionally independent as described by a PGM the joint can be factored to a product of simpler terms e g Visit to Asia X1 Tuberculosis X3 Smoking Lung Cancer Tuberculosis or Cancer XRay Result X7 X2 Bronchitis X4 X5 P X1 P X2 P X3 X1 P X4 X2 P X5 X2 P X6 X3 X4 P X7 X6 P X8 X5 X6 X6 Dyspnea P X1 X2 X3 X4 X5 X6 X7 X8 X8 Why we may favor a PGM Representation cost how many probability statements are needed 2 2 4 4 4 8 4 8 36 an 8 fold reduction from 28 Algorithms for systematic and efficient inference learning computation Exploring the graph structure and probabilistic e g Bayesian Markovian semantics Incorporation of domain knowledge and causal logical structures Eric Xing 6 3 Two types of GMs z Directed edges give causality relationships Bayesian Network or Directed Graphical Model z Undirected edges simply give physical or symmetric correlations between variables Markov Random Field or Undirected Graphical model Eric Xing 7 Bayesian Network Factorization Theorem Visit to Asia X1 Tuberculosis X3 Smoking Lung Cancer Tuberculosis or Cancer XRay Result z X7 X2 Bronchitis X4 P X1 X2 X3 X4 X5 X6 X7 X8 X5 P X1 P X2 P X3 X1 P X4 X2 P X5 X2 P X6 X3 X4 P X7 X6 P X8 X5 X6 X6 Dyspnea X8 Theorem Given a DAG The most general form of the probability distribution that is consistent with the graph factors according to node given its parents d P X P X i X i i 1 where X is the set of parents of xi d is the number of nodes variables in the graph i Eric Xing 8 4 Bayesian Network Conditional Independence Semantics Structure DAG Meaning a node is conditionally independent of every other node in the network outside its Markov blanket Ancestor Y1 Parent Y2 X Local conditional distributions CPD and the DAG completely determine the joint dist Child Children s co parent Give causality relationships and facilitate a generative process Descendent Eric Xing 9 Local Structures Independencies z B Common parent z Fixing B decouples A and C given the level of gene B the levels of A and C are independent z Cascade z Knowing B decouples A and C C A A B C given the level of gene B the level gene A provides no extra prediction value for the level of gene C z V structure z Knowing C couples A and B A because A can explain away B w r t C B C If A correlates to C then chance for B to also correlate to B will decrease z The language is compact the concepts are rich Eric Xing 10 5 A simple justification B A C Eric Xing 11 Graph separation criterion z D separation criterion for Bayesian networks D for Directed edges Definition variables x and y are D separated conditionally independent given z if they are separated in the moralized ancestral graph z Example Eric Xing 12 6 Global Markov properties of DAGs z X is d separated directed separated from Z given Y if we can t send a ball from any node in X to any node in Z using the Bayesball algorithm illustrated bellow and plus some boundary conditions Defn I G all independence properties that correspond to dseparation I G X Z Y dsep G X Z Y D separation is sound and complete Eric Xing 13 Example x4 z Complete the I G of this graph x1 x3 x2 Eric Xing 14 7 Towards quantitative specification of probability distribution z Separation properties in the graph imply independence properties about the associated variables z For the graph to be useful any conditional independence properties we can derive from the graph should hold for the probability distribution that the graph represents z The Equivalence Theorem For a graph G Let D1 denote the family of all distributions that satisfy I G Let D2 denote the family of all distributions that factor according to G Then D1 D2 Eric Xing 15 Example B A p A B C C Eric Xing 16 8 Example con d z Evolution ancestor T years Qh Qm AG G A AC A C Tree Model Eric Xing 17 Example con d z Speech recognition Y1 Y2 Y3 YT XA1 A2 X XA3 XAT Hidden Markov Model Eric Xing 18 9 Example con d z Genetic Pedigree A0 B0 A1 B1 Ag Bg F0 Fg M 0 F1 Sg M 1 C g C 0 C 1 Eric Xing 19 Conditional probability tables CPTs a0 a1 0 75 b0 0 33 0 25 b1 0 67 A B C D Eric Xing P a b c d P a P b P c a b P d c a0b0 a0b1 a1b0 a1b1 c0 0 45 1 0 9 0 7 c1 0 55 0 0 1 0 3 c0 c1 d0 0 3 0 5 d1 07 0 5 20 10 Conditional probability density func CPDs P a b c d P a P b P c a b P d c B …
View Full Document