Probabilistic Graphical Models 10-708
Undirected Graphical Models
Eric Xing
Lecture 11, Oct 17, 2005
Reading: MJ Chap. 2, 4, and KF Chap. 5

Review: independence properties of DAGs
- Defn: let $I_l(G)$ be the set of local independence properties encoded by DAG $G$, namely $\{X_i \perp \mathrm{NonDescendants}(X_i) \mid \mathrm{Parents}(X_i)\}$.
- Defn: a DAG $G$ is an I-map (independence map) of $P$ if $I_l(G) \subseteq I(P)$.
- A fully connected DAG $G$ is an I-map for any distribution, since $I_l(G) = \emptyset \subseteq I(P)$ for any $P$.
- Defn: a DAG $G$ is a minimal I-map for $P$ if it is an I-map for $P$, and if the removal of even a single edge from $G$ renders it not an I-map.
- A distribution may have several minimal I-maps, each corresponding to a specific node ordering.

Global Markov properties of DAGs
- $X$ is d-separated (directed-separated) from $Z$ given $Y$ if we can't send a ball from any node in $X$ to any node in $Z$ using the "Bayes-ball" algorithm.
[Figure: Bayes-ball algorithm illustration]
- Defn: $I(G)$ = all independence properties that correspond to d-separation: $I(G) = \{X \perp Z \mid Y : \mathrm{dsep}_G(X; Z \mid Y)\}$.
- D-separation is sound and complete (Chap. 3, Koller & Friedman).

P-maps
- Defn: a DAG $G$ is a perfect map (P-map) for a distribution $P$ if $I(P) = I(G)$.
- Thm: not every distribution has a perfect map as a DAG.
- Pf by counterexample: suppose we have a model where $A \perp C \mid \{B, D\}$ and $B \perp D \mid \{A, C\}$. This cannot be represented by any Bayes net.
- e.g., BN1 wrongly says $B \perp D \mid A$; BN2 wrongly says $B \perp D$.
[Figure: three graphs over nodes A, B, C, D, labeled BN1, BN2, and MRF]

Undirected graphical models
[Figure: undirected graph over nodes X1, X2, X3, X4, X5]
- Pairwise (non-causal) relationships.
- Can write down the model and score specific configurations of the graph, but there is no explicit way to generate samples.
- Contingency constraints on node configurations.

Canonical examples
- The grid model
  - Naturally arises in image processing, lattice physics, etc.
  - Each node may represent a single "pixel" or an atom.
  - The states of adjacent or nearby nodes are "coupled" due to pattern continuity, electromagnetic force, etc.
  - The most likely joint configurations usually correspond to a "low-energy" state.

Social networks
- Ignoring the arrows, this is a relational network among people.
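Returning to the definition of $I_l(G)$ in the review slide above, the local independence statements of a small DAG can be enumerated mechanically. The following is a minimal sketch, not part of the lecture: the function names `descendants` and `local_independencies`, the adjacency-dict representation, and both example graphs are assumptions chosen for illustration. Parents are removed from the non-descendant set because independence from one's own parents given the parents is trivial.

```python
def descendants(dag, node):
    """Return all nodes reachable from `node` by following directed edges."""
    seen = set()
    stack = list(dag[node])
    while stack:
        child = stack.pop()
        if child not in seen:
            seen.add(child)
            stack.extend(dag[child])
    return seen


def local_independencies(dag):
    """Map each node X to (NonDescendants(X) minus Parents(X), Parents(X))."""
    parents = {n: set() for n in dag}
    for n, children in dag.items():
        for c in children:
            parents[c].add(n)
    result = {}
    for n in dag:
        non_desc = set(dag) - descendants(dag, n) - {n}
        # Drop the parents themselves: those statements are trivially true.
        result[n] = (non_desc - parents[n], parents[n])
    return result


# Hypothetical DAG (an assumption for illustration): A -> B, A -> D, B -> C, D -> C.
dag = {"A": ["B", "D"], "B": ["C"], "D": ["C"], "C": []}
il = local_independencies(dag)
for node in sorted(il):
    indep, given = il[node]
    if indep:
        print(f"{node} _|_ {sorted(indep)} | {sorted(given)}")

# Hypothetical fully connected DAG over three nodes: every statement is empty.
full = {"A": ["B", "C"], "B": ["C"], "C": []}
full_il = local_independencies(full)
```

For the fully connected graph every node's independent set comes out empty, which matches the remark that $I_l(G) = \emptyset$ for a fully connected DAG.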
Protein interaction networks

Modeling Go

Information retrieval
- topic, text, image

Semantics of Undirected Graphs
- Let $H$ be an undirected graph.
- $B$ separates $A$ and $C$ if every path from a node in $A$ to a node in $C$ passes through a node in $B$: $\mathrm{sep}_H(A; C \mid B)$.
- A probability distribution satisfies the global Markov property if, for any disjoint $A$, $B$, $C$ such that $B$ separates $A$ and $C$, $A$ is independent of $C$ given $B$: $I(H) = \{A \perp C \mid B : \mathrm{sep}_H(A; C \mid B)\}$.

Undirected Graphical Models
- Defn: an undirected graphical model represents a distribution $P(X_1, \ldots, X_n)$ defined by an undirected graph $H$ and a set of positive potential functions $\psi_c$ associated with cliques of $H$, s.t.
$$P(x_1, \ldots, x_n) = \frac{1}{Z} \prod_{c \in C} \psi_c(x_c)$$
where $Z$ is known as the partition function:
$$Z = \sum_{x_1, \ldots, x_n} \prod_{c \in C} \psi_c(x_c)$$
- Also known as Markov Random Fields or Markov networks.
- The potential function can be understood as a contingency function of its arguments, assigning a "pre-probabilistic" score to their joint configuration.

Cliques
- For $G = \{V, E\}$, a complete subgraph (clique) is a subgraph $G' = \{V' \subseteq V, E' \subseteq E\}$ such that the nodes in $V'$ are fully interconnected.
- A (maximal) clique is a complete subgraph s.t. any superset $V'' \supset V'$ is not complete.
- A sub-clique is a not-necessarily-maximal clique.
[Figure: undirected graph over nodes A, B, C, D]
- Example:
  - max-cliques = $\{A, B, D\}$, $\{B, C, D\}$
  - sub-cliques = $\{A, B\}$, $\{C, D\}$, ... (all edges and singletons)

Example UGM using max cliques
[Figure: undirected graph over nodes A, B, C, D]
$$P(x_1, x_2, x_3, x_4) = \frac{1}{Z} \psi_c(x_{124}) \, \psi_c(x_{234})$$
$$Z = \sum_{x_1, x_2, x_3, x_4} \psi_c(x_{124}) \, \psi_c(x_{234})$$
- For discrete nodes, we can represent $P(X_{1:4})$ as two 3D tables instead of one 4D table.

Example UGM using subcliques
[Figure: undirected graph over nodes A, B, C, D]
$$P(x_1, x_2, x_3, x_4) = \frac{1}{Z} \prod_{ij} \psi_{ij}(x_{ij}) = \frac{1}{Z} \psi_{12}(x_{12}) \, \psi_{14}(x_{14}) \, \psi_{23}(x_{23}) \, \psi_{24}(x_{24}) \, \psi_{34}(x_{34})$$
$$Z = \sum_{x_1, x_2, x_3, x_4} \prod_{ij} \psi_{ij}(x_{ij})$$
- For discrete nodes, we can represent $P(X_{1:4})$ as five 2D tables instead of one 4D table.

Interpretation of Clique Potentials
[Figure: chain X - Y - Z]
- The model implies $X \perp Z \mid Y$. This independence statement implies (by definition) that the joint must factorize as
$$p(x, y, z) = p(y) \, p(x \mid y) \, p(z \mid y)$$
- We can write this as $p(x, y, z) = p(x, y) \, p(z \mid y)$ or as $p(x, y, z) = p(x \mid y) \, p(z, y)$, but we
  - cannot have all potentials be marginals
  - cannot have all potentials be conditionals
- The positive clique potentials can only be thought of as general "compatibility", "goodness", or "happiness" functions over their variables, but not as probability distributions.

Exponential Form
- Constraining clique potentials to be positive could be inconvenient (e.g., the interactions between a pair of atoms can be either attractive or repulsive). We represent a clique potential $\psi_c(x_c)$ in an unconstrained form using a real-valued "energy" function $\phi_c(x_c)$:
$$\psi_c(x_c) = \exp\{-\phi_c(x_c)\}$$
For convenience, we will call $\phi_c(x_c)$ a potential when no confusion arises from the context.
- This gives the joint a nice additive structure:
$$p(x) = \frac{1}{Z} \exp\Big\{-\sum_{c \in C} \phi_c(x_c)\Big\} = \frac{1}{Z} \exp\{-H(x)\}$$
where the sum in the exponent is called the "free energy":
$$H(x) = \sum_{c \in C} \phi_c(x_c)$$
- In physics, this is called the "Boltzmann distribution".
- In statistics, this is called a log-linear model.

Example: Boltzmann machines
[Figure: fully connected graph over nodes 1, 2, 3, 4]
- A fully connected graph with pairwise (edge) potentials on binary-valued nodes (for $x_i \in \{-1, +1\}$ or $x_i \in \{0, 1\}$) is called a Boltzmann machine:
$$P(x_1, x_2, x_3, x_4) = \frac{1}{Z} \exp\Big\{-\sum_{ij} \phi_{ij}(x_i, x_j)\Big\} = \frac{1}{Z} \exp\Big\{-\Big(\sum_{ij} \theta_{ij} x_i x_j + \sum_i \alpha_i x_i + C\Big)\Big\}$$
- Hence the overall energy function has the form
$$H(x) = \sum_{ij} (x_i - \mu) \, \Theta_{ij} \, (x_j - \mu) = (x - \mu)^T \Theta \, (x - \mu)$$

Example: Ising (spin-glass) models
- Nodes are arranged in a regular topology (often a regular packing grid) and connected only to their geometric neighbors.
- Same as a sparse Boltzmann machine, where $\theta_{ij} \neq 0$ iff $i$, $j$ are neighbors.
  - e.g., nodes are pixels, and the potential function encourages nearby pixels to have similar intensities.
- Potts model: multi-state Ising model.

Example: multivariate Gaussian Distribution
- A Gaussian distribution can be represented by a fully connected graph with pairwise (edge) potentials over continuous nodes.
- The overall energy has the form $H(x)$ …
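To make the Boltzmann machine slide above concrete, the following minimal sketch enumerates all configurations of four binary nodes, computes the partition function by brute force, and normalizes. It assumes the convention $p(x) \propto \exp\{-H(x)\}$ from the exponential-form slide and drops the constant $C$, which cancels on normalization. The function names and all parameter values (`theta`, `alpha`) are hypothetical, chosen only for illustration. Brute-force enumeration is only feasible for very small graphs.

```python
import itertools
import math


def energy(x, theta, alpha):
    """H(x) = sum_{i<j} theta[i][j] * x_i * x_j + sum_i alpha[i] * x_i."""
    n = len(x)
    pair = sum(theta[i][j] * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))
    unary = sum(alpha[i] * x[i] for i in range(n))
    return pair + unary


def boltzmann_distribution(theta, alpha, states=(-1, 1)):
    """Return ({configuration: probability}, Z) by exhaustive enumeration."""
    n = len(alpha)
    configs = list(itertools.product(states, repeat=n))
    # Unnormalized weights exp{-H(x)}; their sum is the partition function Z.
    weights = [math.exp(-energy(x, theta, alpha)) for x in configs]
    z = sum(weights)
    return {x: w / z for x, w in zip(configs, weights)}, z


# Hypothetical parameters (assumptions): under p(x) ~ exp{-H(x)}, negative
# couplings lower the energy of configurations whose nodes agree.
n = 4
theta = [[-1.0 if i < j else 0.0 for j in range(n)] for i in range(n)]
alpha = [0.0] * n

probs, z = boltzmann_distribution(theta, alpha)
print(f"Z = {z:.3f}, total probability = {sum(probs.values()):.6f}")
```

With these illustrative values the two fully aligned configurations have the lowest energy and therefore the highest probability, which is the "low-energy state" behavior described for coupled nodes.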