CS 59000 Statistical Machine Learning
Lecture 21
Yuan (Alan) Qi

Outline
• Review of D-separation, Markov random fields, Markov blankets
• Inference on chain
• Inference on chains, factor graphs

D-separation
• A, B, and C are non-intersecting subsets of nodes in a directed graph.
• A path from A to B is blocked if it contains a node such that either
  a) the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or
  b) the arrows meet head-to-head at the node, and neither the node nor any of its descendants is in the set C.
• If all paths from A to B are blocked, A is said to be d-separated from B by C.
• If A is d-separated from B by C, the joint distribution over all variables in the graph satisfies $A \perp\!\!\!\perp B \mid C$.

D-separation: Example

D-separation: I.I.D. Data

Bayesian Curve Fitting Revisited
D-separation implies that information from the training data is summarized in w.

The Markov Blanket
Factors independent of $x_i$ cancel between numerator and denominator.

Cliques and Maximal Cliques
Figure labels: Clique, Maximal Clique

Joint Distribution
$p(\mathbf{x}) = \frac{1}{Z} \prod_C \psi_C(\mathbf{x}_C)$
where $\psi_C(\mathbf{x}_C)$ is the potential over clique C and $Z = \sum_{\mathbf{x}} \prod_C \psi_C(\mathbf{x}_C)$ is the normalization coefficient. Note: M K-state variables give $K^M$ terms in Z.

Energies and the Boltzmann distribution

Illustration: Image De-Noising (1)
Figure panels: Original Image, Noisy Image

Illustration: Image De-Noising (2)

Illustration: Image De-Noising (3)
Figure panels: Noisy Image, Restored Image (ICM)

Converting Directed to Undirected Graphs (1)

Converting Directed to Undirected Graphs (2)
Additional links: “marrying parents”, i.e., moralization

Directed vs. Undirected Graphs (2)

Inference in Graphical Models

Inference on a Chain
Evaluating a marginal by summing the joint distribution directly takes time that increases exponentially with N (on the order of $K^N$ terms for K-state variables).

Inference on a Chain
To compute local marginals:
• Compute and store all forward messages, $\mu_\alpha(x_n)$.
• Compute and store all backward messages, $\mu_\beta(x_n)$.
• Compute Z at any node $x_m$.
• Compute $p(x_n) = \frac{1}{Z} \mu_\alpha(x_n) \mu_\beta(x_n)$ for all variables required.
Message passing reduces the cost from exponential in N (direct summation) to $O(N K^2)$ for K-state variables.

Trees
Figure panels: Undirected Tree, Directed Tree, Polytree

Factor Graphs

Factor Graphs from Directed Graphs

Factor Graphs from Undirected Graphs

The Sum-Product Algorithm (1)
Objective:
i. to obtain an efficient, exact inference algorithm for finding marginals;
ii. in situations where several marginals are required, to allow computations to be shared efficiently.
Key idea: Distributive Law

The Sum-Product Algorithm (2)

The Sum-Product Algorithm (3)
Why is this true?

The Sum-Product Algorithm (4)

The Sum-Product Algorithm (5)

The Sum-Product Algorithm (6)

The Sum-Product Algorithm (7)
Initialization

The Sum-Product Algorithm (8)
To compute local marginals:
• Pick an arbitrary node as root.
• Compute and propagate messages from the leaf nodes to the root, storing received messages at every node.
• Compute and propagate messages from the root to the leaf nodes, storing received messages at every node.
• Compute the product of received messages at each node for which the marginal is required, and normalize if
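
The local-marginal recipe from the "Inference on a Chain" slide (forward messages, backward messages, Z at any node, then the normalized product) can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code. It assumes an undirected chain with pairwise potentials only, stored as nested lists, and the names `chain_marginals`, `brute_force_marginals`, and the toy `potentials` values are all hypothetical. The brute-force routine is included only to check the result against direct summation over all $K^N$ configurations.

```python
from itertools import product


def chain_marginals(potentials, K):
    """Marginals of a chain MRF by forward/backward message passing.

    Assumed layout: potentials[n][i][j] = psi(x_n = i, x_{n+1} = j),
    so the chain has N = len(potentials) + 1 nodes with K states each.
    """
    N = len(potentials) + 1

    # Forward messages: alpha[n][j] = sum_i alpha[n-1][i] * psi_{n-1}(i, j)
    alpha = [[1.0] * K for _ in range(N)]
    for n in range(1, N):
        psi = potentials[n - 1]
        alpha[n] = [sum(alpha[n - 1][i] * psi[i][j] for i in range(K))
                    for j in range(K)]

    # Backward messages: beta[n][i] = sum_j psi_n(i, j) * beta[n+1][j]
    beta = [[1.0] * K for _ in range(N)]
    for n in range(N - 2, -1, -1):
        psi = potentials[n]
        beta[n] = [sum(psi[i][j] * beta[n + 1][j] for j in range(K))
                   for i in range(K)]

    # Z can be computed at any node; node 0 is used here.
    Z = sum(a * b for a, b in zip(alpha[0], beta[0]))

    # p(x_n) = alpha(x_n) * beta(x_n) / Z
    marginals = [[a * b / Z for a, b in zip(alpha[n], beta[n])]
                 for n in range(N)]
    return marginals, Z


def brute_force_marginals(potentials, K):
    """Same marginals by summing over all K**N joint configurations."""
    N = len(potentials) + 1
    weights = [[0.0] * K for _ in range(N)]
    Z = 0.0
    for x in product(range(K), repeat=N):
        w = 1.0
        for n, psi in enumerate(potentials):
            w *= psi[x[n]][x[n + 1]]
        Z += w
        for n in range(N):
            weights[n][x[n]] += w
    return [[v / Z for v in row] for row in weights], Z


# Toy chain with N = 4 binary nodes (arbitrary illustrative potentials).
potentials = [
    [[1.0, 2.0], [3.0, 1.0]],
    [[2.0, 1.0], [1.0, 4.0]],
    [[1.0, 1.0], [2.0, 3.0]],
]
fast, Z_fast = chain_marginals(potentials, K=2)
slow, Z_slow = brute_force_marginals(potentials, K=2)
```

The message-passing version performs N - 1 matrix-vector products in each direction, while the brute-force check enumerates every joint configuration, which is why the latter is only practical for very small chains.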