Eric Xing

A Sketch of the Junction Tree Algorithm
- The algorithm:
  1. Moralize the graph (trivial).
  2. Triangulate the graph by elimination (good heuristics exist, but the problem is actually NP-hard).
  3. Build a clique tree (using a maximum spanning tree algorithm).
  4. Propagate probabilities --- a local message-passing protocol.
- Theorem: maximum-weight spanning tree of the clique graph ⟺ junction tree.
- Results in marginal probabilities of all cliques --- solves all queries in a single run.
- A generic exact inference algorithm for any GM.
- Complexity: exponential in the size of the maximal clique --- a good elimination order often leads to a small maximal clique, and hence a good (i.e., thin) JT.

Message-passing algorithms
- Message update: a collect phase and a distribute phase.
- The Hugin update (clique $V$, separator $S$, clique $W$):
  $\phi_S^* = \sum_{V \setminus S} \psi_V, \qquad \psi_W^* = \frac{\phi_S^*}{\phi_S}\, \psi_W$
- The Shafer-Shenoy update:
  $m_{i \to j}(S_{ij}) = \sum_{C_i \setminus S_{ij}} \psi_{C_i} \prod_{k \neq j} m_{k \to i}(S_{ik})$

The Forward Algorithm
- We want to calculate $P(\mathbf{x})$, the likelihood of $\mathbf{x}$, given the HMM.
- Sum over all possible ways of generating $\mathbf{x}$:
  $p(\mathbf{x}) = \sum_{\mathbf{y}} p(\mathbf{x}, \mathbf{y}) = \sum_{y_1} \cdots \sum_{y_T} \pi_{y_1} \prod_{t=2}^{T} a_{y_{t-1}, y_t} \prod_{t=1}^{T} p(x_t \mid y_t)$
- To avoid summing over an exponential number of paths $\mathbf{y}$, define the forward probability
  $\alpha_t^k \;\stackrel{\mathrm{def}}{=}\; P(x_1, \ldots, x_t, y_t^k = 1)$
- The recursion:
  $\alpha_t^k = p(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i\, a_{i,k}, \qquad P(\mathbf{x}) = \sum_k \alpha_T^k$

The Forward Algorithm – derivation
- Compute the forward probability, using the chain rule $P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid A, B)$:
  $\alpha_t^k = P(x_1, \ldots, x_{t-1}, x_t, y_t^k = 1)$
  $\quad = \sum_{y_{t-1}} P(x_1, \ldots, x_{t-1}, y_{t-1}, x_t, y_t^k = 1)$
  $\quad = \sum_{y_{t-1}} P(x_1, \ldots, x_{t-1}, y_{t-1})\, P(y_t^k = 1 \mid x_1, \ldots, x_{t-1}, y_{t-1})\, P(x_t \mid y_t^k = 1)$
  $\quad = P(x_t \mid y_t^k = 1) \sum_i P(x_1, \ldots, x_{t-1}, y_{t-1}^i = 1)\, P(y_t^k = 1 \mid y_{t-1}^i = 1)$
  $\quad = P(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i\, a_{i,k}$
  [Figure: one slice of the HMM, $y_{t-1} \to y_t$ with emissions $x_{t-1}, x_t$]

Recall the Elimination and Message Passing Algorithm
- Elimination ≡ message passing on a clique tree.
  [Figure: an elimination clique tree with messages $m_b, m_c, m_d, m_e, m_f, m_g, m_h$, next to the HMM chain $y_1, \ldots, y_T$ with emissions $x_1, \ldots, x_T$]
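Step 3 of the algorithm, together with the theorem that a maximum-weight spanning tree of the clique graph (edges weighted by separator size) yields a junction tree, can be sketched as follows. The clique set below is an illustrative example, not taken from the lecture:

```python
import itertools

# A sketch of step 3: connect the maximal cliques of a triangulated graph
# into a junction tree via a maximum-weight spanning tree, where the weight
# of an edge between two cliques is the size of their separator.
# The clique set below is illustrative.
cliques = [frozenset('ACE'), frozenset('ABD'), frozenset('ADE'), frozenset('DEF')]

# Candidate clique-graph edges, heaviest separators first.
edges = sorted(
    ((len(ci & cj), i, j)
     for (i, ci), (j, cj) in itertools.combinations(enumerate(cliques), 2)
     if ci & cj),
    reverse=True)

# Kruskal's algorithm with a union-find structure: greedily keep the
# heaviest edges that do not close a cycle.
parent = list(range(len(cliques)))

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]   # path halving
        i = parent[i]
    return i

tree = []
for w, i, j in edges:
    ri, rj = find(i), find(j)
    if ri != rj:
        parent[ri] = rj
        tree.append((i, j, w))
```

For these cliques the result is a star around {A, D, E}, every tree edge has a separator of size 2, and the running-intersection property holds.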
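The forward recursion derived above can be sketched numerically. The transition matrix `A`, emission matrix `B`, initial distribution `pi`, and sequence `x` below are an illustrative toy HMM, not from the lecture; a brute-force sum over all $K^T$ paths checks the recursion:

```python
import numpy as np
from itertools import product

# Illustrative toy HMM (all numbers made up):
pi = np.array([0.6, 0.4])                 # pi_k = P(y_1 = k)
A = np.array([[0.7, 0.3],                 # a_{i,k} = P(y_t = k | y_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],                 # B[k, x] = p(x_t = x | y_t = k)
              [0.3, 0.7]])

def forward(x):
    """alpha[t, k] = P(x_1, ..., x_t, y_t = k), by the forward recursion."""
    T, K = len(x), len(pi)
    alpha = np.zeros((T, K))
    alpha[0] = B[:, x[0]] * pi                       # initialization
    for t in range(1, T):
        alpha[t] = B[:, x[t]] * (alpha[t - 1] @ A)   # recursion
    return alpha

x = [0, 1, 1, 0]
alpha = forward(x)
likelihood = alpha[-1].sum()                         # termination: P(x)

# Sanity check: brute-force sum of P(x, y) over all K^T hidden paths y.
brute = sum(
    pi[y[0]] * B[y[0], x[0]]
    * np.prod([A[y[t - 1], y[t]] * B[y[t], x[t]] for t in range(1, len(x))])
    for y in product(range(2), repeat=len(x)))
assert np.isclose(likelihood, brute)
```

The dynamic program costs $O(K^2 T)$, versus $O(K^T)$ for the brute-force sum over paths.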
- Compare an elimination message, e.g. $m_e(a, c, d) = \sum_e p(e \mid c, d)\, m_g(e)\, m_f(e, a)$, with the forward recursion
  $\alpha_t^k = p(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i\, a_{i,k}, \qquad P(\mathbf{x}) = \sum_k \alpha_T^k$

The Forward Algorithm
- We can compute $\alpha_t^k$ for all $k, t$, using dynamic programming!
- Initialization:
  $\alpha_1^k = P(x_1 \mid y_1^k = 1)\, \pi_k$
  since $\alpha_1^k = P(x_1, y_1^k = 1) = P(x_1 \mid y_1^k = 1)\, P(y_1^k = 1) = P(x_1 \mid y_1^k = 1)\, \pi_k$
- Iteration:
  $\alpha_t^k = P(x_t \mid y_t^k = 1) \sum_i \alpha_{t-1}^i\, a_{i,k}$
- Termination:
  $P(\mathbf{x}) = \sum_k \alpha_T^k$

The Backward Algorithm
- We want to compute $P(y_t^k = 1 \mid \mathbf{x})$, the posterior probability distribution on the $t$-th position, given $\mathbf{x}$.
- We start by computing
  $P(y_t^k = 1, \mathbf{x}) = P(x_1, \ldots, x_t, y_t^k = 1, x_{t+1}, \ldots, x_T)$
  $\quad = P(x_1, \ldots, x_t, y_t^k = 1)\, P(x_{t+1}, \ldots, x_T \mid x_1, \ldots, x_t, y_t^k = 1)$
  $\quad = P(x_1, \ldots, x_t, y_t^k = 1)\, P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$
- Forward probability $\alpha_t^k$; backward probability
  $\beta_t^k \;\stackrel{\mathrm{def}}{=}\; P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$
- The recursion:
  $\beta_t^k = \sum_i a_{k,i}\, p(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$
  [Figure: the HMM tail, $y_t \to y_{t+1} \to \cdots \to y_T$ with emissions $x_t, \ldots, x_T$]

The Backward Algorithm – derivation
- Define the backward probability, again using the chain rule $P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid A, B)$:
  $\beta_t^k = P(x_{t+1}, \ldots, x_T \mid y_t^k = 1)$
  $\quad = \sum_{y_{t+1}} P(x_{t+1}, x_{t+2}, \ldots, x_T, y_{t+1} \mid y_t^k = 1)$
  $\quad = \sum_i P(y_{t+1}^i = 1 \mid y_t^k = 1)\, p(x_{t+1} \mid y_{t+1}^i = 1)\, P(x_{t+2}, \ldots, x_T \mid y_{t+1}^i = 1)$
  $\quad = \sum_i a_{k,i}\, p(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$

The Backward Algorithm
- We can compute $\beta_t^k$ for all $k, t$, using dynamic programming!
- Initialization:
  $\beta_T^k = 1, \quad \forall k$
- Iteration:
  $\beta_t^k = \sum_i a_{k,i}\, P(x_{t+1} \mid y_{t+1}^i = 1)\, \beta_{t+1}^i$
- Termination:
  $P(\mathbf{x}) = \sum_k \alpha_1^k\, \beta_1^k$

Shafer-Shenoy for HMMs
- Recap: the Shafer-Shenoy algorithm.
- Message from clique $i$ to clique $j$:
  $\mu_{i \to j}(S_{ij}) = \sum_{C_i \setminus S_{ij}} \psi_{C_i} \prod_{k \neq j} \mu_{k \to i}(S_{ik})$
- Clique marginal:
  $p(C_i) \propto \psi_{C_i} \prod_k \mu_{k \to i}(S_{ik})$

Message Passing for HMMs (cont.)
- A junction tree for the HMM.
- Rightward pass: this is exactly the forward algorithm!
- Leftward pass: this is exactly the backward algorithm!
  [Figure: the HMM chain $y_1, \ldots, y_T$ with emissions $x_1, \ldots, x_T$]
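The claims above (the two terminations agree on $P(\mathbf{x})$, and a Shafer-Shenoy clique marginal is the clique potential times its incoming messages) can be checked numerically. The HMM below is an illustrative toy model, not from the lecture:

```python
import numpy as np

# Illustrative toy HMM (all numbers made up):
pi = np.array([0.6, 0.4])                 # P(y_1 = k)
A = np.array([[0.7, 0.3], [0.2, 0.8]])    # a_{k,i} = P(y_{t+1} = i | y_t = k)
B = np.array([[0.9, 0.1], [0.3, 0.7]])    # B[k, x] = p(x_t = x | y_t = k)
x = [0, 1, 1, 0]
T, K = len(x), len(pi)

# Forward pass (rightward messages): alpha[t, k] = P(x_1..x_t, y_t = k).
alpha = np.zeros((T, K))
alpha[0] = pi * B[:, x[0]]
for t in range(1, T):
    alpha[t] = B[:, x[t]] * (alpha[t - 1] @ A)

# Backward pass (leftward messages): beta[t, k] = P(x_{t+1}..x_T | y_t = k).
beta = np.ones((T, K))                    # initialization: beta_T = 1
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])

# Both terminations give the same likelihood P(x).
p_x = alpha[-1].sum()
assert np.isclose(p_x, (alpha[0] * beta[0]).sum())

# Shafer-Shenoy clique marginal for the chain clique (y_t, y_{t+1}):
# psi = a_{y_t, y_{t+1}} times the three incoming messages (rightward,
# upward emission, leftward), normalized to p(y_t, y_{t+1} | x).
t = 1
clique = alpha[t][:, None] * A * (B[:, x[t + 1]] * beta[t + 1])[None, :]
clique /= clique.sum()

# Marginalizing the clique recovers the node posterior alpha*beta/P(x).
gamma = alpha * beta / p_x
assert np.allclose(clique.sum(axis=1), gamma[t])
```

This is the standard forward-backward identity: each single-node posterior and each pairwise clique marginal falls out of one rightward and one leftward sweep.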
- The junction tree has chain cliques with potentials $\psi(y_1, x_1), \psi(y_1, y_2), \psi(y_2, y_3), \ldots, \psi(y_{T-1}, y_T)$, emission cliques $\psi(y_2, x_2), \psi(y_3, x_3), \ldots, \psi(y_T, x_T)$, and separator potentials $\zeta(y_2), \zeta(y_3), \ldots, \zeta(y_T)$ and $\phi(y_1), \phi(y_2), \ldots$
- Rightward pass, with upward emission messages $\mu_{\uparrow t+1}(y_{t+1})$ sent from the cliques $\psi(y_{t+1}, x_{t+1})$:
  $\mu_{t \to t+1}(y_{t+1}) = \sum_{y_t} \psi(y_t, y_{t+1})\, \mu_{t-1 \to t}(y_t)\, \mu_{\uparrow t+1}(y_{t+1})$
  $\quad = \sum_{y_t} p(y_{t+1} \mid y_t)\, p(x_{t+1} \mid y_{t+1})\, \mu_{t-1 \to t}(y_t) = p(x_{t+1} \mid y_{t+1}) \sum_{y_t} a_{y_t, y_{t+1}}\, \mu_{t-1 \to t}(y_t)$
- This is exactly the forward algorithm!
- Leftward pass:
  $\mu_{t-1 \leftarrow t}(y_t) = \sum_{y_{t+1}} \psi(y_t, y_{t+1})\, \mu_{t \leftarrow t+1}(y_{t+1})\, \mu_{\uparrow t+1}(y_{t+1})$
  $\quad = \sum_{y_{t+1}} p(y_{t+1} \mid y_t)\, \mu_{t \leftarrow t+1}(y_{t+1})\, p(x_{t+1} \mid y_{t+1})$
- This is exactly the backward algorithm!

School of Computer Science
Statistical learning with basic graphical models
Probabilistic Graphical Models (10-708), Lecture 7, Oct 8, 2007
Eric Xing
Reading: J-Chap. 5, 6; KF-Chap. 8
[Figure: a cell-signaling network --- Receptor A, Receptor B, Kinase C, Kinase D, Kinase E, TF F, Gene G, Gene H --- and the corresponding graphical model over $X_1, \ldots, X_8$]

Learning Graphical Models
The goal: given a set of independent samples (assignments of random variables), find the best (the most likely?) Bayesian network, both DAG and CPDs, e.g.
  (B,E,A,C,R) = (T,F,F,T,F)
  (B,E,A,C,R) = (T,F,T,T,F)
  ...
  (B,E,A,C,R) = (F,T,T,T,F)
- Structural learning: recover the DAG over $E, R, B, A, C$.
- Parameter learning: recover the CPDs, e.g. a table for $P(A \mid E, B)$ with rows such as 0.9/0.1, 0.2/0.8, 0.01/0.99, 0.9/0.1 for the four configurations of $(E, B)$.

Learning Graphical Models
- Scenarios:
  - completely observed GMs: directed; undirected
  - partially or unobserved GMs: directed; undirected (an open research topic)
- Estimation principles:
  - Maximum likelihood estimation (MLE)
  - Bayesian estimation
  - Maximal conditional likelihood
  - Maximal "margin"
- We use "learning" as a name for the process of estimating the parameters, and in some cases the topology of the network, from data.

Score-based approach
- Data: $(x_1^{(1)}, \ldots, x_n^{(1)}),\ (x_1^{(2)}, \ldots, x_n^{(2)}),\ \ldots,\ (x_1^{(M)}, \ldots, x_n^{(M)})$
- Enumerate possible structures, learn parameters for each (maximum likelihood, Bayesian, conditional likelihood, margin, ...), and score each structure/parameter pair (e.g. $10^{-5}$, $10^{-15}$, ..., $10^{-3}$).

ML Parameter Est. for completely observed GMs of given structure
[Figure: a two-node model $Z \to X$]
- The data: $\{(z^{(1)}, x^{(1)}),\ (z^{(2)}, x^{(2)}),\ (z^{(3)}, x^{(3)}),\ \ldots$
$(z^{(N)}, x^{(N)})\}$

Parameter Learning
- Assume $G$ is known and fixed:
  - from expert design, or
  - from an intermediate outcome of iterative structure learning.
- Goal: estimate parameters from a dataset of $N$ independent, identically distributed (iid) training cases $D = \{x_1, \ldots, x_N\}$.
- In general, each training case $x_n = (x_{n,1}, \ldots, x_{n,M})$ is a vector of $M$ values, one per node:
  - the model can be completely observable, i.e. every element in $x_n$ is known (no missing values, no hidden variables),
  - or partially observable, i.e. $\exists i$ s.t. $x_{n,i}$ is not observed.
- In this lecture we consider learning parameters for a single node.
- Frequentist vs. Bayesian estimates.

Bayesian Parameter Estimation
- Bayesians treat the unknown parameters as a random variable.
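The frequentist-vs.-Bayesian contrast for a single binary node can be sketched as follows. The sample size, the true parameter, and the choice of a Beta prior are all illustrative assumptions, not from the lecture:

```python
import numpy as np

# Frequentist vs. Bayesian estimation for a single binary node.
# True parameter, sample size, and prior hyperparameters are illustrative.
rng = np.random.default_rng(0)
theta_true = 0.7
x = rng.random(50) < theta_true          # N = 50 iid Bernoulli draws
n1 = int(x.sum())                        # count of 1s
n0 = len(x) - n1                         # count of 0s

# MLE: the relative frequency maximizes the likelihood theta^n1 (1-theta)^n0.
theta_mle = n1 / (n0 + n1)

# Bayesian: treat theta as a random variable with a Beta(a, b) prior;
# the posterior is Beta(a + n1, b + n0), and its mean smooths the counts
# with the pseudo-counts a and b.
a, b = 1.0, 1.0                          # uniform prior on [0, 1]
theta_bayes = (a + n1) / (a + b + n0 + n1)
```

The posterior mean is a convex combination of the MLE and the prior mean $a/(a+b)$, so the Bayesian estimate shrinks the relative frequency toward the prior, with the shrinkage vanishing as $N$ grows.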