1Probabilistic Graphical Models10-708Hidden Markov models and Hidden Markov models and extensions extensions Eric Xing Eric Xing Lecture 14, Oct 31, 2005Reading: MJ-Chap. 11, 18Hidden Markov Model: from static to dynamic mixture modelsDynamic mixtureDynamic mixtureA AA AX2X3X1XTY2Y3Y1YT... ... Static mixtureStatic mixtureAX1Y1NThe sequence:The sequence:The underlying The underlying source:source:Phonemes,Phonemes,Speech signal, Speech signal, sequence of rolls, sequence of rolls, dice,dice,2Example: The Dishonest CasinoA casino has two dice:z Fair dieP(1) = P(2) = P(3) = P(5) = P(6) = 1/6z Loaded dieP(1) = P(2) = P(3) = P(5) = 1/10P(6) = 1/2Casino player switches back-&-forth between fair and loaded die once every 20 turnsGame:1. You bet $12. You roll (always with a fair die)3. Casino player rolls (maybe with fair die, maybe with loaded die)4. Highest number wins $2Puzzles regarding the dishonest casino GIVEN: A sequence of rolls by the casino player1245526462146146136136661664661636616366163616515615115146123562344QUESTIONz How likely is this sequence, given our model of how the casino works?z This is the EVALUATION problem in HMMsz What portion of the sequence was generated with the fair die, and what portion with the loaded die?z This is the DECODING question in HMMsz How “loaded” is the loaded die? How “fair” is the fair die? How often does the casino player change from fair to loaded, and back?z This is the LEARNING question in HMMs3A stochastic generative modelz Observed sequence:z Hidden sequence (a parse or segmentation):AB14 3 66 4BA A ABBDefinition (of HMM)zObservation spaceObservation spaceAlphabetic set:Euclidean space:zIndex set of hidden statesIndex set of hidden stateszTransition probabilitiesTransition probabilitiesbetween any two statesbetween any two statesorzStart probabilitiesStart probabilitieszEmission probabilitiesEmission probabilitiesassociated with each stateassociated with each stateor in general:A AA Ax2x3x1xTy2y3y1yT... ... {}Kbbb,,, L21=BdR{}M,,, L21=I,)|(,jiitjtayyp===−111().,,,,lMultinomia~)|(,,,I∈∀=−iaaayypMiiiittK1111().,,,lMultinomia~)(MypπππK211().,,,,lMultinomia~)|(,,,I∈∀=ibbbyxpKiiiittK111().,|f~)|( I∈∀⋅=iyxpiittθ14Probability of a parsez Given a sequence x = x1……xTand a parse y = y1, ……, yT,z To find how likely is the parse:(given our HMM and the sequence)p(x, y) = p(x1……xT, y1, ……, yT) (Joint probability)=p(y1) p(x1| y1)p(y2| y1) p(x2| y2) …p(yT| yT-1) p(xT| yT)= p(π1) P(y2| y1) …p(yT| yT-1) ×p(x1| y1)p(x2| y2)…p(xT| yT)= p(y1, ……, yT) p(x1……xT| y1, ……, yT)=z Marginal probability:z Posterior probability:[],,def,jtitttyyMjiijyyaa111++∏==[],defiyMiiy111∏==ππ[], anddef,ktitttxyMiKkikxy∏∏===11ηηLetTTyyyyyaa,,1211 −LπTTxyxy,,ηηL11∑∑∑∑∏∏==−==yyxx121121yy yTtTtttyyyNttyxpapp)|(),()(,πL)(/),()|( xyxxyppp=A AA Ax2x3x1xTy2y3y1yT... ... Three main questions on HMMs1.1.EvaluationEvaluationGIVEN an HMM M, and a sequence x,FIND Prob (x | M)ALGO.ForwardForward2.2.DecodingDecodingGIVEN an HMM M, and a sequence x ,FIND the sequence y of states that maximizes, e.g., P(y | x , M), or the most probable subsequence of statesALGO.ViterbiViterbi, Forward, Forward--backward backward 3.3.LearningLearningGIVEN an HMM M, with unspecified transition/emission probs.,and a sequence x,FIND parameters θ = (πi, aij, ηik) that maximize P(x | θ)ALGO.BaumBaum--Welch (EM)Welch (EM)5The Forward Algorithmz We want to calculate P(x), the likelihood of x, given the HMMz Sum over all possible ways of generating x:z To avoid summing over an exponential number of paths y, define(the forward probability)z The recursion:),,...,()(def111====kttktktyxxPy αα∑−==ikiitkttktayxp,)|(11αα∑=kkTPα)(x∑∑∑∑∏∏==−==yyxx121121yy yTtTtttyyyNttyxpapp)|(),()(,πLThe Forward Algorithm –derivationz Compute the forward probability:),,,...,( 111==−ktttktyxxxPα∑−==−−11111tykttttyyxxxP),,,,...,(),,...,,|(),...,,|(),,...,(111111111111−−−−−−===∑−ttkttttktyttyxxyxPxxyyPyxxPt)|()|(),,...,( 1111111===−−−∑−ktttktyttyxPyyPyxxPt)|(),,...,()|( 11111111=====−−−∑itktiittkttyyPyxxPyxP)|(),,...,()|( 11111111=====−−−∑itktiittkttyyPyxxPyxPkiiitkttayxP,)|(∑−==11αAA xtx1yty1... Axt-1yt-1... ... ...6The Backward Algorithmz We want to compute ,the posterior probability distribution on the t thposition, given xz We start by computingz The recursion:)|( x1=ktyPForward, αtkBackward, ),...,,,,...,(),(TtkttktxxyxxPyP1111+=== x)|...()...(),,...,|,...,(),,...,(,111111111======++ktTtkttkttTtkttyxxPyxxPyxxxxPyxxP)|,...,( 11==+ktTtktyxxPβitittiikktyxpa1111+++==ββ)|(∑,A Axt+1xTyt+1yT... Axtyt... ... ... The Backward Algorithm –derivationz Define the backward probability:)|,...,( 11==+ktTtktyxxPβ∑+==++1111tykttTtyyxxP)|,,...,()|,...,()|()|( 111112111=====+++++∑itTtittiktityxxPyxpyyPitittiikyxpa1111+++==∑β)|(,A Axt+1xTyt+1yT... Axtyt... ... ...7Posterior decodingz We can now calculatez Then, we can askz What is the most likely state at position t of sequence x:z Note that this is an MPA of a single hidden state, what if we want to a MPA of a whole hidden state sequence?z Posterior Decoding: z This is different from MPA of a whole sequenceof hidden statesz This can be understood as bit error ratevs. word error rate)()(),()|(xxxxPPyPyPktktktktβα====11)|(maxarg*x1==ktktyPk{} : *TtytktL11 ==Example:MPA of X ?MPA of (X, Y) ?x y P(x,y)0 0 0.350 1 0.0510 0.311 0.3Viterbi decodingz GIVEN x =x1, …, xT, we want to find y = y1, …, yT, such that P(y|x) is maximized:y*= argmaxyP(y|x) = argmaxπP(y,x) z Let= Probability of most likely sequence of states ending at state yt= kz The recursion:z Underflows are a significant problemz These numbers become extremely small – underflowz Solution: Take the logs of all values:),,...,,,...,(max,--},...{-1111111==kttttyyktyxyyxxPVtitkiikttktVayxpV11−==,max)|(x1x2x3……………………...……..xNState 12Kx1x2x3……………………...……..xNState 12Kx1x2x3……………………...……..xNState 12Kx1x2x3……………………...……..xNState 12KVi(t)ktVttttxyxyyyyyyttbbaayyxxp,,,,),,,,,( LLKK11121111−=π()()itkiikttktVayxpV11−++==,logmax)|(log8Computational Complexity and implementation detailsz What is the running time, and space required, for Forward, and Backward?Time: O(K2N); Space: O(KN).z Useful implementation technique to avoid
View Full Document