Hidden Markov Models
CS262 Lecture 5, Win07, Batzoglou

Contents: Example: The Dishonest Casino • The dishonest casino model • An HMM is memory-less • Definition of a hidden Markov model • A parse of a sequence • Generating a sequence by the model • Likelihood of a parse • Example: the dishonest casino • Question #1 – Evaluation • Question #2 – Decoding • Question #3 – Learning • The three main questions on HMMs • Let's not be confused by notation • Problem 1: Decoding • Decoding – main idea • The Viterbi Algorithm • Viterbi Algorithm – a practical detail • Example • Problem 2: Evaluation • A couple of questions • Evaluation • The Forward Algorithm • The Forward Algorithm – derivation • Relation between Forward and Viterbi • Motivation for the Backward Algorithm • The Backward Algorithm – derivation • The Backward Algorithm • Computational Complexity • Posterior Decoding • Viterbi, Forward, Backward

Example: The Dishonest Casino

A casino has two dice:
•Fair die: P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
•Loaded die: P(1) = P(2) = P(3) = P(4) = P(5) = 1/10, P(6) = 1/2
The casino player switches back-&-forth between the fair and loaded die on average once every 20 turns.

Game:
1. You bet $1
2. You roll (always with a fair die)
3. Casino player rolls (maybe with the fair die, maybe with the loaded die)
4. Highest number wins $2

The dishonest casino model

[State diagram: two states FAIR and LOADED; self-transition probability 0.95, switch probability 0.05 in each direction]
P(1|F) = P(2|F) = P(3|F) = P(4|F) = P(5|F) = P(6|F) = 1/6
P(1|L) = P(2|L) = P(3|L) = P(4|L) = P(5|L) = 1/10, P(6|L) = 1/2

An HMM is memory-less

At each time step t, the only thing that affects future states is the current state π_t.

Definition of a hidden Markov model

Definition: A hidden Markov model (HMM) consists of:
•Alphabet Σ = { b_1, b_2, …, b_M }
•Set of states Q = { 1, …, K }
•Transition probabilities between any two states:
 a_ij = transition probability from state i to state j
 a_i1 + … + a_iK = 1, for all states i = 1…K
•Start probabilities a_0i:
 a_01 + … + a_0K = 1
•Emission probabilities within each state:
 e_k(b) = P( x_i = b | π_i = k )
 e_k(b_1) + … + e_k(b_M) = 1, for all states k = 1…K
(End probabilities a_i0 appear in Durbin; not needed here.)

An HMM is memory-less

At each time step t, the only thing that affects future states is the current state π_t:
P(π_{t+1} = k | "whatever happened so far")
 = P(π_{t+1} = k | π_1, π_2, …, π_t, x_1, x_2, …, x_t)
 = P(π_{t+1} = k | π_t)
Likewise, the only thing that affects the emission x_t is the current state π_t:
P(x_t = b | "whatever happened so far")
 = P(x_t = b | π_1, π_2, …, π_t, x_1, x_2, …, x_{t-1})
 = P(x_t = b | π_t)

A parse of a sequence

Given a sequence x = x_1……x_N, a parse of x is a sequence of states π = π_1, ……, π_N.
[Trellis diagram: one column of the K states per position, with emissions x_1, x_2, x_3, …]

Generating a sequence by the model

Given an HMM, we can generate a sequence of length n as follows (a code sketch of this sampling procedure appears after the list):
1. Start at state π_1 according to prob a_{0 π_1}
2. Emit letter x_1 according to prob e_{π_1}(x_1)
3. Go to state π_2 according to prob a_{π_1 π_2}
4. … until emitting x_n
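Below is a minimal Python sketch of this sampling procedure for the dishonest-casino model. The dictionary encoding and the names START, TRANS, EMIT, sample, and generate are illustrative choices, not part of the lecture.

```python
import random

# Dishonest-casino HMM from the slides (encoding is illustrative).
START = {'F': 0.5, 'L': 0.5}                      # a_0i, uniform as in the worked examples
TRANS = {'F': {'F': 0.95, 'L': 0.05},             # a_ij: stay with prob 0.95, switch with 0.05
         'L': {'F': 0.05, 'L': 0.95}}
EMIT = {'F': {face: 1 / 6 for face in range(1, 7)},                 # e_F(b) = 1/6 for every face
        'L': {**{face: 1 / 10 for face in range(1, 6)}, 6: 1 / 2}}  # e_L(6) = 1/2

def sample(dist):
    """Draw one outcome from a {outcome: probability} dict."""
    r, acc = random.random(), 0.0
    for outcome, p in dist.items():
        acc += p
        if r < acc:
            return outcome
    return outcome  # guard against floating-point round-off

def generate(n):
    """Return (x, pi): n emitted rolls and the hidden parse that produced them."""
    x, pi = [], []
    state = sample(START)                 # step 1: pick pi_1 with prob a_{0 pi_1}
    for _ in range(n):
        pi.append(state)
        x.append(sample(EMIT[state]))     # step 2: emit x_t with prob e_{pi_t}(x_t)
        state = sample(TRANS[state])      # step 3: move to pi_{t+1} with prob a_{pi_t pi_{t+1}}
    return x, pi

rolls, parse = generate(10)
print(rolls, parse)
```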
Likelihood of a parse

Given a sequence x = x_1……x_N and a parse π = π_1, ……, π_N, how likely is this scenario given our HMM?
P(x, π) = P(x_1, …, x_N, π_1, ……, π_N)
 = P(x_N | π_N) P(π_N | π_{N-1}) …… P(x_2 | π_2) P(π_2 | π_1) P(x_1 | π_1) P(π_1)
 = a_{0 π_1} a_{π_1 π_2} …… a_{π_{N-1} π_N} e_{π_1}(x_1) …… e_{π_N}(x_N)

A compact way to write a_{0 π_1} a_{π_1 π_2} …… a_{π_{N-1} π_N} e_{π_1}(x_1) …… e_{π_N}(x_N):
Enumerate all parameters a_ij and e_i(b) as θ_1, …, θ_n (n params).
Example: a_0Fair : θ_1; a_0Loaded : θ_2; …; e_Loaded(6) : θ_18.
Then, count in (x, π) the # of times each parameter j = 1, …, n occurs:
F(j, x, π) = # of times parameter j occurs in (x, π)
(call F(·,·,·) the feature counts). Then,
P(x, π) = Π_{j=1…n} θ_j^F(j, x, π) = exp[ Σ_{j=1…n} log(θ_j) F(j, x, π) ]

Example: the dishonest casino

Let the sequence of rolls be:
x = 1, 2, 1, 5, 6, 2, 1, 5, 2, 4
Then, what is the likelihood of π = Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair?
(say initial probs a_0Fair = ½, a_0Loaded = ½)
½ × P(1 | Fair) P(Fair | Fair) P(2 | Fair) P(Fair | Fair) … P(4 | Fair)
 = ½ × (1/6)^10 × (0.95)^9 = 0.00000000521158647211 ≈ 0.5 × 10^-8

Example: the dishonest casino

So, the likelihood the die is fair throughout this run is about 0.5 × 10^-8.
OK, but what is the likelihood of π = Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded, Loaded?
½ × P(1 | Loaded) P(Loaded | Loaded) … P(4 | Loaded)
 = ½ × (1/10)^9 × (1/2)^1 × (0.95)^9 = 0.00000000015756235243 ≈ 0.16 × 10^-9
Therefore, it is considerably more likely (about 30-fold) that all the rolls were made with the fair die than that they were all made with the loaded die.

Example: the dishonest casino

Let the sequence of rolls be:
x = 1, 6, 6, 5, 6, 2, 6, 6, 3, 6
Now, what is the likelihood of π = F, F, …, F?
½ × (1/6)^10 × (0.95)^9 ≈ 0.5 × 10^-8, same as before
What is the likelihood of π = L, L, …, L?
½ × (1/10)^4 × (1/2)^6 × (0.95)^9 = 0.00000049238235134735 ≈ 0.5 × 10^-6
So, it is about 100 times more likely that the die is loaded.
(A code sketch reproducing these numbers appears after Question #3 below.)

Question #1 – Evaluation

GIVEN: a sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
QUESTION: How likely is this sequence, given our model of how the casino works?
This is the EVALUATION problem in HMMs. (Here, Prob ≈ 1.3 × 10^-35.)

Question #2 – Decoding

GIVEN: a sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
QUESTION: What portion of the sequence was generated with the fair die, and what portion with the loaded die?
This is the DECODING question in HMMs. (In this example: FAIR, then LOADED, then FAIR.)

Question #3 – Learning

GIVEN: a sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
QUESTION: How "loaded" is the loaded die? How "fair" is the fair die? How often does the casino player change from fair to loaded, and back?
This is the LEARNING question in HMMs.
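As a check on the worked examples above, the following sketch computes P(x, π) directly. It continues the Python encoding from the generation sketch (START, TRANS, EMIT), and joint_likelihood is an illustrative name; the printed values should match the slide's decimal figures.

```python
def joint_likelihood(x, pi, start=START, trans=TRANS, emit=EMIT):
    """P(x, pi) = a_{0 pi_1} e_{pi_1}(x_1) * product over t > 1 of a_{pi_{t-1} pi_t} e_{pi_t}(x_t)."""
    p = start[pi[0]] * emit[pi[0]][x[0]]
    for t in range(1, len(x)):
        p *= trans[pi[t - 1]][pi[t]] * emit[pi[t]][x[t]]
    return p

rolls = [1, 2, 1, 5, 6, 2, 1, 5, 2, 4]
print(joint_likelihood(rolls, ['F'] * 10))   # ~5.21e-09, i.e. 0.5 x 10^-8
print(joint_likelihood(rolls, ['L'] * 10))   # ~1.58e-10, i.e. 0.16 x 10^-9

rolls2 = [1, 6, 6, 5, 6, 2, 6, 6, 3, 6]
print(joint_likelihood(rolls2, ['L'] * 10))  # ~4.92e-07, about 100x the all-fair parse
```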
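Question #1 asks for P(x) itself rather than P(x, π) for a single parse. Since P(x) = Σ_π P(x, π), summed over all K^N parses, a brute-force evaluation simply adds up the joint likelihoods; this is exponential in N, which is why the lecture goes on to the Forward algorithm. A tiny sketch of the naive sum, reusing joint_likelihood from above (evaluate_naive is an illustrative name):

```python
from itertools import product

def evaluate_naive(x, states=('F', 'L')):
    """P(x) = sum of P(x, pi) over all len(states)**len(x) parses.
    Exponential cost; the Forward algorithm computes the same value in O(K^2 N)."""
    return sum(joint_likelihood(x, pi) for pi in product(states, repeat=len(x)))

print(evaluate_naive([1, 2, 1, 5, 6, 2, 1, 5, 2, 4]))  # sums 2^10 = 1024 parses
```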