Machine Learning
CS6375 --- Spring 2015
Hidden Markov Models (HMM)

Reading:
• Sections 15.1-15.3, R&N
• Sections 13.1-13.6, Alpaydin
• Sections 17.1-17.5, Murphy

Instructor: Yang Liu
Slides modified from Vincent Ng (from Andrew Moore)

Supervised Classification
• Classifiers so far: decision tree, nearest neighbor, neural nets, naïve Bayes
• Classification is done for each instance separately
• What if there is a dependency between instances?
  – e.g., label each word in a sentence with its part-of-speech tag

A Markov System
• Has N states, called s1, s2, …, sN
• There are discrete timesteps, t = 0, t = 1, …
• On the t'th timestep the system is in exactly one of the available states. Call it qt. Note: qt ∈ {s1, s2, …, sN}
• Between each timestep, the next state is chosen randomly
• The current state determines the probability distribution for the next state
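The Markov-system process above (start in a random state, then repeatedly pick the next state from the current state's distribution) can be sketched directly. This is a minimal illustration, not from the slides; the 3-state transition matrix shown is only an example.

```python
import random

def sample_markov_chain(pi, A, T):
    """Sample a state sequence q_1..q_T: draw q_1 from the initial
    distribution pi, then each next state from the current state's row of A."""
    states = list(range(len(pi)))
    q = [random.choices(states, weights=pi)[0]]
    for _ in range(1, T):
        q.append(random.choices(states, weights=A[q[-1]])[0])
    return q

# Illustrative 3-state system (each row of A sums to 1).
pi = [0.5, 0.5, 0.0]
A = [[0.0, 1/3, 2/3],
     [1/3, 0.0, 2/3],
     [1/3, 1/3, 1/3]]
print(sample_markov_chain(pi, A, 5))
```

The only modeling choice here is that row q[-1] of A alone determines the next draw, which is exactly the "current state determines the distribution for the next state" property.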
A Markov system is often notated with arcs between states.

Formally: First-order Markov Model
• A set of states Q = q1, q2, …, qN; the state at time t is qt
• Current state only depends on the previous state (the Markov assumption):
  P(qi | q1 … qi−1) = P(qi | qi−1)
• Transition probability matrix A, with aij = P(qt = j | qt−1 = i), 1 ≤ i, j ≤ N
• Special initial probability vector π, with πi = P(q1 = i), 1 ≤ i ≤ N
• Constraints: ∑_{j=1}^{N} aij = 1 for each i, 1 ≤ i ≤ N, and ∑_{j=1}^{N} πj = 1

Probability of a Sequence of States
P(s1 s2 … sT) = P(s1) · P(s2 | s1) · P(s3 | s2) · … · P(sT | sT−1)
             = π_{s1} · ∏_{t=1}^{T−1} a_{st, st+1}
(Note: I'm being sloppy about the starting time, 0 or 1.)

What is P(qt = s)? Slow, Stupid Answer
• Step 1: Work out how to compute P(Q) for any path Q of length t (i.e., Q = q1 q2 q3 … qt):
  P(q1 q2 … qt) = P(q1) P(q2 | q1) P(q3 | q2) … P(qt | qt−1)
• Step 2: Use this knowledge to get P(qt = s)
• This computation is exponential in t

What is P(qt = s)? Clever Answer
• For each state si, define pt(i) = P(qt = si), the probability that the state is si at time t
• Easy to do inductive definition: get p1 from π, then compute each pt+1 from pt
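The slow and clever answers can be put side by side as a sanity check: summing the path probability over all length-t paths ending in s is O(N^t), while the inductive recurrence p_{t+1}(j) = ∑_i p_t(i) · a_ij gives the same number in O(t N²). This is a sketch with an illustrative 3-state chain, not code from the slides.

```python
from itertools import product

def marginal_bruteforce(pi, A, t, s):
    """P(q_t = s) by summing P(q1..qt) over every path ending in s: O(N^t)."""
    N = len(pi)
    total = 0.0
    for path in product(range(N), repeat=t):
        if path[-1] != s:
            continue
        p = pi[path[0]]                       # P(q1)
        for prev, nxt in zip(path, path[1:]):
            p *= A[prev][nxt]                 # P(q_{k+1} | q_k)
        total += p
    return total

def marginal_dp(pi, A, t, s):
    """Same quantity by dynamic programming: p_1 = pi, then
    p_{t+1}(j) = sum_i p_t(i) * a_ij, which costs O(t N^2)."""
    N = len(pi)
    p = list(pi)
    for _ in range(t - 1):
        p = [sum(p[i] * A[i][j] for i in range(N)) for j in range(N)]
    return p[s]

pi = [0.5, 0.5, 0.0]
A = [[0.0, 1/3, 2/3],
     [1/3, 0.0, 2/3],
     [1/3, 1/3, 1/3]]
print(marginal_dp(pi, A, 4, 0), marginal_bruteforce(pi, A, 4, 0))
```

The DP version keeps only the current vector p, which is why its cost grows linearly rather than exponentially in t.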
• Cost of computing pt(i) for all states si is now O(t N²)
• The stupid way was O(N^t)
• This was a simple example; it uses a trick called Dynamic Programming

Example (states S1, S2, S3)
Initialize:
p1(1) =     p1(2) =     p1(3) =
p2(1) =     p2(2) =     p2(3) =
p3(1) =     p3(2) =     p3(3) =

Hidden Markov Models: An Example
Assume that the state is not directly observable but that the observation is a probabilistic function of the state. For example, assume that there are N urns, each containing balls of different colors with a specific distribution for each urn. An initial urn is chosen at random and a ball is drawn; its color is announced and the ball is replaced. The next urn is chosen randomly based on the current urn, and the process repeats.

HMM Formalism
{S, O, Π, A, B}
• Π = {πi} are the initial state probabilities
• A = {aij} are the state transition probabilities
• B = {bik} are the observation probabilities
[Trellis diagram: a chain of hidden states S, each emitting an observation O]

HMM Formal Definition
An HMM, λ, is a 5-tuple consisting of:
• N, the number of states
• M, the number of possible observations
• the transition probabilities A (first-order Markov model)
• the observation probabilities B
• the initial state probabilities Π

HMM Notation
• The states are labeled S1, S2, …, SN
• For a particular trial:
  – Let T be the number of observations; T is also the number of states passed through
  – O = O1 O2 … OT is the sequence of observations
  – Q = q1 q2 …
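The urn story above can be sketched as code: the urn sequence is hidden, and only the announced colors are observed. The two urns, their color distributions, and the transition weights here are hypothetical, chosen only for illustration.

```python
import random

def sample_urn_process(start_probs, urn_transitions, urn_contents, T):
    """The urn model: pick an initial urn at random, draw (and replace) a
    ball, announce its color, then move to the next urn based on the
    current one. Returns only the observed colors, not the hidden urns."""
    urns = list(range(len(start_probs)))
    urn = random.choices(urns, weights=start_probs)[0]
    colors = []
    for _ in range(T):
        palette, weights = zip(*urn_contents[urn].items())
        colors.append(random.choices(palette, weights=weights)[0])
        urn = random.choices(urns, weights=urn_transitions[urn])[0]
    return colors

# Hypothetical two-urn example.
start = [0.5, 0.5]
trans = [[0.7, 0.3], [0.4, 0.6]]
contents = [{"red": 0.9, "blue": 0.1}, {"red": 0.2, "blue": 0.8}]
print(sample_urn_process(start, trans, contents, 5))
```

In the formalism's terms, `start` is Π, `trans` is A, and each urn's color distribution is a row of B.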
qT is the notation for a path of states

HMMs: An Example
• Start randomly in state 1 or 2
• Choose one of the output symbols in each state at random
• Let's generate a sequence of observations

Transition probabilities:
  a11 = 0     a12 = 1/3   a13 = 2/3
  a21 = 1/3   a22 = 0     a23 = 2/3
  a31 = 1/3   a32 = 1/3   a33 = 1/3

Observation probabilities:
  b1(X) = 1/2   b1(Y) = 1/2   b1(Z) = 0
  b2(X) = 0     b2(Y) = 1/2   b2(Z) = 1/2
  b3(X) = 1/2   b3(Y) = 0     b3(Z) = 1/2
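Putting the example's numbers together, the generation process can be sketched as follows. The A and b tables are taken from the example; the initial vector π = [1/2, 1/2, 0] is read off from "start randomly in state 1 or 2".

```python
import random

A = [[0.0, 1/3, 2/3],    # transition probabilities a_ij from the example
     [1/3, 0.0, 2/3],
     [1/3, 1/3, 1/3]]
B = [[0.5, 0.5, 0.0],    # b_i(X), b_i(Y), b_i(Z) from the example
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
pi = [0.5, 0.5, 0.0]     # start randomly in state 1 or 2
SYMBOLS = "XYZ"

def generate(T):
    """Generate T observations: in each state, emit a symbol from that
    state's b-distribution, then hop to the next state via A."""
    state = random.choices(range(3), weights=pi)[0]
    out = []
    for _ in range(T):
        out.append(random.choices(SYMBOLS, weights=B[state])[0])
        state = random.choices(range(3), weights=A[state])[0]
    return "".join(out)

print(generate(6))
```

Note what the b table implies: state 1 can only emit X or Y, state 2 only Y or Z, and state 3 only X or Z, so some observation strings are impossible under this model.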