BYU BIO 465 - Hidden Markov Models (HMMs)

Outline
• Hidden Markov Models (HMMs)
• Definition
• State Transitions
• Example
• Code
• Observations
• Matrix
• The CpG island problem
• Hidden Markov Model
• HMM is just one way of modeling p(X,S)
• A simple HMM
• A General Definition of HMM
• How to "Generate" a Sequence?
• HMM as a Probabilistic Model
• Three Problems
• Three Problems (cont.)
• Problem I: Decoding/Parsing - Finding the most likely path
• What's the most likely path?
• Viterbi Algorithm: An Example
• Viterbi Algorithm
• Problem II: Evaluation - Computing the data likelihood
• Data Likelihood: p(O|λ)
• The Forward Algorithm
• Forward Algorithm: Example
• The Backward Algorithm
• Backward Algorithm: Example
• Problem III: Training - Estimating Parameters
• Supervised Training
• Unsupervised Training
• Intuition
• Baum-Welch Algorithm
• Baum-Welch Algorithm (cont.)
• Next Time

Slide 1: Hidden Markov Models (HMMs)

Slide 2: Definition
• A Hidden Markov Model is a statistical model in which the system being modeled is assumed to be a Markov process with unknown (hidden) parameters.
• The challenge is to determine the hidden parameters from the observable parameters.

Slide 3: State Transitions
Markov model example:
• x = states of the Markov model
• a = transition probabilities
• b = output probabilities
• y = observable outputs
How does this differ from a finite state machine?

Slide 4: Example
• A distant friend that you talk to daily about his activities (walk, shop, clean).
• You believe that the weather is a discrete Markov chain (no memory) with two states (rainy, sunny), but you can't observe them directly. You know the average weather patterns.

Slide 5: Code

    states = ('Rainy', 'Sunny')
    observations = ('walk', 'shop', 'clean')

    start_probability = {'Rainy': 0.6, 'Sunny': 0.4}

    transition_probability = {
        'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
        'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
    }

    emission_probability = {
        'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
        'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
    }

Slide 6: Observations
• Given (walk, shop, clean):
  – What is the probability of this sequence of observations?
    (Is he really still at home, or did he skip the country?)
  – What was the most likely sequence of rainy/sunny days?

Slide 7: Matrix
Each entry is (transition probability) * (emission probability) for arriving in that state; for the first observation, the first factor is the start probability:

              Rainy              Sunny
    walk      .6*.1              .4*.6
    shop      .7*.4   .4*.4      .3*.3   .6*.3
    clean     .7*.5   .4*.5      .3*.1   .6*.1

One full path: Sunny, Rainy, Rainy = (.4*.6)(.4*.4)(.7*.5)

Slide 8: The CpG island problem
• Methylation in the human genome:
  – "CG" -> "TG" happens randomly except where there is selection.
  – One area of selection is the "start regions" of genes.
  – CpG islands = 100-1,000 bases before a gene starts.
• Question: Given a long sequence, how would we find the CpG islands in it?

Slide 9: Hidden Markov Model
X = ATTGATGTGAACTGGGGATCGGGCGATATATGATTGG (a CpG island flanked by "Other" regions)
How can we identify a CpG island in a long sequence?
• Idea 1: Test each window of a fixed number of nucleotides.
• Idea 2: Classify the whole sequence, where each candidate label marks every position as O (Other) or C (CpG):
      Class label S1: OOOO.......O
      Class label S2: OOOO....OCC...
      Class label Si: OOOO...OCC..CO...O...
      Class label SN: CCCC.......CC
  Pick S* = argmax_S P(S|X) = argmax_S P(S,X), e.g. S* = OOOO...OCC..CO...O

Slide 10: HMM is just one way of modeling p(X,S)...

Slide 11: A simple HMM
Two states: B (the "Other" background) and I (CpG island).
• Initial state probabilities: p(B) = 0.5, p(I) = 0.5
• State transition probabilities: p(B->B) = 0.7, p(B->I) = 0.3, p(I->B) = 0.5, p(I->I) = 0.5
• Output probabilities:
      P(x|H_Other) = p(x|B): P(a|B) = 0.25, P(t|B) = 0.40, P(c|B) = 0.10, P(g|B) = 0.25
      P(x|H_CpG)   = p(x|I): P(a|I) = 0.25, P(t|I) = 0.25, P(c|I) = 0.25, P(g|I) = 0.25

Slide 12: A General Definition of HMM
An HMM is specified by λ = (S, V, Π, A, B):
• S = {s_1, ..., s_N}: the N states
• V = {v_1, ..., v_M}: the M output symbols
• Initial state probability: Π = {π_1, ..., π_N}, with Σ_{i=1..N} π_i = 1
      π_i: probability of starting at state s_i
• State transition probability: A = {a_ij}, 1 ≤ i, j ≤ N, with Σ_{j=1..N} a_ij = 1 for each i
      a_ij: probability of going s_i -> s_j
• Output probability: B = {b_i(v_k)}, 1 ≤ i ≤ N, 1 ≤ k ≤ M, with Σ_{k=1..M} b_i(v_k) = 1 for each i
      b_i(v_k): probability of generating v_k at s_i

Slide 13: How to "Generate" a Sequence?
Given a model, follow a path through the states to generate the
observations. (Figure: the B/I model above emitting the sequence a c g t t ..., one state per symbol.)

Slide 14: How to "Generate" a Sequence? (cont.)
(Figure: the same model generating "acgtt" along the state path B, I, I, I, B.)
    P("BIIIB", "acgtt") = p(B)p(a|B) * p(I|B)p(c|I) * p(I|I)p(g|I) * p(I|I)p(t|I) * p(B|I)p(t|B)
                        = (0.5*0.25)(0.3*0.25)(0.5*0.25)(0.5*0.25)(0.5*0.40)

Slide 15: HMM as a Probabilistic Model
Sequential data, with a random variable at each time/index:

    Time/index:            t1  t2  t3  t4  ...
    Data:                  o1  o2  o3  o4  ...
    Observation variable:  O1  O2  O3  O4  ...
    Hidden state variable: S1  S2  S3  S4  ...

Joint probability (complete likelihood), combining the initial state distribution, state transition probabilities, and output probabilities:
    p(O1, O2, ..., OT, S1, S2, ..., ST) = p(S1)p(O1|S1) p(S2|S1)p(O2|S2) ... p(ST|S_{T-1})p(OT|ST)
State transition probability:
    p(S1, S2, ..., ST) = p(S1) p(S2|S1) ... p(ST|S_{T-1})
Probability of observations with known state transitions:
    p(O1, O2, ..., OT | S1, S2, ..., ST) = p(O1|S1) p(O2|S2) ... p(OT|ST)
Probability of observations (incomplete likelihood):
    p(O1, O2, ..., OT) = Σ_{S1,...,ST} p(O1, O2, ..., OT, S1, ..., ST)

Slide 16: Three Problems
1. Decoding - finding the most likely path.
   Have: model, parameters, observations (data). Want: the most likely state sequence.
       S1* S2* ... ST* = argmax_{S1 S2 ... ST} p(S1 S2 ... ST | O) = argmax_{S1 S2 ... ST} p(S1 S2 ... ST, O)
2. Evaluation - computing the observation likelihood.
   Have: model, parameters, observations (data). Want: the likelihood of generating the observed data.
       p(O|λ) = Σ_{S1 S2 ... ST} p(O | S1 S2 ... ST) p(S1 S2 ... ST)

Slide 17: Three Problems (cont.)
3. Training - estimating parameters.
   – Supervised. Have: model, labeled data (data + state sequences). Want: parameters.
   – Unsupervised. Have: model, data. Want: parameters.
       λ* = argmax_λ p(O|λ)

Slide 18: Problem I: Decoding/Parsing - Finding the most likely path
You can think of this as classification with all the paths as class labels...

Slide 19: What's the most likely path?
Observed sequence: a c t t t a g g ... (hidden states unknown), using the B/I model parameters above.
    S1* S2* ... ST* = argmax_{S1 S2 ... ST} p(S1 S2 ... ST, O)
                    = argmax_{S1 S2 ... ST} π_{S1} b_{S1}(o_1) Π_{i=2..T} a_{S_{i-1} S_i} b_{S_i}(o_i)

Slide 20: Viterbi Algorithm: An Example
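The outline lists the Viterbi and forward algorithms next, but the preview cuts off here. As an illustrative sketch (not the slides' own code), Problems I and II can be solved for the rainy/sunny model from the Code slide as follows; the function names `viterbi` and `forward` are mine, not from the lecture.

```python
# Sketch: Viterbi decoding (Problem I) and the forward algorithm
# (Problem II) for the rainy/sunny HMM defined on the Code slide.

states = ('Rainy', 'Sunny')
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'walk': 0.1, 'shop': 0.4, 'clean': 0.5},
    'Sunny': {'walk': 0.6, 'shop': 0.3, 'clean': 0.1},
}

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the single most likely state sequence."""
    # V[t][s] = probability of the best path that ends in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # Choose the best predecessor state r for state s at time t
            prob, prev = max(
                (V[t - 1][r] * trans_p[r][s] * emit_p[s][obs[t]], r)
                for r in states
            )
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    prob, last = max((V[-1][s], s) for s in states)
    return prob, path[last]

def forward(obs, states, start_p, trans_p, emit_p):
    """Return p(O|model): total probability of obs, summed over all paths."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for t in range(1, len(obs)):
        alpha = {
            s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][obs[t]]
            for s in states
        }
    return sum(alpha.values())

obs = ('walk', 'shop', 'clean')
print(viterbi(obs, states, start_probability, transition_probability, emission_probability))
print(forward(obs, states, start_probability, transition_probability, emission_probability))
```

For this observation sequence the most likely path is ['Sunny', 'Rainy', 'Rainy'] with probability ≈ 0.01344, matching the hand-computed path probability (.4*.6)(.4*.4)(.7*.5) on the Matrix slide; the forward likelihood (≈ 0.0336) is larger because it sums over all 2^3 = 8 possible paths.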

