
CSCI 5832 Natural Language Processing
Jim Martin
Lecture 9
01/14/19

Today (2/12)
  Review: GT example
  HMMs and Viterbi
  POS tagging

Good-Turing Intuition
  Notation: Nx is the frequency of frequency x
    So N10 = 1, N1 = 3, etc.
  To estimate counts/probs for unseen species
    Use the number of species (words) we've seen once
    c0* = c1, p0 = N1/N
  All other estimates are adjusted (down) to allow for increased probabilities for unseen species

HW 0 Results
  Favorite color
    Blue        8
    Green       3
    Red         2
    Black       2
    White       2
    Periwinkle  1
    Gamboge     1
    Eau-de-Nil  1
    Brown       1
  21 events
  Count of counts
    N1 = 4
    N2 = 3
    N3 = 1
    N4,5,6,7 = 0
    N8 = 1

GT for a New Color
  Count of counts: N1 = 4, N2 = 3, N3 = 1, N4,5,6,7 = 0, N8 = 1
  Treat the 0s as 1s, so N0 = 4
    P(new color) = 4/21 = .19
    If we knew the number of colors out there, we would divide .19 by the number of colors not seen.
  Otherwise
    N*(1) = (1+1) * 3/4 = 6/4 = 1.5
    P*(Periwinkle) = 1.5/21 = .07
    N*(2) = (2+1) * 1/3 = 1
    P*(Black) = 1/21 = .047

GT for New Color
  Count of counts: N1 = 4, N2 = 3, N3 = 1, N4,5,6,7 = 0, N8 = 1
  But 2 twists
    Treat the high flyers as trusted
      So P(Blue) should stay 8/21
    Use interpolation to smooth the bin counts before reestimation
      To deal with N*(3) = (3+1) * 0/1 = 0

Why Logs?
  Simple Good-Turing does linear interpolation in log-space. Why?

Part of Speech tagging
  Part of speech tagging
    Parts of speech
    What's POS tagging good for anyhow?
    Tag sets
    Rule-based tagging
    Statistical tagging
      Simple most-frequent-tag baseline
    Important Ideas
      Training sets and test sets
      Unknown words
    HMM tagging

Parts of Speech
  8 (ish) traditional parts of speech
    Noun, verb, adjective, preposition, adverb, article, interjection, pronoun, conjunction, etc.
  Called: parts of speech, lexical category, word classes, morphological classes, lexical tags, POS
  Lots of debate in linguistics about the number, nature, and universality of these
    We'll completely ignore this debate.

POS examples
  N    noun         chair, bandwidth, pacing
  V    verb         study, debate, munch
  ADJ  adjective    purple, tall, ridiculous
  ADV  adverb       unfortunately, slowly
  P    preposition  of, by, to
  PRO  pronoun      I, me, mine
  DET  determiner   the, a, that, those

POS Tagging example
  WORD  the  koala  put  the  keys  on  the  table
  tag   DET  N      V    DET  N     P   DET  N

POS Tagging
  Words often have more than one POS: back
    The back door = JJ
    On my back = NN
    Win the voters back = RB
    Promised to back the bill = VB
  The POS tagging problem is to determine the POS tag for a particular instance of a word.
  (These examples are from Dekang Lin.)

How hard is POS tagging? Measuring ambiguity

2 methods for POS tagging
  1. Rule-based tagging
     (ENGTWOL)
  2. Stochastic (= Probabilistic) tagging
     HMM (Hidden Markov Model) tagging

Hidden Markov Model Tagging
  Using an HMM to do POS tagging
  Is a special case of Bayesian inference
    Foundational work in computational linguistics
    Bledsoe 1959: OCR
    Mosteller and Wallace 1964: authorship identification
  It is also related to the "noisy channel" model that's the basis for ASR, OCR and MT

POS Tagging as Sequence Classification
  We are given a sentence (an "observation" or "sequence of observations")
    Secretariat is expected to race tomorrow
  What is the best sequence of tags which corresponds to this sequence of observations?
  Probabilistic view:
    Consider all possible sequences of tags
    Out of this universe of sequences, choose the tag sequence which is most probable given the observation sequence of n words w1...wn.

Road to HMMs
  We want, out of all sequences of n tags t1...tn, the single tag sequence such that P(t1...tn | w1...wn) is highest.
  Hat (^) means "our estimate of the best one"
  argmax_x f(x) means "the x such that f(x) is maximized"

Road to HMMs
  This equation is guaranteed to give us the best tag sequence
  But how to make it operational? How to compute this value?
  Intuition of Bayesian classification:
    Use Bayes rule to transform it into a set of other probabilities that are easier to compute
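
A sketch of how this transformation is usually written, using the t1...tn and w1...wn notation from the slides. The last line relies on the usual bigram HMM assumptions (each word depends only on its own tag, and each tag only on the previous tag). Those are stated here as assumptions rather than taken from the extracted slide text.

```latex
\begin{align*}
\hat{t}_1^n &= \operatorname*{argmax}_{t_1^n} P(t_1^n \mid w_1^n) \\
            &= \operatorname*{argmax}_{t_1^n} \frac{P(w_1^n \mid t_1^n)\, P(t_1^n)}{P(w_1^n)} \\
            &= \operatorname*{argmax}_{t_1^n} P(w_1^n \mid t_1^n)\, P(t_1^n) \\
            &\approx \operatorname*{argmax}_{t_1^n} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})
\end{align*}
```

The denominator P(w1...wn) can be dropped in the third line because it is the same for every candidate tag sequence, so it does not change which sequence wins the argmax.
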
Using Bayes Rule
  [equation not preserved]

Likelihood and Prior
  [equation not preserved]

Two Sets of Probabilities (1)
  Tag transition probabilities p(ti | ti-1)
    Determiners are likely to precede adjectives and nouns
      That/DT flight/NN
      The/DT yellow/JJ hat/NN
    So we expect P(NN | DT) and P(JJ | DT) to be high
    Compute P(NN | DT) by counting in a labeled corpus

Two Sets of Probabilities (2)
  Word likelihood probabilities p(wi | ti)
    VBZ (3sg pres verb) is likely to be "is"
    Compute P(is | VBZ) by counting in a labeled corpus

An Example: the verb "race"
  Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NR
  People/NNS continue/VB to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
  How do we pick the right tag?

Disambiguating "race"

Example
  P(NN | TO) = .00047
  P(VB | TO) = .83
  P(race | NN) = .00057
  P(race | VB) = .00012
  P(NR | VB) = .0027
  P(NR | NN) = .0012
  P(VB | TO) P(NR | VB) P(race | VB) = .00000027
  P(NN | TO) P(NR | NN) P(race | NN) = .00000000032
  So we (correctly) choose the verb reading

Hidden Markov Models
  What we've described with these two kinds of probabilities is a Hidden Markov Model
  Let's just spend a bit of time tying this into the model
  First some definitions.

Definitions
  A weighted finite-state automaton adds probabilities to the arcs
    The probabilities on the arcs leaving any state must sum to one
  A Markov chain is a special case in which the input sequence uniquely determines which states the automaton will go through
  Markov chains can't represent inherently ambiguous problems
    Useful for assigning probabilities to unambiguous sequences

Markov chain for weather

Markov chain for words

Markov chain = "First-order Observable Markov Model"
  A set of states
    Q = q1, q2, ..., qN; the state at time t is qt
  Transition probabilities:
    a set of probabilities A = a01 a02 ... an1 ... ann
    Each aij represents the probability of transitioning from state i to state j
    The set of these is the transition probability matrix A
  Current state only depends on previous state
    P(qi | q1 ... qi-1) = P(qi | qi-1)
  …
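
The "race" disambiguation from the Example slide can be checked numerically. The following is a minimal sketch that only multiplies the six probabilities quoted on that slide. The dictionary layout and the `score` helper are illustrative names, not part of the lecture, and this is a single local comparison, not a full Viterbi decoder.

```python
# Probabilities quoted on the "Example" slide.
# transition[(prev_tag, tag)] = P(tag | prev_tag)
transition = {
    ("TO", "VB"): 0.83,
    ("TO", "NN"): 0.00047,
    ("VB", "NR"): 0.0027,
    ("NN", "NR"): 0.0012,
}
# emission[(word, tag)] = P(word | tag)
emission = {
    ("race", "VB"): 0.00012,
    ("race", "NN"): 0.00057,
}


def score(tag, prev_tag="TO", next_tag="NR", word="race"):
    """Product of the three factors that differ between the two readings
    (hypothetical helper, named here for illustration)."""
    return (
        transition[(prev_tag, tag)]
        * transition[(tag, next_tag)]
        * emission[(word, tag)]
    )


scores = {tag: score(tag) for tag in ("VB", "NN")}
best = max(scores, key=scores.get)

for tag, value in scores.items():
    print(f"{tag}: {value:.2e}")
print("chosen tag:", best)
```

Only the factors that differ between the two candidate tag sequences are multiplied. Every other word in the sentence contributes the same terms to both products, so those terms cannot change the comparison.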



CU-Boulder CSCI 5832 - Lecture 9
