
Introduction to N-grams
Language Modeling
Dan Jurafsky

Probabilistic Language Models
• Today's goal: assign a probability to a sentence
• Machine Translation:
  • P(high winds tonite) > P(large winds tonite)
• Spell Correction
  • The office is about fifteen minuets from my house
  • P(about fifteen minutes from) > P(about fifteen minuets from)
• Speech Recognition
  • P(I saw a van) >> P(eyes awe of an)
• + Summarization, question-answering, etc., etc. Why?

Probabilistic Language Modeling
• Goal: compute the probability of a sentence or sequence of words:
  P(W) = P(w1, w2, w3, w4, w5 ... wn)
• Related task: probability of an upcoming word:
  P(w5 | w1, w2, w3, w4)
• A model that computes either of these:
  P(W)   or   P(wn | w1, w2 ... wn-1)
  is called a language model.
• Better: the grammar. But "language model" or "LM" is standard.

How to compute P(W)
• How to compute this joint probability:
  • P(its, water, is, so, transparent, that)
• Intuition: let's rely on the Chain Rule of Probability

Reminder: The Chain Rule
• Recall the definition of conditional probabilities:
  P(A | B) = P(A, B) / P(B)    Rewriting: P(A, B) = P(A | B) P(B)
• More variables:
  P(A, B, C, D) = P(A) P(B | A) P(C | A, B) P(D | A, B, C)
• The Chain Rule in general:
  P(x1, x2, x3, ..., xn) = P(x1) P(x2 | x1) P(x3 | x1, x2) ... P(xn | x1, ..., xn-1)

The Chain Rule applied to compute joint probability of words in sentence
  P(w1 w2 ... wn) = ∏i P(wi | w1 w2 ... wi-1)
• P("its water is so transparent") =
  P(its) × P(water | its) × P(is | its water)
  × P(so | its water is) × P(transparent | its water is so)

How to estimate these probabilities
• Could we just count and divide?
  P(the | its water is so transparent that) =
    Count(its water is so transparent that the) / Count(its water is so transparent that)
• No! Too many possible sentences!
• We'll never see enough data for estimating these

Markov Assumption
• Simplifying assumption:
  P(the | its water is so transparent that) ≈ P(the | that)
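The chain-rule decomposition P(w1 w2 ... wn) = ∏i P(wi | w1 w2 ... wi-1) can be sketched directly in code. This is a minimal sketch: `cond_prob` is a hypothetical placeholder returning a fixed value, not a probability estimated from data as a real language model would.

```python
# Chain rule: P(w1 ... wn) = product over i of P(wi | w1 ... wi-1).
# cond_prob is a hypothetical stand-in; a real model would estimate
# each conditional probability from counts in a corpus.

def cond_prob(word, history):
    """Placeholder for P(word | history); always returns 0.1 here."""
    return 0.1

def sentence_prob(words):
    """Multiply the conditional probability of each word given its full history."""
    p = 1.0
    for i, word in enumerate(words):
        p *= cond_prob(word, words[:i])
    return p

print(sentence_prob("its water is so transparent".split()))  # 0.1 per word, 5 words
```

With the placeholder, the result is just 0.1 raised to the sentence length; the point is only the shape of the product over histories.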
• Or maybe:
  P(the | its water is so transparent that) ≈ P(the | transparent that)
  (photo: Andrei Markov)

Markov Assumption
• In other words, we approximate each component in the product:
  P(w1 w2 ... wn) ≈ ∏i P(wi | wi-k ... wi-1)
  P(wi | w1 w2 ... wi-1) ≈ P(wi | wi-k ... wi-1)

Simplest case: Unigram model
  P(w1 w2 ... wn) ≈ ∏i P(wi)
Some automatically generated sentences from a unigram model:
  fifth, an, of, futures, the, an, incorporated, a, a, the, inflation, most, dollars, quarter, in, is, mass
  thrift, did, eighty, said, hard, 'm, july, bullish
  that, or, limited, the

Bigram model
• Condition on the previous word:
  P(wi | w1 w2 ... wi-1) ≈ P(wi | wi-1)
Some automatically generated sentences from a bigram model:
  texaco, rose, one, in, this, issue, is, pursuing, growth, in, a, boiler, house, said, mr., gurria, mexico, 's, motion, control, proposal, without, permission, from, five, hundred, fifty, five, yen
  outside, new, car, parking, lot, of, the, agreement, reached
  this, would, be, a, record, november

N-gram models
• We can extend to trigrams, 4-grams, 5-grams
• In general this is an insufficient model of language
  • because language has long-distance dependencies:
    "The computer which I had just put into the machine room on the fifth floor crashed."
• But we can often get away with N-gram models

Estimating N-gram Probabilities
Language Modeling

Estimating bigram probabilities
• The Maximum Likelihood Estimate:
  P(wi | wi-1) = count(wi-1, wi) / count(wi-1) = c(wi-1, wi) / c(wi-1)

An example
  <s> I am Sam </s>
  <s> Sam I am </s>
  <s> I do not like green eggs and ham </s>
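The maximum likelihood estimate above can be computed directly from the three training sentences. A minimal sketch using the standard library; `p_bigram` is a name chosen here, not from the slides:

```python
from collections import Counter

# MLE bigram estimates, P(wi | wi-1) = c(wi-1, wi) / c(wi-1),
# from the three <s> ... </s> training sentences above.
corpus = [
    "<s> I am Sam </s>",
    "<s> Sam I am </s>",
    "<s> I do not like green eggs and ham </s>",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigrams.update(tokens)                  # c(w)
    bigrams.update(zip(tokens, tokens[1:]))  # c(w_prev, w)

def p_bigram(word, prev):
    """MLE estimate of P(word | prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

print(p_bigram("I", "<s>"))     # 2/3: two of the three sentences start with I
print(p_bigram("Sam", "am"))    # 1/2
print(p_bigram("</s>", "Sam"))  # 1/2
```

Dividing the bigram count by the count of the conditioning word is exactly the c(wi-1, wi) / c(wi-1) formula from the slide.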
  P(wi | wi-1) = c(wi-1, wi) / c(wi-1)

More examples: Berkeley Restaurant Project sentences
• can you tell me about any good cantonese restaurants close by
• mid priced thai food is what i'm looking for
• tell me about chez panisse
• can you give me a listing of the kinds of food that are available
• i'm looking for a good place to eat breakfast
• when is caffe venezia open during the day

Raw bigram counts
• Out of 9222 sentences
  [bigram count table not preserved in this excerpt]

Raw bigram probabilities
• Normalize by unigrams:
• Result:
  [probability tables not preserved in this excerpt]

Bigram estimates of sentence probabilities
  P(<s> I want english food </s>) =
    P(I | <s>) × P(want | I) × P(english | want)
    × P(food | english) × P(</s> | food)
    = .000031

What kinds of knowledge?
• P(english | want) = .0011
• P(chinese | want) = .0065
• P(to | want) = .66
• P(eat | to) = .28
• P(food | to) = 0
• P(want | spend) = 0
• P(i | <s>) = .25

Practical Issues
• We do everything in log space
  • Avoid underflow
  • (also adding is faster than multiplying)
  log(p1 × p2 × p3 × p4) = log p1 + log p2 + log p3 + log p4

Language Modeling Toolkits
• SRILM
  • http://www.speech.sri.com/projects/srilm/

Google N-Gram Release, August 2006
...

Google N-Gram Release
• serve as the incoming 92
• serve as the incubator 99
• serve as the independent 794
• serve as the index 223
• serve as the indication 72
• serve as the indicator 120
• serve as the indicators 45
• serve as the indispensable 111
• serve as the indispensible 40
• serve as the individual 234
  http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html

Google Book N-grams
• http://ngrams.googlelabs.com/

Evaluation and Perplexity
Language Modeling

Evaluation: How good is our model?
• Does our language model prefer good sentences to bad ones?
  • Assign higher probability to "real" or "frequently observed" sentences
  • than to "ungrammatical" or "rarely observed" sentences?
• We train parameters of our model on a training set.
• We test the model's performance on data we haven't seen.
  • A test set is an unseen dataset that is different from our training set, totally unused.
  • An evaluation metric tells us how well our model does on the test set.

Extrinsic evaluation of N-gram models
• Best evaluation for comparing models A and B
  • Put each model in a task
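The bigram sentence probability and the log-space trick from the slides above can be sketched together. Only P(I | <s>) = .25, P(english | want) = .0011, and the .000031 total appear in this excerpt; the .33, .5, and .68 values below are assumptions filled in for illustration so the product comes out near the quoted total.

```python
import math

# Computing P(<s> I want english food </s>) by summing log probabilities
# instead of multiplying raw probabilities (avoids underflow on long sentences).
bigram_probs = [
    0.25,    # P(I | <s>)          (from the slides)
    0.33,    # P(want | I)         (assumed for illustration)
    0.0011,  # P(english | want)   (from the slides)
    0.5,     # P(food | english)   (assumed for illustration)
    0.68,    # P(</s> | food)      (assumed for illustration)
]

log_p = sum(math.log(p) for p in bigram_probs)  # log p1 + log p2 + ...
p = math.exp(log_p)                             # back to a probability
print(round(p, 6))
```

Summing five logs and exponentiating once gives the same answer as the direct product, but the running value stays in a comfortable floating-point range even when a sentence has hundreds of words.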



Stanford CS 124 - Language Modeling
