How to Use Probabilities
Goals of this lecture
3 Kinds of Statistics
Fugue for Tinhorns
Notation for Greenhorns
What does that really mean?
p is a function on sets of “outcomes”
Slide 8
Required Properties of p (axioms)
Commas denote conjunction
Slide 11
Simplifying Right Side: Backing Off
Simplifying Left Side: Backing Off
Factoring Left Side: The Chain Rule
Slide 15
Slide 16
Slide 17
Remember Language ID?
Slide 19
Apply the Chain Rule
Back Off On Right Side
Change the Notation
Another Independence Assumption
Simplify the Notation
Slide 25
Slide 26
Definition: Probability Model
English vs. Polish
What is “X” in p(X)?
What is “X” in “p(X)”?
Random Variables: What is “variable” in “p(variable=value)”?
Slide 32
Slide 33
Slide 34
Slide 35
Back to trigram model of p(horses)
A Different Model
Improving the New Model: Weaken the Indep. Assumption
Which Model is Better?
Measure Performance!
Cross-Entropy (“xent”)
Slide 42

How to Use Probabilities
The Crash Course
600.465 – Intro to NLP – J. Eisner

Goals of this lecture
• Probability notation like p(X | Y):
  – What does this expression mean?
  – How can I manipulate it?
  – How can I estimate its value in practice?
• Probability models:
  – What is one?
  – Can we build one for language ID?
  – How do I know if my model is any good?

3 Kinds of Statistics
• descriptive: mean Hopkins SAT (or median)
• confirmatory: statistically significant?
• predictive: wanna bet? (this course – why?)

Fugue for Tinhorns
• Opening number from Guys and Dolls
  – 1950 Broadway musical about gamblers
  – Words & music by Frank Loesser
• Video: http://www.youtube.com/watch?v=NxAX74gM8DY
• Lyrics: http://www.lyricsmania.com/fugue_for_tinhorns_lyrics_guys_and_dolls.html

Notation for Greenhorns
[Diagram labels: “Paul Revere”, probability model, 0.9]
p(Paul Revere wins | weather’s clear) = 0.9

What does that really mean?
p(Paul Revere wins | weather’s clear) = 0.9
• Past performance?
  – Revere’s won 90% of races with clear weather
• Hypothetical performance?
  – If he ran the race in many parallel universes …
• Subjective strength of belief?
  – Would pay up to 90 cents for chance to win $1
• Output of some computable formula?
  – Ok, but then which formulas should we trust? p(X | Y) versus q(X | Y)

p is a function on sets of “outcomes”
[Venn diagram: “weather’s clear” and “Paul Revere wins” inside “All Outcomes (races)”]
p(win | clear) ≡ p(win, clear) / p(clear)
Callouts on the formula:
• syntactic sugar
• predicate selecting races where weather’s clear
• logical conjunction of predicates
• p measures total probability of a set of outcomes (an “event”).

Required Properties of p (axioms)
[Venn diagram: “weather’s clear” and “Paul Revere wins” inside “All Outcomes (races)”; callout fragment “most of the”]
• p(∅) = 0    p(all outcomes) = 1
• p(X) ≤ p(Y) for any X ⊆ Y
• p(X) + p(Y) = p(X ∪ Y) provided X ∩ Y = ∅
  e.g., p(win & clear) + p(win & ¬clear) = p(win)

Commas denote conjunction
p(Paul Revere wins, Valentine places, Epitaph shows | weather’s clear)
what happens as we add conjuncts to left of bar?
• probability can only decrease
• numerator of historical estimate likely to go to zero:
  (# times Revere wins AND Val places … AND weather’s clear) / (# times weather’s clear)
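
The "past performance" reading of p(win | clear), and the effect of adding a conjunct to the left of the bar, can be sketched as relative frequencies over a log of races. This is only an illustrative sketch: the race records, the field names, and the helper functions `p` and `cond` are all invented here and are not from the lecture.

```python
from fractions import Fraction

# Hypothetical race log (invented data): each record is one outcome, i.e. one race.
RACES = [
    {"clear": True,  "win": True,  "val_places": True},
    {"clear": True,  "win": True,  "val_places": False},
    {"clear": True,  "win": False, "val_places": True},
    {"clear": True,  "win": True,  "val_places": False},
    {"clear": False, "win": False, "val_places": True},
    {"clear": False, "win": True,  "val_places": False},
]


def p(event, races=RACES):
    """Total probability of the set of outcomes selected by the predicate `event`."""
    return Fraction(sum(1 for race in races if event(race)), len(races))


def cond(event, given, races=RACES):
    """p(event | given), defined as p(event, given) / p(given)."""
    return p(lambda race: event(race) and given(race), races) / p(given, races)


def win(race):
    return race["win"]


def clear(race):
    return race["clear"]


def win_and_val(race):
    # A comma on the left of the bar is a conjunction of predicates.
    return race["win"] and race["val_places"]


p_win_given_clear = cond(win, clear)
p_win_val_given_clear = cond(win_and_val, clear)

# Additivity axiom on two disjoint events whose union is "win".
additivity_holds = (
    p(lambda race: win(race) and clear(race))
    + p(lambda race: win(race) and not clear(race))
    == p(win)
)
```

With this invented log, the extra conjunct can only shrink the count in the numerator while the denominator stays the same, which is why the estimate cannot go up.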

Commas denote conjunction
p(Paul Revere wins, Valentine places, Epitaph shows | weather’s clear)
p(Paul Revere wins | weather’s clear, ground is dry, jockey getting over sprain, Epitaph also in race, Epitaph was recently bought by Gonzalez, race is on May 17, … )
what happens as we add conjuncts to right of bar?
• probability could increase or decrease
• probability gets more relevant to our case (less bias)
• probability estimate gets less reliable (more variance)
  (# times Revere wins AND weather clear AND … it’s May 17) / (# times weather clear AND … it’s May 17)

Simplifying Right Side: Backing Off
p(Paul Revere wins | weather’s clear, ground is dry, jockey getting over sprain, Epitaph also in race, Epitaph was recently bought by Gonzalez, race is on May 17, … )
not exactly what we want but at least we can get a reasonable estimate of it! (i.e., more bias but less variance)
try to keep the conditions that we suspect will have the most influence on whether Paul Revere wins

Simplifying Left Side: Backing Off
p(Paul Revere wins, Valentine places, Epitaph shows | weather’s clear)
NOT ALLOWED!
but we can do something similar to help …

Factoring Left Side: The Chain Rule
p(Revere, Valentine, Epitaph | weather’s clear)
  = p(Revere | Valentine, Epitaph, weather’s clear)
  * p(Valentine | Epitaph, weather’s clear)
  * p(Epitaph | weather’s clear)
True because numerators cancel against denominators
Makes perfect sense when read from bottom to top
RVEW/W = RVEW/VEW * VEW/EW * EW/W
[Decision tree: Epitaph? (no 2/3, yes 1/3); Valentine? (no 4/5, yes 1/5); Revere? (no 3/4, yes 1/4)]
Epitaph, Valentine, Revere? 1/3 * 1/5 * 1/4
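
The cancellation RVEW/W = RVEW/VEW * VEW/EW * EW/W can be checked with exact fractions. The sketch below assumes hypothetical race counts, chosen only so that the three ratios reproduce the "yes" branches of the tree (1/3, 1/5, 1/4); the counts and the helper `ratio` are not from the lecture.

```python
from fractions import Fraction

# Hypothetical counts of races. Each key abbreviates a conjunction:
# W = weather's clear, EW = Epitaph shows AND W, VEW = Valentine places AND EW,
# RVEW = Revere wins AND VEW.
COUNTS = {"W": 60, "EW": 20, "VEW": 4, "RVEW": 1}


def ratio(numerator, denominator, counts=COUNTS):
    """Relative-frequency estimate: count of the numerator event over the denominator event."""
    return Fraction(counts[numerator], counts[denominator])


p_epitaph = ratio("EW", "W")        # p(Epitaph | W)
p_valentine = ratio("VEW", "EW")    # p(Valentine | Epitaph, W)
p_revere = ratio("RVEW", "VEW")     # p(Revere | Valentine, Epitaph, W)

joint_direct = ratio("RVEW", "W")   # p(Revere, Valentine, Epitaph | W)
joint_chain = p_revere * p_valentine * p_epitaph
```

Each factor's numerator is the next factor's denominator, so the product telescopes to RVEW/W whatever counts are plugged in, provided none of the denominators is zero.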

Factoring Left Side: The Chain Rule
p(Revere, Valentine, Epitaph | weather’s clear)
  = p(Revere | Valentine, Epitaph, weather’s clear)
  * p(Valentine | Epitaph, weather’s clear)
  * p(Epitaph | weather’s clear)
True because numerators cancel against denominators
Makes perfect sense when read from bottom to top
Moves material to right of bar so it can be ignored
RVEW/W = RVEW/VEW * VEW/EW * EW/W
If this prob is unchanged by backoff, we say Revere was CONDITIONALLY INDEPENDENT of Valentine and Epitaph (conditioned on the weather’s being clear). Often we just ASSUME conditional independence to get the nice product above.

Factoring Left Side: The Chain Rule
p(Revere | Valentine, Epitaph, weather’s clear)
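
One way to see what "unchanged by backoff" asks for is to compare the fully conditioned estimate with the backed-off one for every combination of the dropped conditions. The sketch below is a hypothetical illustration: the count table is invented so that the comparison succeeds, and the function `p_revere` is an assumed helper, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Hypothetical counts of clear-weather races, keyed by
# (epitaph_shows, valentine_places, revere_wins).
COUNTS = {
    (True, True, True): 1,   (True, True, False): 3,
    (True, False, True): 4,  (True, False, False): 12,
    (False, True, True): 2,  (False, True, False): 6,
    (False, False, True): 8, (False, False, False): 24,
}


def p_revere(epitaph=None, valentine=None, counts=COUNTS):
    """Estimate p(Revere wins | conditions, weather's clear).

    A condition left as None has been backed off (dropped from the right of the bar).
    """
    def matches(key):
        e, v, _ = key
        return (epitaph is None or e == epitaph) and (valentine is None or v == valentine)

    total = sum(n for key, n in counts.items() if matches(key))
    wins = sum(n for key, n in counts.items() if matches(key) and key[2])
    return Fraction(wins, total)


backed_off = p_revere()  # p(Revere | weather's clear)

# Conditional independence holds if backing off never changes the estimate,
# whatever values Epitaph and Valentine take.
conditionally_independent = all(
    p_revere(e, v) == backed_off for e, v in product([True, False], repeat=2)
)
```

Real counts would rarely agree exactly like this, which fits the slide's point that conditional independence is often simply assumed.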


Johns Hopkins EN 600 465 - How to Use Probabilities
