UMass Amherst CMPSCI 591N - Probability

Probability
Lecture #7
Computational Linguistics
CMPSCI 591N, Spring 2006
University of Massachusetts Amherst
Andrew McCallum

Today's Main Points
• Remember (or learn) about probability theory
– samples, events, tables, counting
– Bayes' Rule, and its application
– a little calculus?
– random variables
– Bernoulli and Multinomial distributions: the work-horses of Computational Linguistics
– Multinomial distributions from Shakespeare

Probability Theory
• Probability theory deals with predicting how likely it is that something will happen.
– Toss 3 coins: how likely is it that all come up heads?
– See the phrase "more lies ahead": how likely is it that "lies" is a noun?
– See "Nigerian minister of defense" in an email: how likely is it that the email is spam?
– See "Le chien est noir": how likely is it that the correct translation is "The dog is black"?

Probability and CompLing
• Probability is the backbone of modern computational linguistics, because:
– language is ambiguous
– we need to integrate evidence
• Simple example (which we will revisit later):
– I see the first word of a news article: "glacier".
– What is the probability the language is French? English?
– Now I see the second word: "melange".
– Now what are the probabilities?

Experiments and Sample Spaces
• Experiment (or trial)
– a repeatable process by which observations are made
– e.g. tossing 3 coins
• We observe a basic outcome from the sample space, Ω, the set of all possible basic outcomes, e.g.
– one coin toss: sample space Ω = { H, T }; basic outcome = H or T
– three coin tosses: Ω = { HHH, HHT, HTH, …, TTT }
– part-of-speech of a word: Ω = { CC, CD, CT, …, WRB } (36 tags)
– lottery tickets: |Ω| = 10^7
– next word in a Shakespeare play: |Ω| = size of the vocabulary
– number of words in your Ph.D. thesis: Ω = { 0, 1, …, ∞ } (discrete, countably infinite)
– length of time of the "a" sound when I said "sample" (continuous, uncountably infinite)

Events and Event Spaces
• An event, A, is a set of basic outcomes, i.e., a subset of the sample space Ω.
– Intuitively, a question you could ask about an outcome.
– Ω = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }
– e.g. basic outcome = THH
– e.g. event = "has exactly 2 H's", A = { THH, HHT, HTH }
– A = Ω is the certain event; A = ∅ is the impossible event.
– For "not A", we write Ā.
• A common event space, F, is the power set of the sample space Ω (the power set is written 2^Ω).
– Intuitively: all possible questions you could ask about a basic outcome.

Probability
• A probability is a number between 0 and 1.
– 0 indicates impossibility
– 1 indicates certainty
• A probability function, P (or probability distribution), assigns probability mass to events in the event space F.
– P : F → [0, 1]
– P(Ω) = 1
– Countable additivity: for disjoint events Aj in F, P(∪j Aj) = Σj P(Aj)
• We call P(A) "the probability of event A".
• A well-defined probability space consists of
– a sample space Ω
– an event space F
– a probability function P

Probability (more intuitively)
• Repeat an experiment many, many times. (Let T = the number of times.)
• Count the number of basic outcomes that are a member of event A. (Let C = this count.)
• The ratio C/T will approach some unknown but constant value.
• Call this constant "the probability of event A"; write it P(A).
• Why is the probability this ratio of counts? Stay tuned: maximum likelihood estimation comes at the end.
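The C/T ratio is easy to see empirically. The Python sketch below (added for illustration; it is not from the lecture, and the helper names are my own) simulates the three-coin experiment from the counting example that follows and estimates P("exactly 2 heads") as the fraction of trials that land in the event.

    import random

    def toss_three_coins():
        """One trial: toss a fair coin three times, e.g. 'HTH'."""
        return "".join(random.choice("HT") for _ in range(3))

    def estimate_probability(is_in_event, trials=10000):
        """Estimate P(A) as C/T: the fraction of trials whose outcome falls in event A."""
        count = sum(1 for _ in range(trials) if is_in_event(toss_three_coins()))
        return count / trials

    # Event A = "exactly 2 heads"; the estimate should approach 3/8 = 0.375.
    print(estimate_probability(lambda outcome: outcome.count("H") == 2, trials=1000))

With trials=1000 this behaves like the counting example below (an estimate near 0.373); increasing the number of trials pulls the estimate toward the uniform-distribution answer 3/8 = 0.375.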
Example: Counting
• "A coin is tossed 3 times. What is the likelihood of 2 heads?"
– Experiment: toss a coin three times, Ω = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }
– Event: the basic outcome has exactly 2 H's, A = { THH, HTH, HHT }
• Run the experiment 1000 times (3000 coin tosses).
• We counted 373 outcomes with exactly 2 H's.
• Estimated P(A) = 373/1000 = 0.373

Example: Uniform Distribution
• "A fair coin is tossed 3 times. What is the likelihood of 2 heads?"
– Experiment: toss a coin three times, Ω = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }
– Event: the basic outcome has exactly 2 H's, A = { THH, HTH, HHT }
• Assume a uniform distribution over outcomes:
– each basic outcome is equally likely
– P({HHH}) = P({HHT}) = … = P({TTT})
• P(A) = |A| / |Ω| = 3/8 = 0.375

Probability (again)
• A probability is a number between 0 and 1.
– 0 indicates impossibility
– 1 indicates certainty
• A probability function, P (or probability distribution), distributes probability mass of 1 throughout the event space F.
– P : F → [0, 1]
– P(Ω) = 1
– Countable additivity: for disjoint events Aj in F, P(∪j Aj) = Σj P(Aj)
• The above are the axioms of probability theory.
• Immediate consequences:
– P(∅) = 0
– P(Ā) = 1 − P(A)
– A ⊆ B implies P(A) ≤ P(B)
– Σa∈Ω P(a) = 1, where a ranges over basic outcomes.

Vocabulary Summary
• Experiment = a repeatable process
• Sample = a possible outcome
• Sample space = all samples for an experiment
• Event = a set of samples
• Probability distribution = assigns a probability to each sample
• Uniform distribution = all samples are equi-probable

Collaborative Exercise
• You roll a die, then roll it again. What is the probability that you get the same number from both rolls?
• Explain in terms of event spaces and basic outcomes.

Joint and Conditional Probability
• Joint probability of A and B: P(A ∩ B), usually written P(A, B)
• Conditional probability of A given B: P(A|B) = P(A, B) / P(B)
• (Slide figure: a Venn diagram of A, B, and A ∩ B inside Ω.)
• This is the updated probability of an event given some evidence:
– P(A) = prior probability of A
– P(A|B) = posterior probability of A given evidence B

Joint Probability Table
P(precipitation, temperature)

        sun    rain   sleet  snow
10s     0.09   0.00   0.00   0.01
20s     0.08   0.00   0.00   0.02
30s     0.05   0.01   0.01   0.03
40s     0.06   0.03   0.01   0.00
50s     0.06   0.04   0.00   0.00
60s     0.06   0.04   0.00   0.00
70s     0.07   0.03   0.00   0.00
80s     0.07   0.03   0.00   0.00
90s     0.08   0.02   0.00   0.00
100s    0.08   0.02   0.00   0.00

It takes 40 numbers. What does it look like "under the hood"?

Conditional Probability Table

        sun    rain   sleet  snow
10s     0.9    0.0    0.0    0.1
20s     0.8    0.0    0.0    0.2
30s     0.5    0.1    0.1    0.3
40s     0.6    0.3    0.1    0.0
50s     …
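To make the "under the hood" question concrete, here is a small Python sketch (my own illustration, not from the slides) that reads the second table as P(precipitation | temperature) and derives it from the joint table: sum each joint row to get the marginal P(temperature), then divide the row by that marginal. Only the four temperature rows visible above are included.

    # Joint distribution P(precipitation, temperature), copied from the table above
    # (only the four temperature rows shown in the conditional table).
    precip = ["sun", "rain", "sleet", "snow"]
    joint = {
        "10s": [0.09, 0.00, 0.00, 0.01],
        "20s": [0.08, 0.00, 0.00, 0.02],
        "30s": [0.05, 0.01, 0.01, 0.03],
        "40s": [0.06, 0.03, 0.01, 0.00],
    }

    for temp, row in joint.items():
        marginal = sum(row)                        # P(temperature): sum out precipitation
        conditional = [p / marginal for p in row]  # P(precipitation | temperature)
        print(temp, dict(zip(precip, (round(p, 2) for p in conditional))))

Each derived row sums to 1 (e.g. the 10s row comes out as 0.9, 0.0, 0.0, 0.1), whereas the 40 entries of the joint table sum to 1 collectively.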

