DOC PREVIEW
CMU CS 10601 - lecture

This preview shows page 1-2-3-27-28-29 out of 29 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 29 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1 Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University September 20, 2011 Today: • Probability • Bayes Rule • Estimating parameters • maximum likelihood • max a posteriori Readings: Probability review • Bishop Ch. 1 thru 1.2.3 • Bishop, Ch. 2 thru 2.2 • Andrew Moore’s online tutorial many of these slides are derived from William Cohen, Andrew Moore, Aarti Singh, Eric Xing, Carlos Guestrin. - Thanks! Probability Overview • Events – discrete random variables, continuous random variables, compound events • Axioms of probability – What defines a reasonable theory of uncertainty • Independent events • Conditional probabilities • Bayes rule and beliefs • Joint probability distribution • Expectations • Independence, Conditional independence2 Random Variables • Informally, A is a random variable if – A denotes something about which we are uncertain – perhaps the outcome of a randomized experiment • Examples A = True if a randomly drawn person from our class is female A = The hometown of a randomly drawn person from our class A = True if two randomly drawn persons from our class have same birthday • Define P(A) as “the fraction of possible worlds in which A is true” or “the fraction of times A holds, in repeated runs of the random experiment” – the set of possible worlds is called the sample space, S – A random variable A is a function defined over S A: S à {0,1} A little formalism More formally, we have • a sample space S (e.g., set of students in our class) – aka the set of possible worlds • a random variable is a function defined over the sample space – Gender: S à { m, f } – Height: S à Reals • an event is a subset of S – e.g., the subset of S for which Gender=f – e.g., the subset of S for which (Gender=m) AND (eyeColor=blue) • we’re often interested in probabilities of specific events • and of specific events conditioned on other specific events3 Visualizing A Sample space of all possible worlds Its area is 1 Worlds in which A is False Worlds in which A is true P(A) = Area of reddish oval The Axioms of Probability • 0 <= P(A) <= 1 • P(True) = 1 • P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) [di Finetti 1931]: when gambling based on “uncertainty formalism A” you can be exploited by an opponent iff your uncertainty formalism A violates these axioms4 Interpreting the axioms • 0 <= P(A) <= 1 • P(True) = 1 • P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) The area of A canʼt get any smaller than 0 And a zero area would mean no world could ever have A true Interpreting the axioms • 0 <= P(A) <= 1 • P(True) = 1 • P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) The area of A canʼt get any bigger than 1 And an area of 1 would mean all worlds will have A true5 Interpreting the axioms • 0 <= P(A) <= 1 • P(True) = 1 • P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) Theorems from the Axioms • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) è P(not A) = P(~A) = 1-P(A)6 Theorems from the Axioms • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0 • P(A or B) = P(A) + P(B) - P(A and B) è P(not A) = P(~A) = 1-P(A) P(A or ~A) = 1 P(A and ~A) = 0 P(A or ~A) = P(A) + P(~A) - P(A and ~A) 1 = P(A) + P(~A) + 0 Elementary Probability in Pictures • P(~A) + P(A) = 1 A ~A7 Another useful theorem • 0 <= P(A) <= 1, P(True) = 1, P(False) = 0, P(A or B) = P(A) + P(B) - P(A and B) è P(A) = P(A ^ B) + P(A ^ ~B) A = [A and (B or ~B)] = [(A and B) or (A and ~B)] P(A) = P(A and B) + P(A and ~B) – P((A and B) and (A and ~B)) P(A) = P(A and B) + P(A and ~B) – P(A and B and A and ~B) Elementary Probability in Pictures • P(A) = P(A ^ B) + P(A ^ ~B) B A ^ ~B A ^ B8 Multivalued Discrete Random Variables • Suppose A can take on more than 2 values • A is a random variable with arity k if it can take on exactly one value out of {v1,v2, ... vk} • Thus… jivAvAPji≠==∧= if 0)(! P(A = v1" A = v2"..." A = vk) = 1Elementary Probability in Pictures 1)(1==∑=kjjvAPA=1 A=2 A=3 A=4 A=59 Definition of Conditional Probability P(A ^ B) P(A|B) = ----------- P(B) Corollary: The Chain Rule P(A ^ B) = P(A|B) P(B) Conditional Probability in Pictures A=1 A=2 A=3 A=4 A=5 picture: P(B|A=2)10 Independent Events • Definition: two events A and B are independent if Pr(A and B)=Pr(A)*Pr(B) • Intuition: knowing A tells us nothing about the value of B (and vice versa) Visualizing Probabilities Sample space of all possible worlds Its area is 1 B A A ^ B11 Definition of Conditional Probability P(A ^ B) P(A|B) = ----------- P(B) B A Definition of Conditional Probability P(A ^ B) P(A|B) = ----------- P(B) Corollary: The Chain Rule P(A ^ B) = P(A|B) P(B) P(C ^ A ^ B) = P(C|A ^ B) P(A|B) P(B)12 Independent Events • Definition: two events A and B are independent if P(A ^ B)=P(A)*P(B) • Intuition: knowing A tells us nothing about the value of B (and vice versa) Bayes Rule • let’s write 2 expressions for P(A ^ B) B A A ^ B13 P(B|A) * P(A) P(B) P(A|B) = Bayes, Thomas (1763) An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53:370-418 …by no means merely a curious speculation in the doctrine of chances, but necessary to be solved in order to a sure foundation for all our reasonings concerning past facts, and what is likely to be hereafter…. necessary to be considered by any that would give a clear account of the strength of analogical or inductive reasoning… Bayes’ rule we call P(A) the “prior” and P(A|B) the “posterior” Other Forms of Bayes Rule )(~)|~()()|()()|()|(APABPAPABPAPABPBAP+=)()()|()|(XBPXAPXABPXBAP∧∧∧=∧14 Applying Bayes Rule € P(A |B) =P(B | A)P(A)P(B | A)P(A) + P(B |~ A)P(~ A)A = you have the flu, B = you just coughed Assume: P(A) = 0.05 P(B|A) = 0.80 P(B| ~A) = 0.2 what is P(flu | cough) = P(A|B)? what does all this have to do with function approximation?15 The Joint Distribution


View Full Document

CMU CS 10601 - lecture

Documents in this Course
lecture

lecture

40 pages

Problem

Problem

12 pages

lecture

lecture

36 pages

Lecture

Lecture

31 pages

Review

Review

32 pages

Lecture

Lecture

11 pages

Lecture

Lecture

18 pages

Notes

Notes

10 pages

Boosting

Boosting

21 pages

review

review

21 pages

review

review

28 pages

Lecture

Lecture

31 pages

lecture

lecture

52 pages

Review

Review

26 pages

review

review

29 pages

Lecture

Lecture

37 pages

Lecture

Lecture

35 pages

Boosting

Boosting

17 pages

Review

Review

35 pages

lecture

lecture

32 pages

Lecture

Lecture

28 pages

Lecture

Lecture

30 pages

lecture

lecture

29 pages

leecture

leecture

41 pages

lecture

lecture

34 pages

review

review

38 pages

review

review

31 pages

Lecture

Lecture

41 pages

Lecture

Lecture

15 pages

Lecture

Lecture

21 pages

Lecture

Lecture

38 pages

Notes

Notes

37 pages

Load more
Download lecture
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view lecture and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view lecture 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?