Machine Learning 10-701/15-781, Fall 2006
Tutorial on Basic Probability
Eric Xing (ML, CMU)
Lecture 2, September 14, 2006
Reading: Chap. 1, 2, CB; Chap. 5, 6, TM

What is this?
- Classical AI and ML research ignored this phenomenon.
- The problem (an example): you want to catch a flight at 10:00am from Pitt to SF. Can you make it if you leave at 7am and take a 28X at CMU?
  - partial observability (road state, other drivers' plans, etc.)
  - noisy sensors (radio traffic reports)
  - uncertainty in action outcomes (flat tire, etc.)
  - immense complexity of modeling and predicting traffic
- Reasoning under uncertainty!

Basic Probability Concepts
- A sample space S is the set of all possible outcomes of a conceptual or physical, repeatable experiment. (S can be finite or infinite.)
  - E.g., S may be the set of all possible outcomes of a die roll: $S = \{1, 2, 3, 4, 5, 6\}$
  - E.g., S may be the set of all possible nucleotides of a DNA site: $S = \{A, T, C, G\}$
  - E.g., S may be the set of all possible positions (time-space positions) of an aircraft on a radar screen: $S = [0, R_{\max}] \times [0^\circ, 360^\circ] \times [0, +\infty)$
- An event A is any subset of S.
  - Seeing "1" or "6" in a roll; observing a "G" at a site; UA007 being in space-time interval X
- An event space E is the set of possible worlds in which the outcomes can happen.
  - All die rolls, reading a genome, monitoring the radar signal

Visualizing Probability Space
- A probability space is a sample space in which every outcome $s \in S$ has an assignment $P(s)$ such that
  - $0 \le P(s) \le 1$
  - $\sum_{s \in S} P(s) = 1$
- $P(s)$ is called the probability (or probability mass) of s.
[Figure: Venn diagram. The event space of all possible worlds has area 1. The oval holds the worlds in which A is true; the rest are worlds in which A is false. P(A) is the area of the oval.]

Kolmogorov Axioms
- All probabilities are between 0 and 1: $0 \le P(X) \le 1$
- $P(\text{true}) = 1$: regardless of the event, my outcome is true
- $P(\text{false}) = 0$: no event makes my outcome true
- The probability of a disjunction is given by $P(A \lor B) = P(A) + P(B) - P(A \land B)$
[Figure: Venn diagram of events A and B.]

Why use probability?
- There have been attempts to develop different methodologies for uncertainty:
  - Fuzzy logic
  - Qualitative reasoning (qualitative physics)
- "Probability theory is nothing but common sense reduced to calculation" (Pierre Laplace, 1812)
- In 1931, de Finetti proved that it is irrational to have beliefs that violate these axioms, in the following sense: if you bet in accordance with your beliefs, but your beliefs violate the axioms, then you can be guaranteed to lose money to an opponent whose beliefs more accurately reflect the true state of the world. (Here, "betting" and "money" are proxies for "decision making" and "utilities".)
- What if you refuse to bet? This is like refusing to allow time to pass: every action (including inaction) is a bet.

Random Variable
- A random variable is a function that associates a unique numerical value (a token) with every outcome of an experiment. (The value of the r.v. will vary from trial to trial as the experiment is repeated.)
[Figure: X maps outcomes in S to values.]
- Discrete r.v.:
  - The outcome of a die roll: $X$
  - The outcome of reading a nucleotide (nt) at site i: $X_i$
  - Binary event and indicator variable:
    - Seeing an "A" at a site gives $X = 1$, otherwise $X = 0$.
    - This describes the true or false outcome of a random event.
    - Can we describe richer outcomes in the same way (i.e., $X = 1, 2, 3, 4$ for being A, C, G, T)? Think about what would happen if we take the average of X.
  - Unit-base random vector: $X_i = [X_{iA}, X_{iT}, X_{iG}, X_{iC}]^T$; $X_i = [0, 0, 1, 0]^T$ means seeing a "G" at site i.
- Continuous r.v.:
  - The outcome of recording the true location of an aircraft: $X_{true}$
  - The outcome of observing the measured location of an aircraft: $X_{obs}$

Discrete Prob. Distribution
- In the discrete case, a probability distribution P on S (and hence on the domain of X) is an assignment of a non-negative real number $P(s)$ to each $s \in S$ (or each valid value of x) such that $\sum_{s \in S} P(s) = 1$ (so $0 \le P(s) \le 1$).
  - Intuitively, $P(s)$ corresponds to the frequency (or the likelihood) of getting s in the experiments, if repeated many times.
  - Call $\theta_s = P(s)$ the parameters of a discrete probability distribution.
- A probability distribution on a sample space is sometimes called a probability model, in particular if several different distributions are under consideration.
  - Write models as $M_1$, $M_2$, and probabilities as $P(X \mid M_1)$, $P(X \mid M_2)$.
  - E.g., $M_1$ may be the appropriate prob. dist. if X is from a "fair die", and $M_2$ is for the "loaded die".
  - M is usually a two-tuple of {dist. family, dist. parameters}.

Discrete Distributions
- Bernoulli distribution: Ber(p)
  $$P(x) = \begin{cases} 1 - p & \text{for } x = 0 \\ p & \text{for } x = 1 \end{cases} \quad\Rightarrow\quad P(x) = p^x (1 - p)^{1 - x}$$
- Multinomial distribution: Mult(1, $\theta$)
  - Multinomial (indicator) variable: $X = [X_1, X_2, X_3, X_4, X_5, X_6]^T$, where $X_j \in \{0, 1\}$ and $\sum_{j=1}^{6} X_j = 1$; $X_j = 1$ with probability $\theta_j$, and $\sum_{j=1}^{6} \theta_j = 1$.
  $$p(x(j)) = P(\{X_j = 1, \text{ where } j \text{ indexes the die face}\}) = \theta_j = \theta_A^{x_A} \theta_C^{x_C} \theta_G^{x_G} \theta_T^{x_T} = \prod_k \theta_k^{x_k} = \theta^x$$

Discrete Distributions
- Multinomial distribution: Mult(n, $\theta$)
  - Count variable: $X = [x_1, \ldots, x_K]^T$, where $\sum_j x_j = n$
  $$p(x) = \frac{n!}{x_1! \, x_2! \cdots x_K!} \, \theta_1^{x_1} \theta_2^{x_2} \cdots \theta_K^{x_K} = \frac{n!}{x_1! \, x_2! \cdots x_K!} \, \theta^x$$

Continuous Prob. Distribution
- A continuous random variable X can assume any value in an interval on the real line or in a region in a high-dimensional space.
  - X usually corresponds to a real-valued measurement of some property, e.g., length, position, ...
  - It is not possible to talk about the probability of the random variable assuming a particular value: $P(x) = 0$.
  - Instead, we talk about the probability of the random variable assuming a value within a given interval, or half interval:
    - $P(X \in [x_1, x_2])$, $P(X < x) = P(X \in (-\infty, x))$
  - Arbitrary Boolean combinations of basic propositions

Continuous Prob. Distribution
- The probability of the random variable assuming a value within some given interval …
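The Bernoulli and multinomial mass functions from the Discrete Distributions slides can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names `bernoulli_pmf` and `multinomial_pmf` and the example parameter values are assumptions chosen here.

```python
from math import factorial, prod


def bernoulli_pmf(x, p):
    """P(x) = p^x (1 - p)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return p**x * (1 - p) ** (1 - x)


def multinomial_pmf(counts, theta):
    """p(x) = n! / (x_1! ... x_K!) * prod_k theta_k^(x_k), with n = sum(counts)."""
    if len(counts) != len(theta):
        raise ValueError("counts and theta must have the same length")
    n = sum(counts)
    # Multinomial coefficient, kept as an exact integer at every step.
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)
    return coeff * prod(t**c for t, c in zip(theta, counts))


# Hypothetical parameters for the four nucleotides A, C, G, T.
theta = [0.1, 0.2, 0.3, 0.4]

# Mult(1, theta): a one-hot indicator vector picks out a single theta_j.
p_G = multinomial_pmf([0, 0, 1, 0], theta)

# Mult(n, theta) with n = 2: one A and one T.
p_AT = multinomial_pmf([1, 0, 0, 1], theta)
```

With a one-hot count vector the multinomial coefficient is 1, so the function reduces to the indicator form $\prod_k \theta_k^{x_k}$ on the Mult(1, $\theta$) slide.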
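The disjunction rule from the Kolmogorov Axioms slide, $P(A \lor B) = P(A) + P(B) - P(A \land B)$, can be checked on the die-roll sample space. The sketch below assumes a fair die and uses exact fractions; the event B (an even roll) and the helper `prob` are hypothetical additions for illustration, while A is the "1 or 6" event from the slides.

```python
from fractions import Fraction

# Sample space of a die roll; a fair die is an assumption made here.
S = frozenset(range(1, 7))
P = {s: Fraction(1, 6) for s in S}


def prob(event):
    """Probability of an event (a subset of S) as the sum of its outcome masses."""
    return sum((P[s] for s in event), Fraction(0))


A = {1, 6}      # seeing "1" or "6" in a roll
B = {2, 4, 6}   # hypothetical second event: an even roll

lhs = prob(A | B)                       # P(A or B)
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) - P(A and B)

total = prob(S)      # P(true): the whole sample space
empty = prob(set())  # P(false): the empty event
```

Using `Fraction` rather than floats keeps the comparison exact, so both sides of the disjunction rule can be compared with `==`.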