15-251 Great Theoretical Ideas in Computer Science
Notes on Probability

"One should always be a little improbable."
Oscar Wilde

"I am giddy; expectation whirls me round. Th' imaginary relish is so sweet that it enchants my sense."
William Shakespeare, Troilus and Cressida, act 3, scene 2

1 Probability Spaces

In the dice and coin flipping examples we've looked at, the probabilistic elements are fairly intuitive. Probability is formalized in terms of sample spaces, events, and probability distributions. There is also a very powerful but elementary calculus for probability that allows us to compute and approximate event probabilities.

A probabilistic model, or probability space, is comprised of:

1. A sample space $\Omega$. The sample space is thought of as the set of possible outcomes $\omega \in \Omega$ of an experiment, appropriate for the problem being modeled.

2. A probability distribution $P : 2^{\Omega} \to [0, 1]$ is a function which assigns to every set $A \subseteq \Omega$ of possible outcomes a probability $P(A)$, which is a number between zero and one.

Subsets of $\Omega$ are called events. It is often quite helpful to think of $\Omega$ as the outcomes of some experiment, which can often be thought of as a physical process. In many cases the experiment is artificial, and can simply be thought of as the roll of a many-sided die. In other cases the physical process is natural to the problem, for example in the case of arrivals to a web server. The possible outcomes of the experiment must be chosen to be non-overlapping, that is, mutually exclusive. On the other hand, events can be non-exclusive, since they are arbitrary subsets of possible outcomes.

1.1 The Axioms of Probability

A probability distribution $P$ for a sample space $\Omega$ is required to satisfy the following axioms:

1. $P(A) \ge 0$ for every event $A$.

2. If $A$ and $B$ are events with $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$. More generally, if $\{A_i\}_{i=1}^{n}$ is a finite sequence of disjoint events, then
\[ P\Big( \bigcup_{i=1}^{n} A_i \Big) = \sum_{i=1}^{n} P(A_i). \]

3. The probability of the sample space $\Omega$ is one: $P(\Omega) = 1$.

The most common, important, and intuitive way of constructing probability models is when the sample
space $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ is finite. In this case a probability distribution is simply a set of numbers $p_i = P(\{\omega_i\})$ satisfying $0 \le p_i \le 1$ and $\sum_{i=1}^{n} p_i = 1$. The probability of an event is obtained by simply adding up the probabilities of the outcomes contained in that event; thus
\[ P(\{a_1, a_2, \ldots, a_m\}) = \sum_{j=1}^{m} P(\{a_j\}). \qquad (1) \]
Such a distribution is easily seen to satisfy the axioms.

Example 1.1 (Bernoulli trials). For a sample space of size two we can set $\Omega = \{H, T\}$, where the experimental outcomes correspond to the flip of a coin coming up heads or tails. There are only four events: $A = \emptyset$, $\{H\}$, $\{T\}$, $\{H, T\}$. The probability distribution is determined by a single number, the probability of heads $P(\{H\})$. This is because the third axiom, together with additivity, requires that $P(\{H\}) + P(\{T\}) = 1$. The flip of a coin is often referred to as a Bernoulli trial in probability, after Jacob Bernoulli, who studied some of the foundations of probability theory between 1685 and 1689.

Example 1.2 (Multinomial probabilities). This is the crucial example, where the experiment is a single roll of an $n$-sided die.$^1$ It includes the case of Bernoulli trials as a special case. Consider the case of rolling a standard die. In this case $\Omega = \{1, \ldots, 6\}$, where $i$ corresponds to $i$ dots showing on the top of the die. There are $2^6 = 64$ possible events, one being, for example, the event "the number of dots is even." A fair die corresponds to the distribution $P(\{i\}) = \frac{1}{6}$ for each $i = 1, \ldots, 6$.

$^1$ While such an experiment may be difficult to execute physically, because of the difficulty of constructing such a die for large $n$, it is conceptually useful to think in these terms.

2 Random Variables

A random variable is a numerical value associated with each experimental outcome. Thus an r.v. can be thought of as a mapping $X : \Omega \to \mathbb{R}$ from the sample space to the real line. If you like, it can be thought of as a measurement associated with an experiment. Thus random variables are really random functions, but what's random is the input to the function.

Example 2.1 (Max of a roll). Consider rolling a pair of distinct dice, so the sample space is $\Omega = \{(i, j) : 1 \le i, j \le 6\}$. Let $X(i, j) = \max\{i, j\}$.

If
$X$ is a random variable and $S \subseteq \mathbb{R}$, define
\[ P(X \in S) = P(\{\omega \in \Omega : X(\omega) \in S\}) = P(X^{-1}(S)), \]
where $X^{-1}(S) = \{\omega \in \Omega : X(\omega) \in S\}$.

In particular, the probability mass function (pmf) of a random variable $X$ is
\[ P_X(x) = P(X = x). \]
Some of its basic properties are:

- A discrete (finite) r.v. takes on only a discrete (finite) set of values.

- The probability of a set $S$ is given by
\[ P(X \in S) = \sum_{x \in S} P(X = x), \]
since $X^{-1}(S) = \bigcup_{x \in S} X^{-1}(x)$ is a union of disjoint events.

- The total probability is one:
\[ \sum_{x} P(X = x) = 1, \]
by the normalization axiom (3).

So we can picture a probability mass function as a bar graph, with a bar over each of the possible values of the random variable, where the sum of the heights of the bars is one. For a given random variable $X$ we can just work with the pmf $P(X = x)$. But doing so hides the sample space. We need to remember that the sample space is really there in the background.

Example 2.2 (Tetrahedral die). Consider the roll of a pair of distinguishable tetrahedral dice. The sample space is $\Omega = \{(i, j) : 1 \le i, j \le 4\}$. Let $X$ be the random variable $X(i, j) = \max\{i, j\}$. Then
\[ P_X(1) = \tfrac{1}{16}, \quad P_X(2) = \tfrac{3}{16}, \quad P_X(3) = \tfrac{5}{16}, \quad P_X(4) = \tfrac{7}{16}. \]

3 Some Important RVs

One of the most central random variables is the Bernoulli.

Example 3.1 (Bernoulli). The Bernoulli random variable is simply the indicator function for a coin flip. It is defined by
\[ X(T) = 0, \qquad X(H) = 1 \]
for the sample space $\Omega = \{H, T\}$ with probability distribution $P(\{H\}) = 1 - P(\{T\}) = p$. Thus the probability mass function of the random variable is $P_X(1) = 1 - P_X(0) = p$.

A sum of Bernoullis is called a binomial random variable. Here the sample space $\Omega$ is the collection of all sequences such as
\[ \underbrace{HHT \cdots HT}_{n \text{ times}}, \]
that is, the set of outcomes of $n$ flips of a biased coin, assuming the flips are independent. Let $X$ be the random variable $X(\omega) = $ number of heads in the sequence $\omega$. The pmf of a binomial random variable for $n$ flips of a coin is
\[ P_X(k) = \binom{n}{k} p^k (1 - p)^{n - k}, \]
where $p$ is the probability of heads. Note that by the binomial theorem,
\[ \sum_{k=0}^{n} P_X(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n - k} = (p + 1 - p)^n = 1 \]
…
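The pmf calculations in Example 2.2 and in the binomial case can be checked by brute-force enumeration of the sample space. The following Python sketch is not part of the original notes. The helper names `pmf_of`, `binomial_pmf`, and `binomial_pmf_by_enumeration` are hypothetical, chosen only for illustration, and exact rational arithmetic is used so that the comparisons involve no rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb


def pmf_of(X, outcomes, prob):
    """Compute P_X(x) = P(X = x) by summing prob(omega) over each outcome.

    X        -- the random variable, a function from outcomes to numbers
    outcomes -- an iterable over the finite sample space
    prob     -- function giving the probability of a single outcome
    (All three names are illustrative assumptions, not from the notes.)
    """
    pmf = {}
    for omega in outcomes:
        x = X(omega)
        pmf[x] = pmf.get(x, Fraction(0)) + prob(omega)
    return pmf


# Example 2.2: two distinguishable fair tetrahedral dice, X(i, j) = max(i, j).
dice_outcomes = list(product(range(1, 5), repeat=2))  # 16 ordered pairs (i, j)
max_pmf = pmf_of(max, dice_outcomes, lambda omega: Fraction(1, 16))


def binomial_pmf(n, p):
    """Closed form: P_X(k) = C(n, k) p^k (1 - p)^(n - k) for k = 0..n."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def binomial_pmf_by_enumeration(n, p):
    """Same pmf, obtained by enumerating all 2^n sequences of H and T."""

    def prob(omega):
        # Independent flips: multiply p for each head and (1 - p) for each tail.
        heads = omega.count("H")
        return p**heads * (1 - p) ** (n - heads)

    return pmf_of(lambda omega: omega.count("H"), product("HT", repeat=n), prob)


if __name__ == "__main__":
    print(max_pmf)
    p = Fraction(1, 3)  # an arbitrary bias, chosen only for this demonstration
    print(binomial_pmf(5, p) == binomial_pmf_by_enumeration(5, p))
    print(sum(binomial_pmf(5, p).values()))
```

Enumeration makes the "hidden" sample space explicit: the pmf is obtained by pushing the distribution on $\Omega$ through $X$, which is exactly the definition $P_X(x) = P(X^{-1}(x))$. It is only practical for small $n$, since the binomial sample space has $2^n$ outcomes.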