Chapter 1
Probability

1.1 Getting Started

These notes will explore how the discipline of Statistics helps scientists learn about the world. There are two main areas in which Statistics helps:

Validity: Proper use of Statistics helps a scientist learn things that are true; improper use can lead to a scientist learning things that are false.

Efficiency: Proper use of Statistics can help a scientist learn faster, or with less effort, or at a lower cost.

The first tool you need to become a good statistician is a cursory understanding of probability. I begin with one of my favorite quotes from a favorite source:

    Predictions are tough, especially about the future.
    Yogi Berra

Probability theory is used by mathematicians, scientists, and statisticians to quantify uncertainty about the future.

We begin with the notion of a chance mechanism. This is a two-word technical expression. It is very important that we use technical expressions exactly as they are defined. In everyday life you may have several meanings for some of your favorite words (for example, phat), but in this class technical expressions have a unique meaning. In these notes, the first occurrence of a technical expression/term will be in bold-faced type.

Both words in chance mechanism (CM) are meaningful. The second word reminds us that the CM, when operated, produces an outcome. The first word reminds us that the outcome cannot be predicted with certainty. Several examples will help.

1. CM: A coin is tossed. Outcome: The face that lands up, either heads or tails.
2. CM: A (six-sided) die is cast. Outcome: The face that lands up, either 1, 2, 3, 4, 5, or 6.
3. CM: A man with AB blood and a woman with AB blood have a child. Outcome: The blood type of the child, either A, B, or AB.
4. CM: The next NFL Super Bowl game. Outcome: The winner of the game, which could be any one of the 32 NFL teams.

The next idea is the sample space, usually denoted by S. The sample space is the collection of all possible outcomes of the CM. Below are the sample spaces for each CM listed above.

1. CM: Coin. S = {H, T}.
2. CM: Die. S = {1, 2, 3, 4, 5, 6}.
3. CM: Blood. S = {A, B, AB}.
4. CM: Super Bowl. S = a list of the 32 NFL teams.

An event is a collection of outcomes; that is, it is a subset of the sample space. Events are typically denoted by upper-case letters, usually from the beginning of the alphabet. Below are some events for each CM listed above.

1. CM: Coin. A = {H}, B = {T}.
2. CM: Die. A = {5, 6}, B = {1, 3, 5}.
3. CM: Blood. C = {A, B}.
4. CM: Super Bowl. A = {Vikings, Packers, Bears, Lions}.

Sometimes it is convenient to describe an event with words. For the die, event A can be described as "the outcome is larger than 4," and event B can be described as "the outcome is an odd integer." For the Super Bowl, event A can be described as "the winner is from the NFC North Division."

Here is where I am going with this. Before a CM is operated, nobody knows what the outcome will be. In particular, for any event A that is not the entire sample space, we don't know whether the outcome will be a member of A. After the CM is operated, we can determine/see whether the actual outcome is a member of an event A: if it is, we say that the event A has occurred; if not, we say that the event A has not occurred. Below are some examples for our CMs above.

1. CM: Coin. If the coin lands heads, then event A has occurred and event B has not occurred.
2. CM: Die. If the die lands 5, both A and B have occurred. If the die lands 1 or 3, B has occurred, but A has not. If the die lands 6, A has occurred, but B has not. Finally, if the die lands 2 or 4, neither A nor B has occurred.
3. CM: Blood. If the child has AB blood, then the event C has not occurred.
4. CM: Super Bowl. If the Packers win the Super Bowl, then the event A has occurred.

Before the CM is operated, the probability of the event A, denoted by P(A), is a number that measures the likelihood that A will occur. This incredibly vague statement raises three questions that we will answer:

1. How are probabilities assigned to events?
2. What are the rules that these assignments must obey?
3. If I say, for example, that P(A) = 0.25, what does this mean?

First, the assignment of probabilities to events always is based on assumptions about the operation of the world. As such, it is a scientific, not a mathematical, exercise. There are always assumptions, whether they are expressed or tacit, implicit or explicit. My advice is to always do your best to be aware of any assumptions you make. (This is, I believe, good advice for outside the classroom too.)

The most popular assumption for a CM is the assumption of the equally likely case (ELC). As the name suggests, in the ELC we assume that each possible outcome is equally likely to occur. Another way to say this is that it is impossible to find two outcomes such that one outcome is more likely to occur than the other. I will discuss the ELC for each CM we have considered in this section.

1. CM: Coin. If I select an ordinary coin from my pocket and plan to toss it, I would assume that the two outcomes, heads and tails, are equally likely to occur. This seems to be a popular assumption in our culture because tossing a coin is often used as a way to decide which of two persons/teams is allowed to make a choice. For example, football games typically begin with a coin toss, and the winner gets to make a choice involving direction of attack or initial possession of the ball.

Note, however, that I would not make this assumption without thinking about it. In particular, the path of a coin is governed by the laws of physics, and presumably, if I could always apply exactly the same forces to the coin, it would always land the same way. I am an extremely minor acquaintance of a famous person named Persi Diaconis. Persi has been a tenured professor at Stanford, Harvard, Cornell, and Stanford again, and he was a recipient of a MacArthur Foundation no-strings-attached "genius" fellowship a number of years ago. More relevant for this discussion is that, while a teenager, Persi worked as a small-acts magician. Thus, it is no surprise to learn that Persi has unusually good control of his hands and reportedly can make heads much more likely than tails when he tosses a coin. My willingness to assume that heads and tails are equally likely when I toss a coin reflects …
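As a small aside, the sample spaces, events, and the "has occurred" check from this section can be sketched in Python, with sets standing in for S and for events. This is only an illustration, not part of the notes' own development. The names has_occurred and elc_probability are hypothetical, and the counting rule P(A) = (number of outcomes in A) / (number of outcomes in S) is the usual consequence of the ELC, which this excerpt has not yet stated. The rule is applied only to the die, under the assumption that the die is fair.

```python
from fractions import Fraction

# Sample space and events for the die CM, copied from the notes.
DIE_S = {1, 2, 3, 4, 5, 6}
DIE_A = {5, 6}        # "the outcome is larger than 4"
DIE_B = {1, 3, 5}     # "the outcome is an odd integer"

# Sample space and event for the blood CM, copied from the notes.
BLOOD_S = {"A", "B", "AB"}
BLOOD_C = {"A", "B"}


def has_occurred(event, outcome):
    """Return True if the observed outcome is a member of the event."""
    return outcome in event


def elc_probability(event, sample_space):
    """Probability of an event ASSUMING the equally likely case (ELC).

    Hypothetical helper: it counts outcomes, so it is only meaningful
    when the ELC is a justified assumption for the chance mechanism.
    """
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


# The die lands 5: both A and B have occurred.
print(has_occurred(DIE_A, 5), has_occurred(DIE_B, 5))    # True True
# The die lands 2: neither A nor B has occurred.
print(has_occurred(DIE_A, 2), has_occurred(DIE_B, 2))    # False False
# The child has AB blood: C has not occurred.
print(has_occurred(BLOOD_C, "AB"))                       # False
# Under the ELC (a fair die is assumed here):
print(elc_probability(DIE_A, DIE_S))                     # 1/3
print(elc_probability(DIE_B, DIE_S))                     # 1/2
```

Exact fractions are used rather than floating-point numbers so that the results read the way they would in hand calculation. The helper deliberately refuses an "event" that is not a subset of the sample space, mirroring the definition of an event given earlier.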