PROBABILITY

Probability = proportion of outcomes = percentage of data, and is represented by area under a probability histogram or area under a density curve.

The probability that a certain outcome will occur
= the proportion of times that the outcome is known or expected to occur
= the percentage of the time that the outcome is known or expected to occur.

A complete set of proportions always adds up to 1; therefore a complete set of probabilities always adds up to 1, just as a complete set of percentages always adds up to 100.

Probability of 0: the outcome will never occur.
Probability of 1: the outcome will definitely occur.

Population and sample, universal set and subset, sample space and event: in each pair the second is a part of the first.

A probability distribution shows a list of outcomes along with the probability for each outcome. It can be laid out in a probability distribution table, a Venn diagram, a tree diagram, a probability histogram, or a density curve. The total area under a probability histogram or a density curve is always 1.

Sample space: the set of all possible outcomes, denoted S. P(S) = 1.

Event: a subset of the sample space. A = {...} is read "A is the event that ...".

P(A) = probability of event A.
P(AB) = probability of A and B = probability of event A and event B both happening.
P(B | A) = probability of B given A = probability of event B given that event A has occurred.

Equally likely outcomes: outcomes that have the same probability of occurrence. When outcomes are equally likely,
P(A) = (number of outcomes in A) / (total number of outcomes).

Independent events: events that do not alter each other's probability of occurrence. A and B are independent if and only if P(A | B) = P(A) and P(B | A) = P(B). Drawing with replacement produces independent events, while drawing without replacement produces dependent events.

Disjoint (mutually exclusive) events: events that have no outcomes in common, that is, events that cannot occur together. Disjoint events are always dependent (provided each has nonzero probability), because once one occurs the probability of the other drops to 0. Dependent events are able to affect each other's probability. Example: having Statistics class and English class at the same time.

© 2010 Radha Bose,
Florida State University, Department of Statistics

Complement of event A: the set of all outcomes that do not belong to A (not in the original event; every other outcome), denoted A^C (also written A′ or Ā). An event and its complement are always disjoint. The categories of a categorical variable are always disjoint.
event + complement = sample space

Random process: a series of independent trials (trials can consist of events or selections) where the outcome of each trial is unpredictable but, in the long run, a pattern of outcomes emerges; that is, a distribution emerges. That distribution gives us a probability model that can be used in data analysis.

Example: Tossing a coin repeatedly and recording which side shows up. Each toss is a trial, and the outcome of any given toss does not affect the outcome of any other toss (the trials are independent). Before each toss (before each trial takes place) we cannot predict which side of the coin will show up (we cannot predict what the outcome of the trial will be). But in the long run, after very many tosses (trials) have occurred, we may notice a pattern emerging: Heads show up a certain percent of the time while Tails show up a certain percent of the time (each possible outcome occurs a certain percent of the time). These percentages allow us to create a probability distribution for the coin toss (a probability distribution for the random process), in which a probability is assigned to each side of the coin (to each possible outcome).

Example: Repeatedly selecting someone at random and asking them which candidate they voted for. Each selection is a trial, and because we are randomly selecting persons we may assume that the vote of any given person does not affect the vote of any other person (the trials are independent). Before each person is selected, that is, before each trial takes place, we cannot predict which candidate they voted for; we cannot predict what
the outcome of the trial will be. But in the long run, after very many persons have been polled (after very many trials have occurred), we may notice a pattern emerging: a certain percent of the people voted for Candidate A, a certain percent voted for Candidate B, etc. (each possible outcome occurs a certain percent of the time). These percentages allow us to create a probability distribution for the candidates (a probability distribution for the random process), in which a probability is assigned to each candidate (to each possible outcome).

Equally likely outcomes: the bedrock of probability

R = {a randomly selected student in the class likes red}

If we consider that the students in the class are equally likely to be selected, then
P(R) = (number of students who like red) / (total number of students) = 4/19

Sample space S, one letter per student (R marks a student who likes red):
B R B R B B B B G B R G B G B G B R B

Classifying events as disjoint or not, and independent or dependent

1. If A happens, does it prevent B from happening?
   Yes: DISJOINT, therefore DEPENDENT.
   No: NOT DISJOINT; go to question 2.
2. If A happens, does it raise or lower the probability of B happening?
   No: INDEPENDENT.
   Yes: DEPENDENT.

Examples: Randomly select a person and let events A and B be as described below.
A = {they are taller than 5 ft}, B = {they are shorter than 4 ft}: disjoint, dependent.
A = {they are taller than 5 ft}, B = {they are taller than 4 ft}: not disjoint, dependent.
A = {they are taller than 5 ft}, B = {they are an FSU student}: not disjoint, independent.

A note on drawing without replacement

Replacement: putting back in the pool.

(i) When sampling without replacement from populations of known size, it is generally safe to assume independence if the sample size is less than 10% of the population size, because the probabilities do not change significantly. However, we use independence methods of calculation only if the
population is very large.

(ii) When sampling without replacement from populations of unknown size, it is generally safe to assume independence if random sampling is employed. Also, in general, if the population size is unknown then it is usually large enough that the sample size would be less than 10% of the population size.

There are other factors that affect independence apart from replacement, but there is no easy way of generalizing them. Each situation needs to be carefully considered and a decision made as to
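
The 10% guideline in the note on drawing without replacement can be checked numerically. The sketch below is only an illustration under assumed numbers: the population sizes, the 40% share of "successes", and the sample of 5 draws are hypothetical and do not come from the notes. It compares the exact probability that all 5 draws are successes when drawing without replacement against the shortcut that treats the draws as independent.

```python
from fractions import Fraction


def p_all_success_without_replacement(successes, population, draws):
    """Exact probability that every draw is a success, drawing without replacement."""
    p = Fraction(1)
    for k in range(draws):
        # After k successes have been removed, both counts shrink by k.
        p *= Fraction(successes - k, population - k)
    return p


def p_all_success_independent(successes, population, draws):
    """Same probability under the independence (with-replacement) shortcut."""
    return Fraction(successes, population) ** draws


# Hypothetical populations, each with 40% successes, sampled with 5 draws.
DRAWS = 5
for population in (10, 50, 1000, 100000):
    successes = population * 2 // 5
    exact = p_all_success_without_replacement(successes, population, DRAWS)
    approx = p_all_success_independent(successes, population, DRAWS)
    print(f"N={population:>6}  sample/N={DRAWS / population:.3%}  "
          f"exact={float(exact):.5f}  independent={float(approx):.5f}")
```

With these assumed numbers the two columns disagree clearly when the sample is half or a tenth of the population, and they agree closely once the sample is a small fraction of it, which is the behaviour the guideline describes.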
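
The equally-likely-outcomes formula and the two-question classification of events from earlier in these notes can also be sketched in code. The class sample space is the one given in the notes; the die and its events are hypothetical additions for illustration, and the helper names `prob` and `classify` are made up here. The sketch assumes that both events have nonzero probability.

```python
from fractions import Fraction

# Class sample space from the notes: one letter per student.
S = "B R B R B B B B G B R G B G B G B R B".split()


def prob(event, space):
    """P(event) for equally likely outcomes: outcomes in the event / total outcomes."""
    return Fraction(sum(1 for outcome in space if outcome in event), len(space))


p_red = prob({"R"}, S)
print(p_red)  # 4/19


def classify(a, b, space):
    """Apply the two questions: disjoint or not, then independent or dependent."""
    p_a, p_b = prob(a, space), prob(b, space)
    if p_a == 0 or p_b == 0:
        raise ValueError("this sketch assumes both events have nonzero probability")
    if a.isdisjoint(b):
        return "disjoint, dependent"
    # P(B | A) = P(AB) / P(A); independence means this equals P(B).
    p_b_given_a = prob(a & b, space) / p_a
    if p_b_given_a == p_b:
        return "not disjoint, independent"
    return "not disjoint, dependent"


# Hypothetical example: one roll of a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]
print(classify({2, 4, 6}, {1, 3, 5}, die))     # even vs odd
print(classify({2, 4, 6}, {1, 2, 3, 4}, die))  # even vs at most 4
print(classify({2, 4, 6}, {4, 5, 6}, die))     # even vs at least 4
```

Exact fractions are used instead of floating-point numbers so that the comparison P(B | A) = P(B) is not affected by rounding.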