Chapter 4
Basic Probability and Probability Distributions

• Probability Terminology
• Obtaining Event Probabilities
• Basic Probability and Rules
• Conditional Probability and Independence
• John Snow London Cholera Death Study
• Bayes’s Rule - Updating Probabilities
• Northern Army at Gettysburg
• Example - OJ Simpson Trial
• Random Variables/Probability Distributions
• Discrete Random Variables
• Example: Wars Begun by Year (1482-1939)
• Masters Golf Tournament 1st Round Scores
• Means and Variances of Random Variables
• Rules for Means
• Example: Masters Golf Tournament
• Variance of a Random Variable
• Examples - Wars & Masters Golf
• Binomial Distribution for Sample Counts
• Binomial Distributions and Sampling
• Example - Diagnostic Test
• Binomial Mean & Standard Deviation
• Continuous Random Variables
• Normal Distribution
• Two Normal Distributions
• Example - Heights of U.S. Adults
• Standard Normal (Z) Distribution
• Finding Probabilities of Specific Ranges
• Example - Adult Female Heights
• Finding Percentiles of a Distribution
• Example - Adult Male Heights
• Assessing Normality and Transformations
• Sampling Distributions
• Sampling Distribution of a Sample Mean
• Central Limit Theorem
• Sample Proportions
• Sampling Distributions for Counts & Proportions
• Sampling Distribution for Y~B(n=1000,p=0.2)
• Using Z-Table for Approximate Probabilities

Probability Terminology
• Classical Interpretation: Notion of probability based on equal likelihood of individual possibilities (coin toss has 1/2 chance of Heads, card draw has 4/52 chance of an Ace).
Origins in games of chance.
  – Outcome: Distinct result of a random process (N = # of outcomes)
  – Event: Collection of outcomes (Ne = # of outcomes in event)
  – Probability of event E: P(event E) = Ne/N
• Relative Frequency Interpretation: If an experiment were conducted repeatedly, the fraction of the time the event of interest would occur (based on empirical observation)
• Subjective Interpretation: Personal view (possibly based on external info) of how likely a one-shot experiment is to end in the event of interest

Obtaining Event Probabilities
• Classical Approach
  – List all N possible outcomes of the experiment
  – List all Ne outcomes corresponding to the event of interest (E)
  – P(event E) = Ne/N
• Relative Frequency Approach
  – Define the event of interest
  – Conduct the experiment repeatedly (often using a computer)
  – Measure the fraction of the time event E occurs
• Subjective Approach
  – Obtain as much information on the process as possible
  – Consider different outcomes and their likelihood
  – When possible, monitor your skill (e.g. stocks, weather)

Basic Probability and Rules
• A, B: Events of interest
• P(A), P(B): Event probabilities
• Union: Event that either A or B occurs (A ∪ B)
• Mutually Exclusive: A, B cannot occur at the same time
  – If A, B are mutually exclusive: P(either A or B) = P(A) + P(B)
• Complement of A: Event that A does not occur (Ā)
  – P(Ā) = 1 - P(A). That is: P(A) + P(Ā) = 1
• Intersection: Event that both A and B occur (A ∩ B or AB)
• P(A ∪ B) = P(A) + P(B) - P(AB)

Conditional Probability and Independence
• Unconditional/Marginal Probability: Frequency with which an event occurs in general (given no additional info). P(A)
• Conditional Probability: Probability an event (A) occurs given knowledge that another event (B) has occurred.
P(A|B)
• Independent Events: Events whose unconditional and conditional (given the other) probabilities are the same

\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(AB)}{P(B)} \]
\[ P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{P(AB)}{P(A)} \]
\[ P(A \cap B) = P(AB) = P(A)\,P(B|A) = P(B)\,P(A|B) \]
\[ A, B \text{ independent} \;\Rightarrow\; P(A) = P(A|B) \text{ and } P(B) = P(B|A) \]

John Snow London Cholera Death Study
• 2 Water Companies (let D be the event of death):
  – Southwark & Vauxhall (S): 264913 customers, 3702 deaths
  – Lambeth (L): 171363 customers, 407 deaths
  – Overall: 436276 customers, 4109 deaths

\[ P(D) = \frac{4109}{436276} = .0094 \quad \text{(94 per 10000 people)} \]
\[ P(D|S) = \frac{3702}{264913} = .0140 \quad \text{(140 per 10000 people)} \]
\[ P(D|L) = \frac{407}{171363} = .0024 \quad \text{(24 per 10000 people)} \]

Note that the probability of death is almost 6 times higher for S&V customers than for Lambeth customers (this was important in showing how cholera spread).

Contingency table with joint probabilities (in body of table) and marginal probabilities (on edge of table):

Water Company   Cholera Death: Yes   Cholera Death: No   Total
S&V             3702 (.0085)         261211 (.5987)      264913 (.6072)
Lambeth         407 (.0009)          170956 (.3919)      171363 (.3928)
Total           4109 (.0094)         432167 (.9906)      436276 (1.0000)

Tree diagram obtaining joint probabilities by the multiplication rule:

Water Company   P(Company)   Death outcome   P(Outcome|Company)   Joint probability
S&V             .6072        D               .0140                .0085
S&V             .6072        D^C             .9860                .5987
Lambeth         .3928        D               .0024                .0009
Lambeth         .3928        D^C             .9976                .3919

Bayes’s Rule - Updating Probabilities
• Let A1,…,Ak be a set of events that partition a sample space (mutually exclusive and exhaustive), such that:
  – each set has known P(Ai) > 0 (each event can occur)
  – for any 2 sets Ai and Aj, P(Ai and Aj) = 0 (events are disjoint)
  – P(A1) + … + P(Ak) = 1 (each outcome belongs to one of the events)
• If C is an event such that
  – 0 < P(C) < 1 (C can occur, but will not necessarily occur)
  – We know the probability that C will occur given each event Ai: P(C|Ai)
• Then we can compute the probability of Ai given that C occurred:

\[ P(A_i|C) = \frac{P(C|A_i)\,P(A_i)}{P(C|A_1)\,P(A_1) + \cdots + P(C|A_k)\,P(A_k)} = \frac{P(A_i \text{ and } C)}{P(C)} \]

Northern Army at Gettysburg

Regiment       Label   Initial #   Casualties   P(Ai)    P(C|Ai)   P(C|Ai)*P(Ai)   P(Ai|C)
I Corps        A1      10022       6059         0.1051   0.6046    0.0635          0.2630
II Corps       A2      12884       4369         0.1351   0.3391    0.0458          0.1896
III Corps      A3      11924       4211         0.1250   0.3532    0.0442          0.1828
V Corps        A4      12509       2187         0.1312   0.1748    0.0229          0.0949
VI Corps       A5      15555       242          0.1631   0.0156    0.0025          0.0105
XI Corps       A6      9839        3801         0.1032   0.3863    0.0399          0.1650
XII Corps      A7      8589        1082         0.0901   0.1260    0.0113          0.0470
Cav Corps      A8      11501       852          0.1206   0.0741    0.0089          0.0370
Arty Reserve   A9      2546        242          0.0267   0.0951    0.0025          0.0105
Sum                    95369       23045        1                  0.2416 = P(C)   1.0002

• Regiments: partition of soldiers (A1,…,A9). Casualty: event C
• P(Ai) = (size of regiment) / (total soldiers) = (Col 3)/95369
• P(C|Ai) = (# casualties) / (regiment size) = (Col 4)/(Col 3)
• P(C|Ai) P(Ai) = P(Ai and C) = (Col 5)*(Col 6)
• P(C) = sum(Col 7)
• P(Ai|C) = P(Ai and C) / P(C) = (Col 7)/.2416

Example - OJ Simpson Trial
• Given Information on Blood Test (T+/T-)
  – Sensitivity: P(T+|Guilty) = 1
  – Specificity: P(T-|Innocent) = .9957, so P(T+|I) = .0043
• Suppose you have a prior belief of guilt: P(G) = p*
• What is the “posterior” probability of guilt after seeing evidence that the blood matches: P(G|T+)?

\[ P(G|T^+) = \frac{P(G \text{ and } T^+)}{P(T^+)} = \frac{P(T^+|G)\,P(G)}{P(T^+|G)\,P(G) + P(T^+|I)\,P(I)} \]
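The Bayes’s rule update used in the Gettysburg and Simpson examples can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function names `bayes_posterior` and `posterior_guilt` and the trial values chosen for the prior p* are assumptions made for the example. The counts and test error rates are the ones quoted above. Because the sketch divides by the unrounded P(C), its Gettysburg posteriors can differ from the table in the fourth decimal place, since the table divides by the rounded value .2416.

```python
def bayes_posterior(priors, likelihoods):
    """Apply Bayes's rule over a partition A_1..A_k.

    priors[i] is P(A_i); likelihoods[i] is P(C | A_i).
    Returns (P(C), [P(A_i | C) for each i]).
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (events must be exhaustive)")
    # Joint probabilities P(A_i and C) by the multiplication rule.
    joint = [p * l for p, l in zip(priors, likelihoods)]
    p_c = sum(joint)  # P(C) is the sum of the joint probabilities
    if p_c == 0:
        raise ValueError("P(C) is zero, so conditioning on C is undefined")
    return p_c, [j / p_c for j in joint]


# Gettysburg: initial strength and casualties per corps, as in the table.
initial = [10022, 12884, 11924, 12509, 15555, 9839, 8589, 11501, 2546]
casualties = [6059, 4369, 4211, 2187, 242, 3801, 1082, 852, 242]
total = sum(initial)
priors = [n / total for n in initial]                       # P(A_i)
likelihoods = [c / n for c, n in zip(casualties, initial)]  # P(C | A_i)
p_casualty, corps_posterior = bayes_posterior(priors, likelihoods)


def posterior_guilt(prior_guilt, sensitivity=1.0, false_positive=0.0043):
    """P(G | T+) for the two-event partition {Guilty, Innocent}."""
    _, post = bayes_posterior(
        [prior_guilt, 1.0 - prior_guilt],
        [sensitivity, false_positive],
    )
    return post[0]


if __name__ == "__main__":
    print(f"P(C) = {p_casualty:.4f}")
    for label, post in zip(range(1, 10), corps_posterior):
        print(f"P(A{label} | C) = {post:.4f}")
    # Hypothetical prior beliefs p*; these values are not from the slides.
    for p_star in (0.001, 0.01, 0.5):
        print(f"p* = {p_star}: P(G | T+) = {posterior_guilt(p_star):.4f}")
```

Trying several values of p* shows how strongly the posterior depends on the prior even when the test itself is very accurate.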