Chapter 13

Two Dichotomous Variables

13.1 Populations and Sampling

A population model for two dichotomous variables can arise for a collection of individuals (a finite population) or as a mathematical model for a process that generates two dichotomous variables per trial. Here are two examples.

1. Consider the population of students at a small college. The two variables are sex, with possible values female and male, and the answer to the following question, with possible values yes and no: Do you usually wear corrective lenses when you attend lectures?

2. Recall the homework data on Larry Bird shooting pairs of free throws. If we view each pair as a trial, then the two variables are the outcome of the first shot and the outcome of the second shot.

We begin with terminology and notation. With two responses per subject/trial, it sometimes will be too confusing to speak of successes and failures. Instead, we proceed as follows. The first variable has possible values $A$ and $A^c$. The second variable has possible values $B$ and $B^c$.

In the above example of a finite population, $A$ could denote female, $A^c$ could denote male, $B$ could denote the answer yes, and $B^c$ could denote the answer no. In the above example of trials, $A$ could denote that the first shot is made, $A^c$ could denote that the first shot is missed, $B$ could denote that the second shot is made, and $B^c$ could denote that the second shot is missed.

It is important that we now consider finite populations and trials separately. We begin with finite populations.

13.1.1 Finite Populations

Table 13.1: The Table of Population Counts.

            $B$           $B^c$           Total
$A$         $N_{AB}$      $N_{AB^c}$      $N_A$
$A^c$       $N_{A^cB}$    $N_{A^cB^c}$    $N_{A^c}$
Total       $N_B$         $N_{B^c}$       $N$

Table 13.2: Hypothetical Population Counts for Study of Sex and Corrective Lenses.

                  Yes ($B$)   No ($B^c$)   Total
Female ($A$)      360         240          600
Male ($A^c$)      140         260          400
Total             500         500          1000

Table 13.1 presents our notation for population counts for a finite population. Remember that, in practice, only Nature would know these numbers. This notation is fairly simple to remember: all counts are
represented by $N$, with or without subscripts. The symbol $N$ without subscripts represents the total number of subjects in the population. An $N$ with subscripts counts the number in the population with the feature(s) given by the subscripts. For example, $N_{AB}$ is the number of population members with variable values $A$ and $B$; $N_{A^c}$ is the number of population members with value $A^c$ on the first variable, i.e., for this we don't care about the second variable. Note also that these guys sum in the obvious way:

$$N_A = N_{AB} + N_{AB^c}.$$

In words, if you take the number of subjects whose variable values are $A$ and $B$, and add to it the number of subjects whose variable values are $A$ and $B^c$, then you get the number of subjects whose value on the first variable is $A$.

It might help if we have some hypothetical values for the population counts. I put some in Table 13.2.

If we take the table of population counts and divide each entry by $N$, we get the table of population proportions. I do this in Tables 13.3 and 13.4, for the general notation and our particular hypothetical data.

Table 13.3: The Table of Population Proportions.

            $B$           $B^c$           Total
$A$         $p_{AB}$      $p_{AB^c}$      $p_A$
$A^c$       $p_{A^cB}$    $p_{A^cB^c}$    $p_{A^c}$
Total       $p_B$         $p_{B^c}$       1

Table 13.4: Hypothetical Population Proportions for Study of Sex and Corrective Lenses.

                  Yes ($B$)   No ($B^c$)   Total
Female ($A$)      0.36        0.24         0.60
Male ($A^c$)      0.14        0.26         0.40
Total             0.50        0.50         1.00

Now we must face a notational annoyance. Consider the symbol $p_{AB}$, which equals 0.36 for our hypothetical population. This says, literally, that the proportion of the population members whose variable values are $A$ and $B$ is 0.36; the proportion of our hypothetical students who are both female and would answer yes is 0.36. We use the lower case $p$ for this notion because we use lower case $p$'s to represent population proportions. But consider our most commonly used Chance Mechanism (CM) when studying a finite population: Select a member of the population at random. For this CM it is natural to view $p_{AB} = 0.36$ as the probability of selecting a person who is female and would answer
yes. But we tend to use upper case $P$ to denote the word probability. Hence, it is more natural to write this as $P(AB) = 0.36$. The point of all this is that, in this chapter, $p_{AB} = P(AB)$, and the one we use will depend on whether we feel it is more natural to talk about proportions or probabilities.

13.1.2 Conditional Probability

Conditional probability allows us to investigate one of the most basic questions in science: How do we make use of partial information?

Consider again the hypothetical population presented in Tables 13.2 and 13.4. Consider the CM of selecting one person at random from this population. We see that $P(A) = 0.60$. In words, the probability is 60% that we will select a female. But suppose we are given the partial information that the person selected answered yes to the question. Given this information, what is the probability the person selected is a female? We write this symbolically as $P(A \mid B)$. The literal reading of this is: The probability that $A$ occurs given that $B$ occurs.

How do we compute this? We reason as follows. Given that $B$ occurs, we know that the selected person is among the 500 in column $B$ of Table 13.2. Of these 500 persons, reading up the column, we see that 360 of them are female. Thus, by direct reasoning, $P(A \mid B) = 360/500 = 0.72$, which is different than $P(A) = 0.60$. In words, knowledge that the person usually wears corrective lenses in lecture increases the probability that the person is female.

Table 13.5: Conditional Probabilities of the $B$'s Given the $A$'s in the Hypothetical Study of Sex and Lenses.

                  Yes ($B$)   No ($B^c$)   Total
Female ($A$)      0.60        0.40         1.00
Male ($A^c$)      0.35        0.65         1.00
Unconditional     0.50        0.50         1.00

We now repeat the above reasoning, but using symbols instead of numbers. Refer to Table 13.1. Given that $B$ occurs, we know that the selected subject is among the $N_B$ subjects in column $B$. Of these $N_B$ subjects, reading up the column, we see that $N_{AB}$ of them have property $A$. Thus, by direct reasoning, we obtain the following equation:

$$P(A \mid B) = \frac{N_{AB}}{N_B}. \tag{13.1}$$

Now, this is a perfectly good equation, relating the
conditional probability of $A$ given $B$ to population counts. Most statisticians, however, prefer a modification of this equation. On the right side of the equation, divide both the numerator and denominator by $N$. This, of course, does not change the value …
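
The arithmetic of this section can be checked with a short Python sketch. It uses only the hypothetical counts of Table 13.2. The dictionary layout and the function names are illustrative assumptions, not notation from the text. Exact fractions are used so that the comparison between the count-based ratio and the proportion-based ratio involves no rounding.

```python
from fractions import Fraction

# Hypothetical population counts from Table 13.2.
# Keys are (first variable, second variable); the string labels are
# illustrative stand-ins for A, A^c, B, B^c.
counts = {
    ("A", "B"): 360,    # female, yes
    ("A", "Bc"): 240,   # female, no
    ("Ac", "B"): 140,   # male, yes
    ("Ac", "Bc"): 260,  # male, no
}

N = sum(counts.values())  # total number of subjects in the population


def marginal_first(a):
    """Count of members whose first variable equals a (N_A or N_{A^c})."""
    return sum(n for (first, _), n in counts.items() if first == a)


def marginal_second(b):
    """Count of members whose second variable equals b (N_B or N_{B^c})."""
    return sum(n for (_, second), n in counts.items() if second == b)


def proportion(a, b):
    """Population proportion p_ab = N_ab / N."""
    return Fraction(counts[(a, b)], N)


def prob_first_given_second(a, b):
    """P(a | b) by direct reasoning from counts: N_ab / N_b."""
    return Fraction(counts[(a, b)], marginal_second(b))


def prob_second_given_first(b, a):
    """P(b | a) by direct reasoning from counts: N_ab / N_a."""
    return Fraction(counts[(a, b)], marginal_first(a))


p_AB = proportion("A", "B")                      # 360/1000
p_B = Fraction(marginal_second("B"), N)          # 500/1000
p_A_given_B = prob_first_given_second("A", "B")  # 360/500

# Dividing numerator and denominator by N leaves the ratio unchanged.
assert p_A_given_B == p_AB / p_B

print(float(p_AB), float(p_A_given_B))
print(float(prob_second_given_first("B", "A")),
      float(prob_second_given_first("B", "Ac")))
```

The first printed pair corresponds to $p_{AB}$ and $P(A \mid B)$ for the hypothetical population; the second pair corresponds to the Yes column of Table 13.5.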