Introduction to Likelihood

This presentation is made available through a Creative Commons Attribution-Noncommercial license. Details of the license and permitted uses are available at http://creativecommons.org/licenses/by-nc/3.0/
© 2010 Steve Bellan and the Meaningful Modeling of Epidemiological Data Clinic

Meaningful Modeling of Epidemiologic Data, 2012
AIMS, Muizenberg, South Africa
Steve Bellan, MPH, PhD
Department of Environmental Science, Policy & Management
University of California at Berkeley

In a population of 1,000,000 people with a true prevalence of 30%, the probability distribution of the number of positive individuals if 100 are sampled:

    $$ f(x) = \binom{100}{x} (0.3)^{x} (0.7)^{100 - x} $$

> barplot(dbinom(x = 0:100, size = 100, prob = .3), names.arg = 0:100)

We sample 100 people once and 28 are positive:

> rbinom(n = 1, size = 100, prob = .3)
28

We don’t know the true prevalence! But we can calculate the probability of 28 or a more extreme value occurring for a given prevalence.

[Figure: binomial distribution for 30% prevalence; x-axis: number HIV+, y-axis: probability]

Cumulative Probability & P Values

For 30% prevalence, p(28 or a more extreme value occurring) =

> 2*pbinom(28, 100, .3)
0.7535564

p-value = 0.75

If true prevalence were 15%, then p(28 or more extreme) is
> 2*pbinom(28 - 1, 100, 0.15, lower.tail = FALSE)
p = 0.00123

If true prevalence were 20%, then p(28 or more extreme) is
> 2*pbinom(28 - 1, 100, 0.2, lower.tail = FALSE)
p = 0.0683

If true prevalence were 25%, then p(28 or more extreme) is
> 2*pbinom(28 - 1, 100, 0.25, lower.tail = FALSE)
p = 0.555

If true prevalence were 30%, then p(28 or more extreme) is
> 2*pbinom(28, 100, 0.3, lower.tail = TRUE)
p = 0.754

If true prevalence were 35%, then p(28 or more extreme) is
> 2*pbinom(28, 100, 0.35, lower.tail = TRUE)
p = 0.17

If true prevalence were 40%, then p(28 or more extreme) is
> 2*pbinom(28, 100, 0.4, lower.tail = TRUE)
p = 0.0169

[Figure, one panel per prevalence above: binomial distribution with the tail probability at 28 doubled (x2); x-axis: number HIV+, y-axis: probability]

hypothetical prevalence    p
15 %                       0.00123
20 %                       0.0683
25 %                       0.555
30 %                       0.754
35 %                       0.17
40 %                       0.0169
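The six calculations above all follow one pattern: take the tail on the side of the observation and double it. A minimal sketch of that pattern as a function follows. The name two_sided_p and the rule "double the smaller tail, capped at 1" are assumptions made for illustration, not something the slides define; the rule happens to pick the same tail as each pbinom() call above.

```r
# Two-sided p-value for a binomial count, doubling the smaller tail.
# Assumption: "28 or more extreme" means the tail on the side of the
# observation, doubled, as in the pbinom() calls on the slides.
two_sided_p <- function(x, n, p0) {
  lower <- pbinom(x, n, p0)                          # P(X <= x)
  upper <- pbinom(x - 1, n, p0, lower.tail = FALSE)  # P(X >= x)
  pmin(1, 2 * pmin(lower, upper))                    # double, cap at 1
}

# Hypothetical prevalences taken from the slides.
prevalences <- c(0.15, 0.20, 0.25, 0.30, 0.35, 0.40)
p_values <- two_sided_p(x = 28, n = 100, p0 = prevalences)

print(data.frame(prevalence = prevalences, p_value = signif(p_values, 3)))
```

Because pbinom() is vectorised over prob, the same call works for a single hypothesis or for a whole grid of them.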
Which hypotheses do we reject?

IF, GIVEN THE HYPOTHESIS, p value < cutoff, THEN REJECT THE HYPOTHESIS
Cutoff usually chosen as α = 0.05

[Figure: six panels for hypothetical prevalences 15 %, 20 %, 25 %, 30 %, 35 %, 40 %, with p = 0.00123, 0.0683, 0.555, 0.754, 0.17, 0.0169 respectively; x-axis: number HIV+, y-axis: probability]

Which hypotheses do we NOT reject: CONFIDENCE INTERVAL

[Figure: p-value against hypothetical prevalence, with 0.195 and 0.378 marked on the prevalence axis]

95% CI includes HIV prevalences of 19.5% to 37.8%

Let’s take another approach

We sample 100 people once and 28 are positive:

> rbinom(n = 1, size = 100, prob = .3)
28

We don’t know the true prevalence, but the probability that we had exactly 28/100 with 30% prevalence is:

> dbinom(x = 28, size = 100, prob = .3)
0.08041202

Which prevalence gives the greatest probability of observing exactly 28/100?
Which of these prevalence values is most likely given our data?

Maximum Likelihood Estimate: the parameter value giving the greatest probability of the data having occurred.

MLE = 28/100 = 0.28

What do you think is the MLE here?

[Figure labels: true unknown value = 0.30; different null hypotheses]

Defining Likelihood

• L(parameter | data) = p(data | parameter)
• Not a probability distribution.
• Probabilities taken from many different distributions.

PDF (function of x):

    $$ f(x \mid p) = \binom{n}{x} p^{x} (1 - p)^{n - x} $$

    $$ L(p \mid x) = \binom{n}{x} \cdots $$
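To make the definition concrete, here is a minimal sketch that treats dbinom() as a function of the prevalence p with the data fixed at 28/100. The names likelihood, p_grid, mle_grid and mle_opt are hypothetical, and the grid spacing of 0.01 is an arbitrary choice. It also illustrates the "not a probability distribution" point: the binomial likelihood integrated over p from 0 to 1 equals 1/(n + 1), not 1.

```r
# Observed data from the slides: 28 positives out of 100 sampled.
x_obs <- 28
n <- 100

# Likelihood of a prevalence p: L(p | x) = p(x | p), data held fixed.
likelihood <- function(p) dbinom(x_obs, size = n, prob = p)

# Evaluate on a grid of hypothetical prevalences and take the largest.
p_grid <- seq(0, 1, by = 0.01)
lik_grid <- likelihood(p_grid)
mle_grid <- p_grid[which.max(lik_grid)]

# Same maximisation with a one-dimensional numerical optimiser.
fit <- optimize(likelihood, interval = c(0, 1), maximum = TRUE)
mle_opt <- fit$maximum

# Area under the likelihood curve over p; it is not 1.
area <- integrate(likelihood, lower = 0, upper = 1)$value

cat("grid MLE:     ", mle_grid, "\n")
cat("optimize MLE: ", mle_opt, "\n")
cat("area under L: ", area, " vs 1/(n + 1) =", 1 / (n + 1), "\n")
```

Both maximisations should land at, or numerically very close to, the closed-form answer 28/100 given on the slide. Calling plot(p_grid, lik_grid, type = "l") draws the likelihood curve.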