122S:138
The Likelihood Principle
Lecture 21
Nov. 13, 2009
Kate Cowles
374 SH, [email protected]

The likelihood principle

• Suppose that two different experiments may inform about an unknown parameter θ
• Suppose the outcomes of the experiments are respectively y∗ and z∗
• Suppose the likelihoods for θ resulting from the two experiments are proportional; that is
    p(y∗; θ) = c p(z∗; θ)
  where c is a constant that does not depend on θ
• Then the information about θ contained in both experiments is equivalent

Another way to state the likelihood principle

• For a given sample of data, any two probability models p(y|θ) that have the same likelihood function yield the same inference for θ.
• With regard to the information contained in the data about the unknown parameter(s), only the actual observed data y is relevant.
  – No other possible outcomes
    ∗ Contrast this with the frequentist p-value: the probability, assuming H0 is true, of getting a test statistic as extreme as, or more extreme than, the value that was actually obtained
  – Not the researchers’ intentions

Example

• We are given a coin. We are interested in estimating θ, the probability of obtaining a head on a single flip.
• We want to test the hypotheses:
    $H_0: \theta = 1/2$
    $H_a: \theta > 1/2$
• Experiment consists of flipping the coin 12 times independently.
• Result is 9 heads and 3 tails.

Example, continued

• There are (at least) two possible ways the experiment might have been conducted:
  – Design 1: Do 12 flips. The random variable Y is the number of heads obtained in n = 12 flips.
  – Design 2: Flip the coin until 9 heads are obtained.
    Random variable Y is the number of tails that are obtained before the ninth head.
• Frequentist inference for θ would be different depending on which design is used.
• Bayesian inference would be the same under both designs because the likelihoods are proportional.
• The negative binomial distribution
  – Y = the number of failures observed in a sequence of independent Bernoulli trials before the kth success
  – Y ∼ NB(k, p)
  – $p(y \mid p) = \binom{k + y - 1}{y} p^{k} (1 - p)^{y}$
  – $E(Y) = \frac{k(1 - p)}{p}$

Implications of the likelihood principle

• the stopping rule principle
• the likelihood principle and reference priors

“Stopping rules” are often used in designing frequentist statistical studies

• instead of a fixed sample size
• to make it possible to stop a study early if the results are in
• particularly common in clinical trials
  – reducing the size and duration of a clinical trial reduces the number of patients who are exposed to the treatment that will be found to be inferior and speeds up the dissemination of the results to the medical community
• Frequentist statisticians must choose the stopping rule before the experiment is conducted and adhere to it exactly
  – deviations can produce serious errors if a frequentist analysis is used
• Large frequentist literature on how to control the overall probability of Type I error while allowing for more than one analysis of the data

Stopping Rule Principle

• In a sequential experiment, the evidence provided by the experiment about the value of the unknown parameter(s) θ should not depend on the stopping rule
• follows directly from the likelihood principle

Jeffreys’ priors and the likelihood principle

• recall Jeffreys’ prior
  – “reference” prior
  – noninformative
  – invariant to transformations of parameters
  – $p(\theta) \propto [I(\theta)]^{1/2}$
    where I(θ) is the expected Fisher information for θ
• Jeffreys’ prior when likelihood is Binomial(n, θ)
    $p(\theta) \propto \theta^{-1/2} (1 - \theta)^{-1/2}$, that is, Beta(1/2, 1/2)
• Jeffreys’ prior when likelihood is negative binomial(k, θ)
    $p(\theta) \propto \theta^{-1} (1 - \theta)^{-1/2}$, that is, an improper Beta(0, 1/2)
• So use of Jeffreys’ prior in some cases can violate the likelihood principle: the binomial and negative binomial likelihoods are proportional, yet their Jeffreys’ priors, and hence the resulting posteriors, differ
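
The coin example and the Jeffreys' prior slide can be checked numerically. The following is a minimal sketch, not part of the original lecture: the function and variable names are hypothetical, and it assumes the data are 9 heads and 3 tails with Design 1 fixing n = 12 flips and Design 2 stopping at the 9th head. It shows that the two likelihoods differ only by a constant factor, and that the Jeffreys' posteriors under the two designs are nevertheless different Beta distributions.

```python
from math import comb

# Observed data from the coin example (9 heads, 3 tails).
HEADS, TAILS = 9, 3
N_FLIPS = HEADS + TAILS   # Design 1: number of flips fixed in advance
K_HEADS = HEADS           # Design 2: flip until this many heads are seen


def binomial_likelihood(theta):
    """Design 1: probability of 9 heads in 12 flips, as a function of theta."""
    return comb(N_FLIPS, HEADS) * theta**HEADS * (1 - theta)**TAILS


def negbin_likelihood(theta):
    """Design 2: probability of 3 tails before the 9th head, as a function of theta."""
    return comb(K_HEADS + TAILS - 1, TAILS) * theta**K_HEADS * (1 - theta)**TAILS


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# The likelihood ratio is the same at every theta: comb(12, 9) / comb(11, 3) = 220 / 165.
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
ratios = [binomial_likelihood(t) / negbin_likelihood(t) for t in thetas]

# Jeffreys' prior Beta(a0, b0) combined with a likelihood proportional to
# theta^9 (1 - theta)^3 gives a Beta(a0 + 9, b0 + 3) posterior.
jeffreys_post_binomial = (0.5 + HEADS, 0.5 + TAILS)   # prior Beta(1/2, 1/2)
jeffreys_post_negbin = (0.0 + HEADS, 0.5 + TAILS)     # prior Beta(0, 1/2), improper

if __name__ == "__main__":
    print("likelihood ratios:", [round(r, 6) for r in ratios])
    print("Jeffreys posterior, Design 1: Beta%s, mean %.4f"
          % (jeffreys_post_binomial, beta_mean(*jeffreys_post_binomial)))
    print("Jeffreys posterior, Design 2: Beta%s, mean %.4f"
          % (jeffreys_post_negbin, beta_mean(*jeffreys_post_negbin)))
```

With any single prior shared by both designs, the constant ratio cancels in Bayes' rule and the posteriors coincide. The posteriors differ here only because the Jeffreys' prior itself depends on the sampling design.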