Simulation of Probability ModelsDiscrete Predictive SimulationSimulation for InferenceBret LargetDepartments of Botany and of StatisticsUniversity of Wisconsin—MadisonMarch 25, 20081 / 12OverviewWe will use simulation to assess uncertainty in predictions and inestimated regression coefficients.This is in contrast to deriving formulas for standard errors.One advantage is that we can assess uncertainty in quantities forwhich there are not formulas.A second advantage is we do not need to learn how to derive formulas.However, we will require learning to write small programs in R.2 / 12How many girls?How many girls do we expect in 500 births?The population probability of being a girl is 0.488.This is a relatively simple problem for which you learned a formulalast semester.We will compare answers with simulation and the formula.Simulation of Probability Models Discrete Predictive Simulation 3 / 12Formula AnswerWe can model the number of girls with a binomial distribution.The mean isµ = np = 500(0.488) = 244The standard deviation of the binomial distribution isσ =pnp(1 − p) =p500(0.488)(0.512).= 11.2With a normal approximation, we expect about a 95% chance that thenumber of girls would be between 244 ± 2 × 11 or from 222 to 266.A precise calculation of this probability is> sum(dbinom(222:266, 500, 0.488))[1] 0.9559977Simulation of Probability Models Discrete Predictive Simulation 4 / 12Simulation ApproachWe begin by writing a function in R to simulate the number of girls in500 births from the binomial distribution.The function rbinom() simulates draws from the binomialdistribution.The arguments are, in order, the number of draws, n, and p.Unfortunately, these are named n, size, and prob.> girls.sim = function(n, p) {+ return(rbinom(n = 1, size = n, prob = p))+ }> girls.sim(500, 0.488)[1] 244Simulation of Probability Models Discrete Predictive Simulation 5 / 12Simulation Approach (cont.)Use the replicate() function to repeat this many times, say 1000.The object n.girls stores the answers.> n.girls = replicate(1000, girls.sim(500, 0.488))> mean(n.girls)[1] 244.276> sd(n.girls)[1] 10.58394> sum((n.girls >= 222) & (n.girls <= 266))/1000[1] 0.962Simulation of Probability Models Discrete Predictive Simulation 6 / 12A Simpler VersionFor this particular example, we could have done this simulation with asingle R command without writing a function.However, the paradigm of writing a function to do one simulation andthen using replicate() to repeat this will be helpful for morecomplicated examples.> n.girls2 = rbinom(1000, 500, 0.488)> mean(n.girls2)[1] 243.681> sd(n.girls2)[1] 11.01583Simulation of Probability Models Discrete Predictive Simulation 7 / 12Sibling ExampleWhat if we wanted a more complicated example where we modeledthe number of girls from a Beta-Binomial distribution withoverdisp ersion similar to that observed?Finding a formula in this situation would be a lot of work, but asimulation is not too bad.The Beta distribution is for continuous numbers between 0 and 1 andis defined by has two parameters, α and β.Setting α = 9(0.488) and β = 9(0.512) fits the data well.Simulation of Probability Models Discrete Predictive Simulation 8 / 12Sibling Example (cont.)We use the function rbetabin.ab() from the VGAM library for theBeta-Binomial distribution.> siblings = read.table("siblings.txt", header = T)> family.size = with(siblings, boy + girl)> sum(family.size)[1] 145> with(siblings, sum(girl))[1] 72> library(VGAM)> girls.sim = function(sizes, p, M) {+ return(sum(rbetabin.ab(length(sizes), sizes, p *+ M, (1 - p) * M)))+ }> g = replicate(1000, girls.sim(sizes = family.size, p = 0.488,+ M = 9))Simulation of Probability Models Discrete Predictive Simulation 9 / 12Sibling Example (cont.)> mean(g)[1] 70.87> sd(g)[1] 6.801129> quantile(g, c(0.025, 0.05, 0.5, 0.95, 0.975))2.5% 5% 50% 95% 97.5%57 60 71 82 84Simulation of Probability Models Discrete Predictive Simulation 10 / 12Sibling Example (cont.)gPercent of Total051015202550 60 70 80 90Simulation of Probability Models Discrete Predictive Simulation 11 / 12Detach VGAMIf you use the VGAM library, it will mess up the predict() function.This command will unload a library.> detach("package:VGAM")Simulation of Probability Models Discrete Predictive Simulation 12 /
View Full Document