UW-Madison STAT 371 - Simulation Studies - D2339305


School name University of Wisconsin, Madison

Course Stat 371- Intro to Statistics

Pages 23

This preview shows page 1-2-22-23 out of 23 pages.


Simulation Studies
Spring 2011

Outline
1. Introduction
2. Confidence Intervals
3. Hypothesis Tests

Introduction

Latin Terms for Some Different Types of Biological Experiments
1. in vivo ("within the living"), e.g., animal testing, clinical trials: experimentation using a whole, living organism.
2. in vitro ("within the glass"), e.g., test tube or petri dish: experimentation using components of an organism that have been isolated and studied in a controlled biological environment.
3. in silico ("in the computer"; fake Latin, playing on "silicon"): experimentation performed on computer or via computer simulation.
See Wikipedia.

Simulation Experiment (in silico)
1. Almost free, compared to very expensive human or animal testing.
2. Complete control: the biologist and modeler set exactly the parameters of the system.
3. A good first approximation for studying a biological system.
4. Limited by computer precision and human knowledge: how realistically can you actually model a biological system?
5. NOT meant to replace in vivo or in vitro experiments. Instead, use computer simulation to complement these experiments.

Let's Focus
1. The above description of in silico experiments is meant as background.
2. The topic of modeling and simulating complex biological systems could be studied in a year-long course sequence.
3. We will focus on a specific type of simulation study commonly used in statistics: Monte Carlo experiments.
   1. They rely on repeated random sampling.
   2. They are named after the Monte Carlo casino (please do NOT be fooled into thinking you can beat the house; the house always wins).
4. Usually, when a statistician says "simulation study", they are referring to a Monte Carlo experiment; we will use this meaning of simulation throughout.

Simulation Experiments in a Typical Statistics Study
1. Confidence intervals and hypothesis testing have a repeated-sampling interpretation.
2. We do not actually want to collect 1000 different random samples from our target population (remember, we hope that a 95% CI would contain the true mean about 950 out of the 1000 times).
3. Also, we will never know the true mean in a real experimental situation. But you know it in a computer experiment, because you set the mean.
4. You use the computer to generate pseudo-random samples.

Game Plan
We will explore simulation studies for one-sample problems in the context of both
1. Confidence intervals
2. Hypothesis tests

Note (future work)
We are focusing on two procedures that we know: confidence intervals and hypothesis tests. The real power of simulation studies comes in exploring the performance of a statistical procedure (e.g., a confidence interval) in complex settings where the distributional theory is unknown (e.g., we cannot say the procedure is based on normality). This is important to remember for those of you who continue with quantitative research (e.g., anyone doing a research-based master's or PhD; most scientific fields are quantitative).

Confidence Intervals

Example
1. Let's simulate one data set using R.
2. We draw a pseudo-random sample of size n = 10 from the N(0, 4) population (rnorm), and then use R to compute a confidence interval (t.test).
3. t.test spits out a lot of information, but in particular it spits out the 95% confidence interval for the mean.

Example (Continued)

    x <- rnorm(n = 10, mean = 0, sd = 4)
    x
     [1]  3.7032395  1.3803970  1.9454320  3.0871619 -2.1036865 -5.8024249
     [7] -0.2351963 -1.5853121  8.4006722  4.7959016
    t.test(x)
    95 percent confidence interval:
     -1.509859  4.227096

Extend the Example to a Simulation Experiment
Repeat the above procedure 1000 times, and check how many times the confidence interval contains the true mean. We know the true mean is 0 because we can control everything in a computer experiment: here we are drawing samples from a N(0, 4) population (μ = 0 and σ = 4).

Continuous Data: t intervals

    N <- 1000    # number of simulations
    count <- 0   # number of CIs that contain 0
    for (i in 1:N) {
      n <- 10
      x <- rnorm(n, mean = 0, sd = 4)
      x.bar <- mean(x)
      s <- sd(x)
      l <- x.bar - qt(0.975, n - 1) * s / sqrt(n)
      u <- x.bar + qt(0.975, n - 1) * s / sqrt(n)
      if (l < 0 && u > 0) count <- count + 1
    }
    count / N
    [1] 0.952

Continuous Data: z intervals
If we use the critical value 1.96 from N(0, 1) instead of the t distribution, we get a worse result: the estimated coverage falls below the nominal 95%.

    N <- 1000    # number of simulations
    count <- 0   # number of CIs that contain 0
    for (i in 1:N) {
      n <- 10
      x <- rnorm(n, mean = 0, sd = 4)
      x.bar <- mean(x)
      s <- sd(x)
      l <- x.bar - 1.96 * s / sqrt(n)
      u <- x.bar + 1.96 * s / sqrt(n)
      if (l < 0 && u > 0) count <- count + 1
    }
    count / N
    [1] 0.915

Continuous Data: Uniform distribution
Instead of the normal distribution, we use U(-10, 10) to generate the data.

    N <- 1000      # number of simulations
    count <- 0     # number of t intervals that contain 0
    count.z <- 0   # number of z intervals (using 1.96) that contain 0
    for (i in 1:N) {
      n <- 10
      x <- runif(n, min = -10, max = 10)
      x.bar <- mean(x)
      s <- sd(x)
      l <- x.bar - qt(0.975, n - 1) * s / sqrt(n)
      u <- x.bar + qt(0.975, n - 1) * s / sqrt(n)
      l.z <- x.bar - 1.96 * s / sqrt(n)
      u.z <- x.bar + 1.96 * s / sqrt(n)
      if (l < 0 && u > 0) count <- count + 1
      if (l.z < 0 && u.z > 0) count.z <- count.z + 1
    }
    count / N
    [1] 0.936
    count.z / N
    [1] 0.903

Uniform distribution: increase n
When the number of observations n increases, the normal approximation given by the central limit theorem works better.

    N <- 1000      # number of simulations
    count <- 0     # number of t intervals that contain 0
    count.z <- 0   # number of z intervals (using 1.96) that contain 0
    for (i in 1:N) {
      n <- 30
      x <- runif(n, min = -10, max = 10)
      x.bar <- mean(x)
      s <- sd(x)
      l <- x.bar - qt(0.975, n - 1) * s / sqrt(n)
      u <- x.bar + qt(0.975, n - 1) * s / sqrt(n)
      l.z <- x.bar - 1.96 * s / sqrt(n)
      u.z <- x.bar + 1.96 * s / sqrt(n)
      if (l < 0 && u > 0) count <- count + 1
      if (l.z < 0 && u.z > 0) count.z <- count.z + 1
    }
    count / N
    [1] 0.949
    count.z / N

Discrete Data: Binomial distribution
Consider the confidence interval for the population proportion p. Assume x ~ B(50, p), where we choose p = 0.01, 0.5, 0.99. We compare the coverage probability of the intervals based on p̂ = x/n and p̃ = (x + 2)/(n + 4).

    p       cover prob (p̃)    cover prob (p̂)
    0.01    0.99              0.4
    0.5     0.928             0.928
    0.99    0.983             0.397

R code

    N <- 1000   # number of simulations
    for (p in c(0.01, 0.5, 0.99)) {
      count <- 0       # number of CIs that contain p
      count.hat <- 0
      for (i in 1:N) {
        n <- 50
        x <- rbinom(1, size = n, p)
        p.hat <- x / n
        p.tilde …
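The binomial comparison can be sketched as follows. This is a minimal sketch under stated assumptions, not the original slide code: it assumes the usual Wald interval p.hat ± 1.96 * sqrt(p.hat * (1 - p.hat) / n) and the "plus four" interval p.tilde ± 1.96 * sqrt(p.tilde * (1 - p.tilde) / (n + 4)). The function name coverage_sim and the seed are hypothetical choices made for illustration.

```r
# Sketch: estimate the coverage of two 95% intervals for a binomial proportion.
# Assumptions (not taken from the slides): the interval formulas are the
# standard Wald and "plus four" forms; coverage_sim is a hypothetical name.
coverage_sim <- function(p, n = 50, N = 1000) {
  x <- rbinom(N, size = n, prob = p)   # N simulated counts from B(n, p)
  z <- 1.96                            # critical value from N(0, 1)

  # Wald interval centered at p.hat = x / n
  p.hat <- x / n
  se.hat <- sqrt(p.hat * (1 - p.hat) / n)
  cover.hat <- (p.hat - z * se.hat <= p) & (p <= p.hat + z * se.hat)

  # "Plus four" interval centered at p.tilde = (x + 2) / (n + 4)
  p.tilde <- (x + 2) / (n + 4)
  se.tilde <- sqrt(p.tilde * (1 - p.tilde) / (n + 4))
  cover.tilde <- (p.tilde - z * se.tilde <= p) & (p <= p.tilde + z * se.tilde)

  # Proportion of simulated intervals that contain the true p
  c(p = p, cover.hat = mean(cover.hat), cover.tilde = mean(cover.tilde))
}

set.seed(371)  # arbitrary seed, used only to make the run reproducible
results <- t(sapply(c(0.01, 0.5, 0.99), coverage_sim))
print(results)
```

Each row of results holds one value of p with the estimated coverage of both intervals. At p = 0.01 or p = 0.99 the Wald interval collapses to a single point whenever x = 0 or x = n, so it cannot contain the true p in those samples; this is one reason its coverage can fall far below 95% near the boundary.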
