Outline

1 The traditional approach
2 The Mean Squares approach for the Completely Randomized Design (CRD)
    CRD and one-way ANOVA
    Variance components and the F-test
    Inference about the intercept
    Sample vs. subsample size determination
3 The Mean Squares approach for CRD with subsampling
    Example and model
    ANOVA table
    F-tests
    Pairwise treatment comparisons with LSD
    Sample and subsample size determination

Traditional methods for experimental studies

- An experimental design is chosen from a finite set (catalog) of experimental designs.
- Designs are balanced, at least in some way.
- Designs are named.
- Each design is associated with a particular model.
- The traditional analysis of these models uses F-tests, as in ANOVA, because there are formulas for these F-tests on nicely balanced data sets. Remember: at the time, formulas were necessary for the calculations to be computationally tractable.

Traditional methods

You already know:
- Many of these designs: we just need to name them.
- All their associated models: we just need to recognize them. The model for a particular traditional design is either a linear model, a random-effect model, or a mixed-effect model, with a particular set of predictors and a particular set of interactions between them.
- The models we have already covered are not restricted to the catalog.

What we will cover:
- The names of designs in the catalog and the predictors in the associated models.
- The F-tests associated with these nicely balanced designs.

Benefits:
- Communication: names do help.
- Organizing one's thoughts: a catalog may help.
- The F-test and the likelihood ratio test (LRT) are often similarly powerful, but the F-test can be more powerful.

Overall comparison of the F-test and LRT

- F-test: restricted in use. LRT: always available (e.g. logistic regression).
- On fixed-effect models: the F-test is exact; the $\chi^2_k$-based LRT is approximate.
- On mixed-effect models: both tests are approximate.
  - The F-test assumes that all variance components are known for sure.
  - The $\chi^2_k$-based LRT might be so conservative or so anti-conservative
    that we need a parametric-bootstrap-based p-value.
- With unbalanced data, it is unclear what F distribution the F value should be compared with. Use approximate denominator df.
- The F-test can be more powerful.

Completely Randomized Design and one-way ANOVA

$$Y_i = \mu + \alpha_{j[i]} + e_i, \quad \text{with } e_i \sim N(0, \sigma_e^2) \text{ independent}$$

Balanced design: each of the $k$ treatments is sampled $n$ times, $n_j = n$.
- Example: loblolly pine needles and stomata density. $k = 10$ needles, $n = 4$ rows from each needle.
- Example: corn yield. $k = 8$ sites, $n = 8$ plots from each site.

Fixed effects: we will constrain the $k$ deviations from the overall mean, $\alpha_j$, to sum to 0. In R, we will use the "sum" contrast. The default "treatment" contrast uses one reference level and constrains its $\alpha_{\mathrm{ref}}$ to be 0.

Random effects: assume $\alpha_j \sim N(0, \sigma_\alpha^2)$, independent.

Analysis in R: default "treatment" contrast option

    > corn = read.table("corn.txt", header=T)
    > getOption("contrasts")
            unordered           ordered
    "contr.treatment"      "contr.poly"
    > lm(ears ~ site, data=corn)
    (Intercept)     siteLFAN     siteNSAN     siteORAN     siteOVAN
          44.38        -1.13       -17.88        -1.13        -4.88
       siteTEAN     siteWEAN     siteWLAN
          -6.63         0.25         6.12
    > with(corn, mean(ears[site=="DBAN"]))
    [1] 44.38
    > 44.38 + 6.12    # last site: WLAN
    [1] 50.5
    > with(corn, mean(ears[site=="WLAN"]))
    [1] 50.5

Analysis in R, using the "sum" contrast option

    > oldoptions = options(contrasts=c("contr.sum","contr.poly"))
    > getOption("contrasts")
    [1] "contr.sum"  "contr.poly"
    > oldoptions$contrasts    # saved, in case we want to go back
            unordered           ordered
    "contr.treatment"      "contr.poly"
    > fit.lm = lm(ears ~ site, data=corn)    # re-fit same model
    > coef(fit.lm)
    (Intercept)        site1        site2        site3        site4
          41.22         3.16         2.03       -14.72         2.03
          site5        site6        site7
          -1.72        -3.47         3.41
    > with(corn, mean(ears))    # overall mean
    [1] 41.22
    # how much does the
    # last site deviate from the overall mean? (alpha_8)
    > -sum(coef(fit.lm)[2:8])
    [1] 9.28
    > 41.22 + 9.28
    [1] 50.5
    # indeed, this is the mean of the last site, WLAN

The F-test

Model: $Y_i = \mu + \alpha_{j[i]} + e_i$, with $e_i \sim N(0, \sigma_e^2)$ independent.

- Fixed effects: $\sum_{j=1}^{k} \alpha_j = 0$. $H_0$: $\alpha_j = 0$ for all $j$.
- Random effects: $\alpha_j \overset{iid}{\sim} N(0, \sigma_\alpha^2)$. $H_0$: $\sigma_\alpha^2 = 0$.

ANOVA table, with an extra column for the expected Mean Squares:

    Source  df        SS      MS      E(MS), fixed                                           E(MS), random
    Trt     k - 1     SSTrt   MSTrt   $\sigma_e^2 + n \sum_{j=1}^{k} \alpha_j^2 / (k - 1)$   $\sigma_e^2 + n \sigma_\alpha^2$
    Error   k(n - 1)  SSErr   MSErr   $\sigma_e^2$                                           $\sigma_e^2$
    Total   kn - 1    SSTot

The F-test uses the fact that $F = \mathrm{MSTrt} / \mathrm{MSErr} \sim F_{k-1,\,k(n-1)}$ if $H_0$ is true (no needle effect), in both models.

Variance component estimation with Mean Squares

Random-effect model:

    Source  df        SS      MS      E(MS)
    Trt     k - 1     SSTrt   MSTrt   $\sigma_e^2 + n \sigma_\alpha^2$
    Error   k(n - 1)  SSErr   MSErr   $\sigma_e^2$
    Total   kn - 1    SSTot

We can use E(MS) to estimate

$$\hat\sigma_e^2 = \mathrm{MSErr} \quad \text{and} \quad \hat\sigma_\alpha^2 = \frac{\mathrm{MSTrt} - \mathrm{MSErr}}{n}.$$

- These estimates may differ from the ML or REML estimates.
- This $\hat\sigma_\alpha^2$ calculation may give a negative value. In this case, take $\hat\sigma_\alpha^2 = 0$.
- With a balanced design, the REML estimates of $\sigma_e^2$ and $\sigma_\alpha^2$ and the MS-based estimates are equal (as long as the MS-based $\hat\sigma_\alpha^2$ is not negative).

    > fit.lm = lm(ears ~ site, data=corn)
    > anova(fit.lm)
              Df Sum Sq Mean Sq F value    Pr(>F)
    site       7 2780.7  397.24  18.230 2.205e-12
    Residuals 56 1220.2   21.79
    > (397.24 - 21.79)/8    # n = 8 plots per site
    [1] 46.93125
    # that is s^2_site: Mean Square estimation
    # compare with REML:
    > library(lme4)    # provides lmer
    > fit.lmer = lmer(ears ~ (1|site), data=corn)
    > fit.lmer
    Random effects:
     Groups   Name        Variance Std.Dev.
     site     (Intercept) 46.931   6.8506
     Residual             21.790   4.6680
    Number of obs: 64, groups: site, 8

Inference about $\mu$ with Mean Squares

Assuming we have a balanced design ($n_j = n$), the intercept estimate is the grand mean:

$$\hat\mu = \bar y = \frac{\sum_i y_i}{nk} = \mu + \frac{\sum_j \alpha_j}{k} + \frac{\sum_i e_i}{nk}$$

so its variance is

$$\mathrm{var}(\hat\mu) = \frac{\sigma_\alpha^2}{k} + \frac{\sigma_e^2}{nk} = \frac{n\sigma_\alpha^2 + \sigma_e^2}{nk}.$$

Since $E(\mathrm{MSTrt}) = n\sigma_\alpha^2 + \sigma_e^2$, we estimate $\mathrm{var}(\hat\mu)$ by $\mathrm{MSTrt}/(nk)$, and the SE of $\hat\mu$ by $\sqrt{\mathrm{MSTrt}/(nk)}$, on $\mathrm{df}_{\mathrm{MSTrt}} = k - 1$.

Confidence intervals are obtained using the t-distribution with df $= k - 1$.

    > anova(fit.lm)
              Df Sum Sq Mean Sq F value    Pr(>F)
    site       7 2780.7  397.24  18.230 2.205e-12
    Residuals 56 1220.2   21.79
    > sqrt(397.24/(8*8))    # SE for mu: sqrt of MSTrt/(nk)
    [1] 2.49136
    > summary(fit.lmer)
    Fixed effects:
                Estimate Std. Error t value
    (Intercept)   41.219      2.491   16.55

T-test for $\mu = 0$ in the random-effect model: 2 …
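The mean-squares calculations above (F-test, variance components, and the standard error of the grand mean) can be sketched in base R as follows. This is only an illustration under stated assumptions: the data are simulated rather than the corn data, and the names `site`, `alpha`, `ms_trt`, `ms_err`, and the chosen true values (mean 40, standard deviations 7 and 5) are hypothetical.

```r
# Minimal sketch on SIMULATED balanced data (assumption: not the corn data).
set.seed(1)
k <- 8                                  # number of groups (treatments)
n <- 8                                  # observations per group
site  <- factor(rep(paste0("S", 1:k), each = n))
alpha <- rnorm(k, mean = 0, sd = 7)     # hypothetical random group effects
y     <- 40 + alpha[as.integer(site)] + rnorm(k * n, sd = 5)

# One-way ANOVA table and its two mean squares
tab    <- anova(lm(y ~ site))
ms_trt <- tab["site", "Mean Sq"]
ms_err <- tab["Residuals", "Mean Sq"]

# F-test: MSTrt / MSErr compared with F(k - 1, k(n - 1))
F_stat <- ms_trt / ms_err
p_F    <- pf(F_stat, df1 = k - 1, df2 = k * (n - 1), lower.tail = FALSE)

# MS-based variance component estimates (truncate at 0 if negative)
sigma2_e     <- ms_err
sigma2_alpha <- max(0, (ms_trt - ms_err) / n)

# Grand mean, its SE from MSTrt, and a 95% t-interval on k - 1 df
mu_hat <- mean(y)
se_mu  <- sqrt(ms_trt / (n * k))
ci     <- mu_hat + c(-1, 1) * qt(0.975, df = k - 1) * se_mu

print(c(F = F_stat, p = p_F))
print(c(sigma2_e = sigma2_e, sigma2_alpha = sigma2_alpha))
print(c(mu_hat = mu_hat, se = se_mu, lower = ci[1], upper = ci[2]))
```

With the real data, `y` and `site` would be replaced by `corn$ears` and `corn$site`; the interval uses $k - 1$ degrees of freedom because the standard error is built from MSTrt.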
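The relationship between the "treatment" and "sum" contrast fits shown earlier can also be checked on a small simulated example. The sketch below assumes a balanced layout and uses hypothetical names (`g`, `fit_trt`, `fit_sum`) and made-up group means; contrasts are set per model through the `contrasts` argument of `lm` rather than through global options.

```r
# Minimal sketch on SIMULATED balanced data (assumption: not the corn data).
set.seed(2)
k <- 4                                   # number of groups
n <- 5                                   # observations per group
g <- factor(rep(LETTERS[1:k], each = n))
y <- rnorm(k * n, mean = rep(c(10, 12, 9, 15), each = n))

# Same model, two parameterizations
fit_trt <- lm(y ~ g, contrasts = list(g = "contr.treatment"))
fit_sum <- lm(y ~ g, contrasts = list(g = "contr.sum"))

group_means <- sapply(split(y, g), mean)

# Treatment contrast: intercept is the mean of the reference (first) level
intercept_trt <- unname(coef(fit_trt)[1])

# Sum contrast: intercept is the grand mean (balanced data), and the
# deviation of the last level is minus the sum of the other deviations
intercept_sum <- unname(coef(fit_sum)[1])
alpha_last    <- -sum(coef(fit_sum)[-1])
last_mean     <- intercept_sum + alpha_last

print(group_means)
print(c(intercept_trt = intercept_trt, intercept_sum = intercept_sum,
        alpha_last = alpha_last, last_mean = last_mean))
```

Both fits give identical fitted values; only the meaning of the coefficients changes.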