Multilevel Structures

Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
April 17, 2008


Multilevel Models: Corn Example

Data Description

- We consider a subset of a larger data set on corn grown on the island of Antigua.
- The response variable we consider is the harvest weight (harvwt) per plot (units unknown).
- There are eight sites, with eight separate plots within each site where the corn is grown under the same treatment conditions.
- We can ask if the site has an effect on the harvest weight.
- In a standard regression framework, we could analyze the data as a one-way ANOVA with eight fixed parameters for the expected values (an intercept, which is the mean of a reference group, and seven differences in means between the other groups and the reference) and a single plot-level source of error.
- In a multilevel model, we can have covariates and error associated with the plot level and separate covariates and error associated with the site level.

Models

Standard ANOVA model:

$$y_i = \beta_1 + \beta_2 \mathbf{1}_{\{\text{site } 2\}} + \cdots + \beta_8 \mathbf{1}_{\{\text{site } 8\}} + e_i$$

- where $i = 1, \ldots, 64$ indexes the observation;
- $e_i \sim \text{iid } N(0, \sigma^2)$;
- $\{\beta_j\}$, $j = 1, \ldots, 8$, and $\sigma^2$ are fixed parameters.

In a multilevel model we may have

$$y_i = \alpha_{j[i]} + e_i$$

- where $i = 1, \ldots, 64$ indexes the observation and $j[i] \in \{1, \ldots, 8\}$ indicates which of the eight sites contains the $i$th observation;
- $\{\alpha_j\}$, $j = 1, \ldots, 8$, with $\alpha_j \sim N(\mu, \sigma_\alpha^2)$, are random effects for the sites;
- $e_i \sim \text{iid } N(0, \sigma^2)$;
- $\mu$, $\sigma_\alpha^2$, and $\sigma^2$ are fixed and unknown.

Notice here that we have a regression model for the response and also a regression model for the coefficients of the first regression model. Multilevel models include sources of variation at more than one level.

Data

    > corn = read.table("corn.txt", header = T)
    > summary(corn)
          site      block        ears           harvwt
     DBAN   : 8   I  :16   Min.   :13.00   Min.   :1.280
     LFAN   : 8   II :16   1st Qu.:37.75   1st Qu.:2.935
     NSAN   : 8   III:16   Median :43.00   Median :4.300
     ORAN   : 8   IV :16   Mean   :41.22   Mean   :4.292
     OVAN   : 8            3rd Qu.:46.00   3rd Qu.:5.442
     TEAN   : 8            Max.   :58.00   Max.   :7.530
     (Other):16

Plot of Data

[Figure: harvwt plotted by site (DBAN, LFAN, NSAN, ORAN, OVAN, TEAN, WEAN, WLAN).]

Standard Regression Model

    > corn.lm = lm(harvwt ~ site, data = corn)
    > display(corn.lm)
    lm(formula = harvwt ~ site, data = corn)
                coef.est coef.se
    (Intercept)  4.89     0.31
    siteLFAN    -0.68     0.44
    siteNSAN    -2.79     0.44
    siteORAN     2.03     0.44
    siteOVAN    -0.05     0.44
    siteTEAN    -1.85     0.44
    siteWEAN     0.64     0.44
    siteWLAN    -2.04     0.44
    ---
    n = 64, k = 8
    residual sd = 0.87, R-Squared = 0.77

Multilevel Model

    > corn.lmer = lmer(harvwt ~ (1 | site), data = corn)
    > display(corn.lmer)
    lmer(formula = harvwt ~ (1 | site), data = corn)
                coef.est coef.se
    (Intercept) 4.29     0.56

    Error terms:
     Groups   Name  Std.Dev.
     site           1.55
     Residual       0.87
    ---
    number of obs: 64, groups: site, 8
    AIC = 192.9, DIC = 190.3
    deviance = 189.6

Comparing Models

Discuss the different parameter estimates on the board.

Residual Plot

[Figure: residuals(corn.lmer) plotted against fitted(corn.lmer).]


Computation: lmer

Linear Mixed Effects Models using lmer

- The most recently developed R package for fitting linear models with random effects is lme4.
- The function to use instead of lm is named lmer.
- A model formula with a random effect in lmer differs from one in lm by including a term of the form (a | b), where a is a model matrix (often the intercept, 1) for the scope of the random effect and b is the group to which the random effect applies.

Compare Fitted Values

    > means = with(corn, sapply(split(harvwt, site), mean))
    > fitted.lm = with(corn, sapply(split(fitted(corn.lm), site), mean))
    > fitted.lmer = with(corn, sapply(split(fitted(corn.lmer), site), mean))
    > signif(rbind(means, fitted.lm, fitted.lmer), 3)
                DBAN LFAN NSAN ORAN OVAN TEAN WEAN WLAN
    means       4.88 4.21 2.09 6.92 4.83 3.04 5.53 2.84
    fitted.lm   4.88 4.21 2.09 6.92 4.83 3.04 5.53 2.84
    fitted.lmer 4.86 4.21 2.17 6.82 4.81 3.08 5.48 2.90

- The overall mean is 4.29.
- The multilevel model shrinks the estimates toward the overall mean.

Other Examples

Multilevel models are often used in these situations:

- Repeated measures: when a single individual is measured multiple times, it is often appropriate to model two levels of variation, one for individuals and one for measurements.
- Split plot designs: in agricultural or ecological studies, it is often the case that sites are broken into plots and possibly subplots. Variables can be measured at the site, plot, subplot, or individual measurement level.
- Multilevel models are also appropriate for non-nested variables. For example, measurements could be clustered by year and by site if a single site is measured over multiple years.

Summary of Classical Regression

- Prediction for continuous and discrete outcomes
- Fitting nonlinear relationships using transformations
- Inclusion of categorical predictors with indicator random variables
- Modeling interactions
- Causal inference

Motivations for Multilevel Models

- Accounting for both individual and group level variation in estimating group level effects
- Modeling individual level regression coefficients
- Estimation of effects for subgroups

When is it worth fitting multilevel models?

- If the group size is small, there may not be much data to estimate random effects, and there is little to gain.
- The complexity of multilevel models is greater than that of classical regression.
- The added complexity is often worthwhile, but perhaps not when there are only a small number (say, fewer than five) of individuals in a group.
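
Shrinkage by Hand (illustrative sketch)

The claim that the multilevel model shrinks the site estimates toward the overall mean can be checked with arithmetic alone. For a balanced random-intercept model, the fitted value for a site is the overall mean plus a fraction of that site's deviation from it, where the fraction is the site variance divided by the sum of the site variance and the residual variance over the number of plots per site. The sketch below applies that standard formula to the rounded numbers reported on the slides (site means, the lmer intercept 4.29, and the standard deviations 1.55 and 0.87). The variable names are hypothetical, and because the inputs are rounded, agreement with the reported lmer fitted values is only expected to about two decimal places.

```r
# Values copied from the slides, rounded as reported there.
site_means <- c(DBAN = 4.88, LFAN = 4.21, NSAN = 2.09, ORAN = 6.92,
                OVAN = 4.83, TEAN = 3.04, WEAN = 5.53, WLAN = 2.84)
lmer_fitted <- c(DBAN = 4.86, LFAN = 4.21, NSAN = 2.17, ORAN = 6.82,
                 OVAN = 4.81, TEAN = 3.08, WEAN = 5.48, WLAN = 2.90)

overall_mean <- 4.29  # (Intercept) from display(corn.lmer)
sd_site      <- 1.55  # site-level standard deviation
sd_resid     <- 0.87  # residual (plot-level) standard deviation
n_per_site   <- 8     # plots per site (balanced design)

# Fraction of each site's deviation from the overall mean that is kept
# (hypothetical name; standard result for a balanced random-intercept model).
shrink_weight <- sd_site^2 / (sd_site^2 + sd_resid^2 / n_per_site)

# Partially pooled site estimates.
shrunk <- overall_mean + shrink_weight * (site_means - overall_mean)

# Side-by-side comparison with the values reported on the slides.
comparison <- round(rbind(means = site_means,
                          shrunk = shrunk,
                          lmer = lmer_fitted), 2)
print(comparison)
```

With these inputs the weight is about 0.96, so each site keeps roughly 96% of its deviation from 4.29. The shrinkage is mild here because the variation between sites is large relative to the plot-level noise averaged over eight plots.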