Multilevel Structures
Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
April 17, 2008

Multilevel Models: Corn Example

Data Description

We consider a subset of a larger data set on corn grown on the island of Antigua. The response variable we consider is the harvest weight (harvwt) per plot (units unknown). There are eight sites, with eight separate plots within each site where the corn is grown under the same treatment conditions. We can ask if the site has an effect on the harvest weight.

In a standard regression framework, we could analyze the data as a one-way ANOVA with eight fixed parameters for the expected values (an intercept, which is the mean of a reference group, and seven differences in means between the other groups and the reference) and a single plot-level source of error.

In a multilevel model, we can have covariates and error associated with the plot level, and separate covariates and error associated with the site level.

Models

Standard ANOVA model:

    $y_i = \beta_1 + \beta_2 \mathbf{1}\{\text{site } 2\} + \cdots + \beta_8 \mathbf{1}\{\text{site } 8\} + e_i$

where $i = 1, \ldots, 64$ indexes the observation.
- $e_i \sim \text{iid } N(0, \sigma^2)$
- $\beta_j$, $j = 1, \ldots, 8$, and $\sigma^2$ are fixed parameters.

In a multilevel model we may have:

    $y_i = \alpha_{j[i]} + e_i$

where $i = 1, \ldots, 64$ indexes the observation and $j[i] \in \{1, \ldots, 8\}$ indicates which of the eight sites contains the $i$th observation.
- $\alpha_j \sim N(\mu, \sigma_\alpha^2)$, $j = 1, \ldots, 8$, are random effects for the sites.
- $e_i \sim \text{iid } N(0, \sigma^2)$
- $\mu$, $\sigma_\alpha^2$, and $\sigma^2$ are fixed and unknown.

Notice here that we have a regression model for the response and also a regression model for the coefficients of the first regression model. Multilevel models include sources of variation at more than one level.

Data

    > corn <- read.table("corn.txt", header = TRUE)
    > summary(corn)
          site     block        ears           harvwt
     DBAN   : 8   I  :16   Min.   :13.00   Min.   :1.280
     LFAN   : 8   II :16   1st Qu.:37.75   1st Qu.:2.935
     NSAN   : 8   III:16   Median :43.00   Median :4.300
     ORAN   : 8   IV :16   Mean   :41.22   Mean   :4.292
     OVAN   : 8            3rd Qu.:46.00   3rd Qu.:5.442
     TEAN   : 8            Max.   :58.00   Max.   :7.530
     (Other):16

Plot of Data

[Figure: harvwt (vertical axis) plotted by site (horizontal axis: DBAN, LFAN, NSAN, ORAN, OVAN, TEAN, WEAN, WLAN).]

Standard Regression Model

    > library(arm)   # provides display()
    > corn.lm <- lm(harvwt ~ site, data = corn)
    > display(corn.lm)
    lm(formula = harvwt ~ site, data = corn)
                coef.est coef.se
    (Intercept)  4.89     0.31
    siteLFAN    -0.68     0.44
    siteNSAN    -2.79     0.44
    siteORAN     2.03     0.44
    siteOVAN    -0.05     0.44
    siteTEAN    -1.85     0.44
    siteWEAN     0.64     0.44
    siteWLAN    -2.04     0.44
    ---
    n = 64, k = 8
    residual sd = 0.87, R-Squared = 0.77

Multilevel Model

    > library(lme4)  # provides lmer()
    > corn.lmer <- lmer(harvwt ~ 1 + (1 | site), data = corn)
    > display(corn.lmer)
    lmer(formula = harvwt ~ 1 + (1 | site), data = corn)
                coef.est coef.se
    (Intercept)  4.29     0.56
    Error terms:
     Groups   Name  Std.Dev.
     site           1.55
     Residual       0.87
    ---
    number of obs: 64, groups: site, 8
    AIC = 192.9, DIC = 190.3
    deviance = 189.6

Comparing Models

Discuss the different parameter estimates on the board.

Computation: lmer

Linear Mixed Effects Models using lmer

The most recently developed R package for fitting linear models with random effects is lme4. The function to use instead of lm is named lmer. A model formula with a random effect in lmer differs from one in lm by including a term of the form (a | b), where a is a model matrix (often just the intercept, 1) for the scope of the random effect and b is the group to which the random effect applies.

Residual Plot

[Figure: residuals(corn.lmer) (vertical axis) plotted against fitted(corn.lmer) (horizontal axis).]

Compare Fitted Values

    > means <- with(corn, sapply(split(harvwt, site), mean))
    > fitted.lm <- with(corn, sapply(split(fitted(corn.lm), site), mean))
    > fitted.lmer <- with(corn, sapply(split(fitted(corn.lmer), site), mean))
    > signif(rbind(means, fitted.lm, fitted.lmer), 3)
                DBAN LFAN NSAN ORAN OVAN TEAN WEAN WLAN
    means       4.88 4.21 2.09 6.92 4.83 3.04 5.53 2.84
    fitted.lm   4.88 4.21 2.09 6.92 4.83 3.04 5.53 2.84
    fitted.lmer 4.86 4.21 2.17 6.82 4.81 3.08 5.48 2.90

The overall mean is 4.29. The standard regression reproduces the site means exactly, while the multilevel model shrinks the estimates toward the overall mean.

Other Examples

Multilevel models are often used in these situations:
- Repeated measures: when a single individual is measured multiple times, it is often appropriate to model two levels of variation, one for individuals and one for measurements.
- Split plot designs: in agricultural or ecological studies, it is often the case that sites are broken into plots and possibly subplots. Variables can be measured at the site, plot, subplot, or individual measurement level.
- Multilevel models are also appropriate for non-nested variables. For example, measurements could be clustered by year and by site if a single site is measured over multiple years.

Summary of Classical Regression

- Prediction for continuous and discrete outcomes
- Fitting nonlinear relationships using transformations
- Inclusion of categorical predictors with indicator random variables
- Modeling interactions
- Causal inference

Motivations for Multilevel Models

- Accounting for both individual and group-level variation in estimating group-level effects
- Modeling individual-level regression coefficients
- Estimation of effects for subgroups

When is it worth fitting multilevel models?

- If the group size is small, there may not be much data to estimate random effects, and there is little to gain.
- The complexity of multilevel models is greater than that of classical regression. The added complexity is often worthwhile, but perhaps not when there are only a small number (say, fewer than five) of individuals in a group.
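
The shrinkage reported under Compare Fitted Values can be checked by hand. The sketch below assumes the standard result for a balanced random-intercept model with $n = 8$ plots per site, and plugs in the rounded estimates from the lmer output (overall mean 4.29, site standard deviation 1.55, residual standard deviation 0.87). The symbols $\hat\mu$, $\hat\sigma_\alpha$, $\hat\sigma$, $\lambda$, and $\bar{y}_j$ are notation chosen here for illustration, so agreement with the fitted values is only up to rounding.

```latex
\begin{align*}
\hat\alpha_j &= \hat\mu + \lambda\,(\bar{y}_j - \hat\mu),
\qquad
\lambda = \frac{\hat\sigma_\alpha^2}{\hat\sigma_\alpha^2 + \hat\sigma^2 / n} \\[4pt]
\lambda &= \frac{1.55^2}{1.55^2 + 0.87^2 / 8}
         = \frac{2.4025}{2.4025 + 0.0946}
         \approx 0.962 \\[4pt]
\text{NSAN:}\quad \hat\alpha_j &\approx 4.29 + 0.962\,(2.09 - 4.29) \approx 2.17 \\
\text{ORAN:}\quad \hat\alpha_j &\approx 4.29 + 0.962\,(6.92 - 4.29) \approx 6.82
\end{align*}
```

Both values match the fitted.lmer row (2.17 and 6.82). Because the between-site variance is large relative to $\hat\sigma^2 / n$, the weight $\lambda$ is close to 1 and each site mean moves only about 4% of the way toward the overall mean.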