Multiple Linear Regression Case Study
Bret Larget
Departments of Botany and of Statistics
University of Wisconsin-Madison
February 5, 2008

Case Study: Birds and Bats

- Birds and bats must expend considerable energy to fly.
- Some bats use echolocation in flight, which also requires energy.
- Other bats eat fruit and do not have the ability to echolocate.
- Scientists studied energy use of several species of birds and bats to examine the relationship between mass and energy expenditure during flight, and to see whether echolocating bats had a higher cost.
- Variables are mass (grams), type (a factor with levels bird, eBat, and nBat; the latter two stand for echolocating and non-echolocating bats), and the response energy (Watts).

Data

    > bats = read.table("bats.txt", header = T)
    > bats
                        species  mass type energy
    1            PteropusGouldi 779.0 nBat  43.70
    2     PteropusPoliocephalus 628.0 nBat  34.80
    3    HypsignathusMonstrosus 258.0 nBat  23.30
    4             EidolonHelvum 315.0 nBat  22.40
    5        MeliphagaVirescens  24.3 bird   2.46
    6    MelipsittacusUndulatus  35.0 bird   3.93
    7           SturmisVulgaris  72.8 bird   9.15
    8            FalcoSpaverius 120.0 bird  13.80
    9          FalcoTinnunculus 213.0 bird  14.60
    10         CorvusOssifragus 275.0 bird  22.80
    11           LarusAtricilla 370.0 bird  26.20
    12             ColumbaLivia 384.0 bird  25.90
    13             ColumbaLivia 442.0 bird  29.50
    14             ColumbaLivia 412.0 bird  43.70
    15             ColumbaLivia 330.0 bird  34.00
    16        CorvusCrytoleucos 480.0 bird  27.80
    17     PhyllostomasHastatus  93.0 eBat   8.83
    18          PlecotusAuritus   8.0 eBat   1.35
    19 PipistrellusPipistrellus   6.7 eBat   1.12
    20          PlecotusAuritus   7.7 eBat   1.02

- Notice that both mass and energy range over more than an order of magnitude.
- The two bat types are quite different in mass. Birds fill the gap.
- Each observation corresponds to a single study. Some studies are on the same species.

Box-and-Whisker Plots

[Figure: side-by-side box-and-whisker plots of mass and of energy, grouped by type (bird, eBat, nBat).]

Scatterplot

[Figure: scatterplot of energy versus mass, with points marked by type (bird, eBat, nBat).]

Observations

The scatterplot reveals potential problems with fitting a standard regression model:

- Two bird observations appear to be potential outliers.
- There is some apparent curvature.
- Points with high mass have more variable energy measurements than points with low mass.

We will, however, fit a few models to illustrate the method and to show how these potential problems can be identified more readily with residual plots.

Fitting Models

    > fit0 = lm(energy ~ mass, data = bats)
    > fit1 = lm(energy ~ mass + type, data = bats)
    > fit2 = lm(energy ~ mass * type, data = bats)

- fit0 is a simple linear regression of energy on mass.
- fit1 adds type as an input variable. This allows the intercept to be different for each type.
- fit2 has mass, type, and an interaction between them. This allows each type to have its own slope and intercept.

Plots of Fitted Models

[Figure: plots of the fitted models, each showing energy versus mass with points marked by type (bird, eBat, nBat).]

Estimated Coefficients

    > coef(fit0)
    (Intercept)        mass
     4.09991727  0.05869642

fit0 shows the intercept and the parameter for mass, which is the slope.

    > coef(fit1)
    (Intercept)        mass    typeeBat    typenBat
     6.02197707  0.05749542 -4.60071984 -3.43220829

fit1 shows an intercept for all predictions, a parameter for mass, which is the common slope, and then adjustments to be made if the type is eBat or nBat. In effect, these are estimated differences of the intercept relative to bird.

- For birds, the intercept is 6.02.
- For echolocating bats, the intercept is 6.02 - 4.60 = 1.42.
- For non-echolocating bats, the intercept is 6.02 - 3.43 = 2.59.
- The three lines are parallel and share the common slope 0.0575.

    > coef(fit2)
      (Intercept)          mass      typeeBat      typenBat
       3.31674159    0.06777464   -2.82275855    7.91064213
    mass:typeeBat mass:typenBat
       0.02186199   -0.02772895

fit2 shows six estimated coefficients: the intercept and slope for bird, and then adjustments to each of these for the other types.

- For birds, the intercept is 3.32 and the slope is 0.0678.
- For echolocating bats, the intercept is 3.32 - 2.82 = 0.494 and the slope is 0.0678 + 0.0219 = 0.0896.
- For non-echolocating bats, the intercept is 3.32 + 7.91 = 11.2 and the slope is 0.0678 - 0.0277 = 0.04.

Interpretation of Coefficients

The comments below refer to the coefficients of fit2.

- The intercept is the predicted energy of a bird at mass 0 (no biological relevance).
- The third coefficient (typeeBat) is the estimated difference between the predicted energies for echolocating bats and birds at mass 0. Notice that the predicted difference is not the same at all masses. This parameter also has no biological significance.
- Similar comments can be made about the non-echolocating bats. In particular, even though the intercept for non-echolocating bats is higher than for birds, the bird line is higher over most of the range of mass where there are both birds and non-echolocating bats (the two fitted lines cross at about 285 g).

Residual Plot

    > library(lattice)  # provides xyplot
    > plot(xyplot(residuals(fit2) ~ fitted(fit2), pch = 16))

[Figure: residuals(fit2) plotted against fitted(fit2).]

- Residual plot from the last fit.
- Notice the fan-shaped pattern: residuals are larger for large mass.
- A transformation may help.

Log-Transformed Data

[Figure: scatterplot of log(energy) versus log(mass), with points marked by type (bird, eBat, nBat).]

Log transformation of both variables leads to data that better fit linear model assumptions.

More Analysis

Do remaining analysis live in R.
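
As one possible starting point for that remaining analysis, the sketch below refits the models on the log scale. It is a minimal illustration under stated assumptions, not the analysis carried out in the lecture: the data frame is typed in from the data table rather than read from bats.txt, the names fit_log1 and fit_log2 are hypothetical, and the formulas simply mirror fit1 and fit2 with log(energy) and log(mass) in place of the raw variables.

```r
# Data typed in from the table in this case study (species column omitted).
bats <- data.frame(
  mass = c(779.0, 628.0, 258.0, 315.0, 24.3, 35.0, 72.8, 120.0, 213.0, 275.0,
           370.0, 384.0, 442.0, 412.0, 330.0, 480.0, 93.0, 8.0, 6.7, 7.7),
  # Rows 1-4 are nBat, rows 5-16 are bird, rows 17-20 are eBat.
  # bird is listed first so that it is the reference level, as in the slides.
  type = factor(rep(c("nBat", "bird", "eBat"), times = c(4, 12, 4)),
                levels = c("bird", "eBat", "nBat")),
  energy = c(43.70, 34.80, 23.30, 22.40, 2.46, 3.93, 9.15, 13.80, 14.60, 22.80,
             26.20, 25.90, 29.50, 43.70, 34.00, 27.80, 8.83, 1.35, 1.12, 1.02)
)

# Log-scale analogues of fit1 and fit2 (hypothetical names, assumed formulas).
fit_log1 <- lm(log(energy) ~ log(mass) + type, data = bats)  # common slope
fit_log2 <- lm(log(energy) ~ log(mass) * type, data = bats)  # separate slopes

# Estimated coefficients, read the same way as for fit1 and fit2:
# bird is the baseline and the type terms are adjustments to it.
print(coef(fit_log1))
print(coef(fit_log2))
```

Plotting residuals(fit_log2) against fitted(fit_log2), in the same way as the residual plot for fit2, would show whether the fan-shaped pattern is reduced on the log scale.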