Chapter 12 Correlation and Regression
Fall 2011

Outline
- Examples
- Correlation
- Inference on the slope and the intercept
- Residuals, R commands and residual plots
- R commands and Example: tree age in the Amazon rain forest

Example: Energy expenditure in African Mole-rats
[Figure: scatterplot of daily energy expenditure (kJ/day) against body mass (g).]

Example: snow fall and time to clean the streets
[Figure: scatterplot of time to clean the streets (h) against snow fall (in).]

Correlation: examples
[Figure: scatterplots of y against x, in four groups of panels: r = 0.95 and r = 0.17; r = -0.94 and r = -0.32; r = 1 and r = -1; r = 0, r = -0.97 and r = 0.97.]

Inference for the slope $\beta_1$
$H_0$: Y is not linearly related to X, or $\beta_1 = 0$, versus
$H_A$: Y is linearly related to X, or $\beta_1 \neq 0$.

Anova table and F test.

Source       df      SS                                MS      F
Regression   1       $b_1^2 \sum (x_i - \bar{x})^2$    SS/df   MS(Reg)/MS(Res)
Residual     n - 2   ...                               SS/df
Total        n - 1   $\sum (y_i - \bar{y})^2$

p-value: from the F-distribution; the df are 1 and n - 2.

Residual standard deviation:
$$s_e = \sqrt{MS(Res)} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$

Or t-test.
$$SE_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}, \qquad t = \frac{b_1}{SE_{b_1}} \quad \text{on df} = n - 2$$
n = number of pairs (# of animals, etc.)
We can also get confidence intervals for the true slope $\beta_1$.
$$b_1 = r \, \frac{s_y}{s_x}$$
$H_0$ also means that the true correlation $\rho = 0$.

Inference for the intercept $\beta_0$
$$SE_{b_0} = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}}$$
Then we use a t-test with
$$t = \frac{b_0}{SE_{b_0}} \quad \text{on df} = n - 2, \qquad n = \text{\# of pairs}$$
Snowfall: $s_e = \sqrt{MS(Res)} = \sqrt{0.23} = 0.48$ hours, $n = 7$ days, $\bar{x} = 3.48$ in, $\sum (x_i - \bar{x})^2 = 22.3$ and $b_0 = 0.31$ hours.
$$SE_{b_0} = 0.48 \sqrt{\frac{1}{7} + \frac{3.48^2}{22.3}} = 0.39 \text{ hours.}$$
A 95% confidence interval for $\beta_0$ is $0.31 \pm 2.571 \times 0.39$, i.e. $(-0.69, 1.31)$ hours.
0 lies in the interval. Good!

Residuals

x     y     $\hat{y} = 0.31 + 1.38x$   $r = y - \hat{y}$
3.2   4.9   4.73                        0.17
1.4   2.4   2.24                        0.16
2.6   4.4   3.90                        0.50
6.9   9.6   9.83                       -0.23
3.6   4.8   5.28                       -0.48
1.7   2.1   2.66                       -0.56
5.0   7.7   7.21                        0.49

Residual plots: snow fall data
[Figure: left, time to clean the streets (h) against snow fall (in); right, residuals against predicted values.]

Residual plots
We look at the residual plot to see if the assumptions of the linear model are met:
- The relationship between X and Y is linear.
- The residual standard deviation $\sigma_e$ does not depend on X (homogeneity of variance).
When the assumptions hold, the plot shows a nice cloud of points without any pattern.

Residual plots: mole-rat data
[Figure: left, daily energy expenditure (kJ/day) against body mass (g); right, residuals against predicted values.]

R commands for regression

    > bodymass = c(42,57,70,74,65,79,82,...,158,165)
    > energy = c(40,43,53,60,72,69,70,...,105,168)
    > plot(bodymass,energy,pch=16)                   # scatterplot of the data
    > cor(energy,bodymass)                           # correlation coefficient r
    > fit=lm(energy~bodymass)                        # least-squares regression fit
    > summary(fit)                                   # coefficients, SEs and t-tests
    > anova(fit)                                     # Anova table and F test
    > abline(fit)                                    # add the fitted line to the scatterplot
    > residuals(fit)
    > plot(bodymass,residuals(fit), pch=16)          # residual plot
    > abline(h=0)                                    # horizontal reference line at 0
    > qqnorm(residuals(fit), pch=16)                 # normal Q-Q plot of the residuals
    > plot(fit)                                      # built-in diagnostic plots

Example: tree age in the Amazon rain forest
Exercise 12.43
20 trees.
X = diameter (cm) and Y = age (yr) using carbon dating.

Analysis of Variance Table
            Df   Sum Sq  Mean Sq  F value   Pr(>F)
diameter     1   423561   423561   5.0824  0.03687 *
Residuals   18  1500095    83339

Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)    -18.770     265.148   -0.071    0.9443
diameter         4.392       1.948    2.254    0.0369 *

r =
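
The t-test and F test for the slope give the same answer in simple regression, and the slope estimate and its standard error are enough to build a confidence interval for $\beta_1$. The sketch below checks this on the tree output, using only the numbers printed above; the variable names (`b1`, `se_b1`, `ci_slope`, and so on) are illustrative choices, not part of the original slides.

```r
# Values copied from the R output for the tree data (n = 20 trees).
b1    <- 4.392    # estimated slope (years of age per cm of diameter)
se_b1 <- 1.948    # standard error of the slope
n     <- 20       # number of (diameter, age) pairs
df    <- n - 2    # residual degrees of freedom

# t statistic for H0: beta1 = 0 and its two-sided p-value.
t_stat <- b1 / se_b1
p_t    <- 2 * pt(-abs(t_stat), df)

# In simple regression, t^2 equals the F statistic of the Anova table
# (up to rounding of the printed values).
f_from_t <- t_stat^2
p_f      <- pf(5.0824, 1, df, lower.tail = FALSE)

# 95% confidence interval for the true slope: b1 +/- t* x SE.
t_crit   <- qt(0.975, df)
ci_slope <- b1 + c(-1, 1) * t_crit * se_b1

print(c(t = t_stat, F_from_t = f_from_t, p_t = p_t, p_F = p_f))
print(ci_slope)
```

Because the inputs are rounded as printed, the recomputed statistics agree with the table only to about two or three decimal places.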
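
The snowfall calculations for the intercept can be reproduced step by step from the seven (x, y) pairs listed in the residuals table. This is a minimal sketch: the vector names `snow` and `hours` and the other variable names are assumptions made for illustration, and the hand-computed quantities are compared with what `lm()` and `confint()` report.

```r
# Snowfall data taken from the residuals table: x = snow fall (in),
# y = time to clean the streets (h). Names are illustrative.
snow  <- c(3.2, 1.4, 2.6, 6.9, 3.6, 1.7, 5.0)
hours <- c(4.9, 2.4, 4.4, 9.6, 4.8, 2.1, 7.7)
n     <- length(snow)

# Least-squares slope and intercept from the usual sums of squares.
sxx <- sum((snow - mean(snow))^2)
b1  <- sum((snow - mean(snow)) * (hours - mean(hours))) / sxx
b0  <- mean(hours) - b1 * mean(snow)

# Residual standard deviation: s_e = sqrt(SS(Res) / (n - 2)).
res <- hours - (b0 + b1 * snow)
se  <- sqrt(sum(res^2) / (n - 2))

# Standard error of the intercept and a 95% confidence interval for beta0.
se_b0 <- se * sqrt(1 / n + mean(snow)^2 / sxx)
ci_b0 <- b0 + c(-1, 1) * qt(0.975, n - 2) * se_b0

# The same quantities from R's built-in regression tools.
fit <- lm(hours ~ snow)
print(coef(fit))
print(confint(fit))
print(c(b0 = b0, b1 = b1, se = se, se_b0 = se_b0))
print(ci_b0)
```

The interval computed here carries full precision through every step, so its endpoints can differ slightly from the slide's, which plugs in values rounded to two decimals. Either way the interval contains 0.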