Jaime Frade
Stat Apps 2: HW4
Problem 5.5

Problem 5.5.1

Draw Scatter plot matrix

Comment:
From the scatter plot matrix:
- There is a strong linear relationship in each of "photo vs obs1," "photo vs obs2," and "obs1 vs obs2."
- The strong linear relationship between Obs1 and Obs2 suggests that a linear regression model may be appropriate for the regression of Photo on either one of the observer counts.

For the simple regression of Photo on Obs1, the error term measures the counting error of observer 1. The goal in this problem is to describe the linear relationship between the exact number of birds and the observers' counts. The photo gives the exact number of snow geese, so it is appropriate to fit the regression of Photo on Obs1 rather than the regression of Obs1 on Photo.

CODE: (R-Code)
#install.packages("alr3")
library(alr3)
data(snowgeese)
attributes(snowgeese)
pairs(snowgeese)

Problem 5.5.2

Fitting a regression of Photo on obs1

Call:
lm(formula = photo ~ obs1, data = snowgeese)

Residuals:
     Min       1Q   Median       3Q      Max
-125.928  -18.713   -9.033   11.699  161.711

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.64957    8.61448   3.094  0.00347 **
obs1         0.88256    0.07764  11.367 1.54e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 44.41 on 43 degrees of freedom
Multiple R-squared:  0.7503,  Adjusted R-squared:  0.7445
F-statistic: 129.2 on 1 and 43 DF,  p-value: 1.537e-14

Testing

H0: E(Y|X=x) = x            (1)
HA: E(Y|X=x) = β0 + β1 x    (2)

Compute
RSS1 = 104390 with df(1) = 45
RSS2 = 84806.67 with df(2) = 43

Using the F-test,
F = [(RSS1 - RSS2)/(45 - 43)] / (RSS2/43) = 4.96 > F(0.05; 2, 43) ≈ 3.21

Reject H0 and conclude that the reduced model is not adequate: observer 1 is not reliable. Here "reliable" means that observer 1's count could be used as the actual number of snow geese. The null hypothesis says that the photo count and the observer's count agree on average.
The alternative hypothesis says that the two counts are linearly related but that one differs systematically from the other (in either direction).

CODE: (R-Code)
fit1 = lm(photo~obs1, snowgeese)
summary(fit1)
par(mfrow=c(2,2))
plot(fit1)

Problem 5.5.3

Regression of the square root of Photo on the square root of obs1

Call:
lm(formula = sqrt(photo) ~ sqrt(obs1), data = snowgeese)

Residuals:
    Min      1Q  Median      3Q     Max
-3.9532 -0.7922 -0.1478  0.8080  3.8801

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.61030    0.53529   3.008  0.00438 **
sqrt(obs1)   0.93182    0.06353  14.668  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.632 on 43 degrees of freedom
Multiple R-squared:  0.8334,  Adjusted R-squared:  0.8296
F-statistic: 215.2 on 1 and 43 DF,  p-value: < 2.2e-16

Testing

H0: E(sqrt(Y) | sqrt(X) = x) = x            (1)
HA: E(sqrt(Y) | sqrt(X) = x) = β0 + β1 x    (2)

Compute
RSS1 = 171.8383 with df(1) = 45
RSS2 = 114.52 with df(2) = 43

Using the F-test,
F = [(RSS1 - RSS2)/(45 - 43)] / (RSS2/43) = 10.76 > F(0.05; 2, 43) ≈ 3.21

Reject H0 and conclude that the reduced model is not adequate. The hypotheses are interpreted as in the previous problem, except that both counts are now on the square-root scale. R-squared is higher in this model than in the previous one, and the diagnostic plots suggest that the normality assumption is better satisfied. The residual variance also appears to be more stable.

CODE: (R-Code)
fit2 = lm(sqrt(photo)~sqrt(obs1), snowgeese)
summary(fit2)
par(mfrow=c(2,2))
plot(fit2)

Problem 5.5.4

Call:
lm(formula = photo ~ obs1, data = snowgeese, weights = 1/obs1)

Residuals:
     Min       1Q   Median       3Q      Max
-10.3418  -2.1016  -0.1185   1.8679   7.4482

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  9.21872    4.29810   2.145   0.0377 *
obs1         1.12806    0.09015  12.514  6.3e-16 ***
---
Signif.
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.776 on 43 degrees of freedom
Multiple R-squared:  0.7846,  Adjusted R-squared:  0.7796
F-statistic: 156.6 on 1 and 43 DF,  p-value: 6.297e-16

Testing

H0: E(Y|X=x) = x            (1)
HA: E(Y|X=x) = β0 + β1 x    (2)

Compute
RSS1 = ∑(yi - xi)² = 104390 with df(1) = 45
RSS2 = 613.1 with df(2) = 43

Using the F-test,
F = [(RSS1 - RSS2)/(45 - 43)] / (RSS2/43) = 3639.2 > F(0.05; 2, 43) ≈ 3.21

Reject H0 and conclude that the reduced model is not adequate; it describes the data very poorly. Note, however, that RSS1 here is the unweighted sum of squares while RSS2 is weighted by 1/obs1. For the two to be on the same scale, RSS1 should use the same weights, ∑(yi - xi)²/xi, so the size of this F value should be treated with caution.

Across Problems 5.5.2 to 5.5.4 the reported RSS decreases and the F-statistic grows, but these values are on different scales (raw, square-root, weighted), so they are not directly comparable as measures of accuracy.

CODE: (R-Code)
fit3 = lm(photo~obs1,weights=1/obs1, snowgeese)
summary(fit3)
par(mfrow=c(2,2))
plot(fit3)

Problem 5.5.5

Apply OLS to the average and difference of obs1 and obs2

Call:
lm(formula = photo ~ avg + diff, data = snowgeese)

Residuals:
    Min      1Q  Median      3Q     Max
-69.917 -12.196  -1.831   8.239 159.884

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.2227     6.8833   2.357   0.0232 *
avg           0.7914     0.0619  12.784 4.54e-16 ***
diff         -0.3052     0.1378  -2.214   0.0323 *
---
Signif.
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.13 on 42 degrees of freedom
Multiple R-squared:  0.8559,  Adjusted R-squared:  0.849
F-statistic: 124.7 on 2 and 42 DF,  p-value: < 2.2e-16

Testing

H0: E(Y|X=x) = avg                        (1)
HA: E(Y|X=x) = β0 + β1 avg + β2 diff      (2)

Compute
RSS1 = 64532.75 with df(1) = 45
RSS2 = 48924 with df(2) = 42

Using the F-test,
F = [(RSS1 - RSS2)/(45 - 42)] / (RSS2/42) = 4.467 > F(0.05; 3, 42) ≈ 2.83

Reject H0.

CODE: (R-Code)
snowgeese$avg = (snowgeese$obs1+snowgeese$obs2)/2
snowgeese$diff = (snowgeese$obs1-snowgeese$obs2)
fit4 = lm(photo~avg+diff,snowgeese)
summary(fit4)
par(mfrow=c(2,2))
plot(fit4)

Apply WLS, using 1/average as weights

Call:
lm(formula = photo ~ avg + diff, data = snowgeese, weights = 1/avg)

Residuals:
    Min      1Q  Median      3Q     Max
-5.9953 -1.3944  0.1673  1.2159  8.2148

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.0352     3.3979   2.365   0.0227 *
avg           0.9361     0.0775  12.079 2.99e-15 ***
diff         -0.1465     0.1486
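
The F statistics in the Testing steps for the unweighted fits can be rechecked from the reported RSS values alone. The following is a minimal sketch in base R. The helper name f_test and its argument names are illustrative assumptions, not part of the original solution, and the inputs are only the RSS and df values quoted in Problems 5.5.2, 5.5.3 and 5.5.5 (OLS). The weighted fit of Problem 5.5.4 is left out because its null-model RSS would have to be recomputed from the data with the same weights.

```r
# Hypothetical helper (not from the original solution): F-test comparing a
# null model (rss0, df0) with a larger alternative model (rss1, df1).
f_test <- function(rss0, df0, rss1, df1, alpha = 0.05) {
  num_df <- df0 - df1                                  # extra parameters in the alternative
  f_stat <- ((rss0 - rss1) / num_df) / (rss1 / df1)    # same formula as in the Testing steps
  c(F        = f_stat,
    num_df   = num_df,
    den_df   = df1,
    critical = qf(1 - alpha, num_df, df1),             # upper-alpha quantile of F(num_df, df1)
    p_value  = pf(f_stat, num_df, df1, lower.tail = FALSE))
}

# RSS and df values exactly as reported in the text
results <- rbind(
  "5.5.2 photo ~ obs1"             = f_test(104390,   45, 84806.67, 43),
  "5.5.3 sqrt(photo) ~ sqrt(obs1)" = f_test(171.8383, 45, 114.52,   43),
  "5.5.5 photo ~ avg + diff (OLS)" = f_test(64532.75, 45, 48924,    42)
)
print(round(results, 4))
```

Each row reports the F statistic next to its critical value and p-value, so the "Reject H0" decisions can be read off directly. The numerator degrees of freedom are taken as df0 - df1, which is 2 for the single-observer fits and 3 for the fit on avg and diff.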