Chapter 3 homework

3.4

Here is my code and the corresponding output.

> source("C:/chris/UNL/STAT870/Chapter3/examine.mod.simple.R")
> ############################################################################
> # NAME: Chris Bilder                                                       #
> # DATE: 7-21-06                                                            #
> # PURPOSE: 3.4 in KNN                                                      #
> #                                                                          #
> # NOTES: 1)                                                                #
> #                                                                          #
> ############################################################################

> #Read in the data
> copier<-read.table(file = "C:\\chris\\UNL\\STAT870\\Instructor_CD\\Data Sets\\Chapter 1 Data Sets\\CH01PR20.txt", header = FALSE, col.names = c("minutes", "copiers"), sep = "")
> #Check first few observations
> head(copier)
  minutes copiers
1      20       2
2      60       4
3      46       3
4      41       2
5      12       1
6     137      10

> ################################################################################
> # 3.4

> #Fit the simple linear regression model and save the results to mod.fit
> mod.fit<-lm(formula = minutes ~ copiers, data = copier)
> sum.fit<-summary(mod.fit)
> sum.fit

Call:
lm(formula = minutes ~ copiers, data = copier)

Residuals:
     Min       1Q   Median       3Q      Max
-22.7723  -3.7371   0.3334   6.3334  15.4039

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.5802     2.8039  -0.207    0.837
copiers      15.0352     0.4831  31.123   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.914 on 43 degrees of freedom
Multiple R-Squared: 0.9575,     Adjusted R-squared: 0.9565
F-statistic: 968.7 on 1 and 43 DF,  p-value: < 2.2e-16

> #Again, just because this function makes a student's life simpler,
> #  you are still responsible for knowing all the code within it!
> save.it<-examine.mod.simple(mod.fit.obj = mod.fit, var.test = TRUE)
> save.it
$sum.data
       Y                X
 Min.   :  3.00   Min.   : 1.000
 1st Qu.: 36.00   1st Qu.: 2.000
 Median : 74.00   Median : 5.000
 Mean   : 76.27   Mean   : 5.111
 3rd Qu.:111.00   3rd Qu.: 7.000
 Max.   :156.00   Max.   :10.000

$semi.stud.resid
    1     2     3     4     5     6     7     8     9    10    11    12    13
-1.06  0.05  0.17  1.29 -0.28 -1.43 -0.74  1.62 -1.17  0.28  1.04  0.70  0.38
   14    15    16    17    18    19    20    21    22    23    24    25    26
-0.96  1.40 -2.21  0.04  1.27 -2.55 -0.29 -0.96 -0.41  0.49 -0.07 -0.08  0.82
   27    28    29    30    31    32    33    34    35    36    37    38    39
-1.29 -0.18  0.71  0.71  0.37  1.73 -1.06 -0.17 -1.29 -0.29  1.28 -0.31  0.82
   40    41    42    43    44    45
 1.41 -0.42  0.51 -0.28  0.16  0.27

$levene
Levene's Test for Homogeneity of Variance
      Df F value Pr(>F)
group  1  0.4744 0.4947
      43

$bp

        Breusch-Pagan test

data:  Y ~ X
BP = 1.3147, df = 1, p-value = 0.2515

[Plots from examine.mod.simple(): box plot and dot plot of the predictor variable; box plot and dot plot of the response variable; response vs. predictor; residuals vs. predictor; residuals vs. estimated mean response; ei* (semistud. residuals) vs. estimated mean response; residuals vs. observation number; histogram of semistud. residuals; normal Q-Q plot of semistud. residuals]

> ##################################################################################
> #3.4h

> copier2<-read.table(file = "C:\\chris\\UNL\\STAT870\\Instructor_CD\\Data Sets\\Chapter 3 Data Sets\\CH03PR04.txt", header = FALSE, col.names = c("minutes", "copiers", "age.copier", "experience"), sep = "")
> head(copier2)
  minutes copiers age.copier experience
1      20       2         20          4
2      60       4         19          5
3      46       3         27          4
4      41       2         32          1
5      12       1         24          4
6     137      10         26          4

> par(mfrow = c(1,2))
> plot(x = copier2$age.copier, y = mod.fit$residuals, xlab = "Age of copier (months)", ylab = "Residual", main = "Residual vs. Copier age", panel.first = grid(col = "gray", lty = "dotted"))
> abline(h = 0, col = "red")
> abline(lm(formula = mod.fit$residuals ~ copier2$age.copier), col = "darkgreen") #Quick way to get a line on plot to show relationship

> plot(x = copier2$experience, y = mod.fit$residuals, xlab = "Experience of service person (years)", ylab = "Residual", main = "Residual vs. Experience", panel.first = grid(col = "gray", lty = "dotted"))
> abline(h = 0, col = "red")

[Plots: Residual vs. Copier age (x-axis: Age of copier (months), y-axis: Residual); Residual vs. Experience (x-axis: Experience of service person (years), y-axis: Residual)]

> mod.fit2<-lm(formula = minutes ~ copiers + age.copier, data = copier2)
> summary(mod.fit2)

Call:
lm(formula = minutes ~ copiers + age.copier, data = copier2)

Residuals:
     Min       1Q   Median       3Q      Max
-13.4815  -3.2212  -0.5019   3.9245  10.6721

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -27.0292     3.8787  -6.969 1.61e-08 ***
copiers      14.8734     0.3148  47.242  < 2e-16 ***
age.copier    1.1068     0.1433   7.726 1.36e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.796 on 42 degrees of freedom
Multiple R-Squared: 0.9824,     Adjusted R-squared: 0.9816
F-statistic: 1175 on 2 and 42 DF,  p-value: < 2.2e-16

Overall, I do not see problems with the model, except perhaps with normality; another predictor variable would also be helpful.

Answer a.-h. of the problem on your own. Notice how these parts are basically the same as the 7 items given in the Section 3.11 notes:

1) Diagnostics for predictor and response
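
Since students are responsible for knowing the code inside examine.mod.simple(), here is a minimal sketch of how two of its outputs, the semistudentized residuals and the Breusch-Pagan statistic, could be computed by hand. It is not the code inside examine.mod.simple(), which is not shown here. It assumes the KNN Chapter 3 formulas e_i* = e_i / sqrt(MSE) and BP = (SSR*/2) / (SSE/n)^2, where SSR* comes from regressing the squared residuals on X. Simulated data stand in for the copier data, and all object names (x, y, fit, semi.stud, bp, and so on) are hypothetical.

```r
# Simulated stand-in for the copier data (NOT the real CH01PR20.txt values)
set.seed(870)
n <- 45
x <- sample(1:10, size = n, replace = TRUE)
y <- -0.6 + 15 * x + rnorm(n, mean = 0, sd = 9)

# Simple linear regression fit
fit <- lm(formula = y ~ x)

# Residuals, SSE, and MSE (SSE divided by the residual degrees of freedom, n - 2)
e <- residuals(fit)
sse <- sum(e^2)
mse <- sse / df.residual(fit)

# Semistudentized residuals: e_i* = e_i / sqrt(MSE)
semi.stud <- e / sqrt(mse)

# Breusch-Pagan statistic (assumed KNN form): regress e^2 on x, take the
# regression sum of squares SSR*, and compare (SSR*/2) / (SSE/n)^2 to a
# chi-square distribution with 1 degree of freedom
e2 <- e^2
fit.e2 <- lm(formula = e2 ~ x)
ssr.star <- sum((fitted(fit.e2) - mean(e2))^2)
bp <- (ssr.star / 2) / (sse / n)^2
p.value <- 1 - pchisq(q = bp, df = 1)

round(semi.stud, 2)
data.frame(BP = bp, df = 1, p.value = p.value)
```

With the actual copier data in place of the simulated x and y, the same steps should be comparable to the $semi.stud.resid and $bp components printed earlier, provided examine.mod.simple() uses these definitions.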