AMS 578 Homework 3 solution

Problem 1:

a.

> setwd("C:/spring 2009 TA/hw3");
> file<-"CH06PR05.txt";
> data<-read.table(file,header=FALSE, col.names=c("Y","X1","X2"));
> attach(data);
> X<-as.matrix(cbind(1,X1,X2));
> Y<-as.matrix(Y);
> n<-length(Y);p<-dim(X)[2];
> Xs<-t(X)%*%X;
> H<-X%*%solve(Xs)%*%t(X);
> b<-solve(Xs)%*%t(X)%*%Y;b;
      [,1]
    37.650
X1   4.425
X2   4.375

So the estimated regression function is Y = 37.650 + 4.425 X1 + 4.375 X2. Here b1 represents the effect of moisture content (X1) on the product's rating: increasing the moisture content by one unit increases the estimated mean rating by 4.425 units, holding X2 fixed.

b.

> Yhat<-X%*%b;
> resid<-Y-Yhat;
> X1X2<-X1*X2
> par(mar=c(4,4,1,1));
> plot(resid~Yhat,pch=20);

The residual plot against the fitted values provides information on the appropriateness of the model and the constancy of the error variance. In this plot there are no significant outliers, and the error variance appears to be constant.

> plot(resid~X1,pch=20);
> plot(resid~X2,pch=20);

The residual plots against the predictor variables check the appropriateness of the model with respect to each specific variable. In these plots no significant systematic tendencies are observed.

> plot(resid~X1X2,pch=20);

In this plot there are no systematic tendencies either, so there is no evidence of an interaction between X1 and X2.

> qqnorm(resid);

This plot indicates that the error terms are fairly normally distributed.

c.

H0: β1 = β2 = 0
Ha: not both β1 and β2 equal zero

The test statistic is F* = MSR/MSE. If F* > F(1-α; p-1, n-p), we conclude Ha; otherwise, we conclude H0.

> # construct the ANOVA table
> J<-matrix(1/n,n,n);I=diag(n);
> sse<-t(Y)%*%(I-H)%*%Y;sse;
     [,1]
[1,] 94.3
> ssto<-t(Y)%*%(I-J)%*%Y; ssto;
     [,1]
[1,] 1967
> ssr<-t(Y)%*%(H-J)%*%Y; ssr;
       [,1]
[1,] 1872.7
> mse<- sse/(n-p);
> msr<-ssr/(p-1);
> F<-msr/mse
> qf(0.99,2,13)
[1] 6.700965
> F
         [,1]
[1,] 129.0832

F* = 129.0832 > 6.700965 = F(0.99; 2, 13), so we reject H0 and conclude that at least one of β1 and β2 is not zero.

d.

> 1-pf(F,2,13)
             [,1]
[1,] 2.658261e-09

So the P-value is 2.658261e-09, far below α = 0.01, which is consistent with the conclusion in part (c).

e.

> R2<-ssr/ssto;R2;
         [,1]
[1,] 0.952059

So R2 is 0.952059, which means that most of the variation in Y is explained by the regression; the model fits well.

Problem 2:

a.

> Xh<-matrix(c(1,5,4),3,1);
> Yh<-t(b)%*%Xh;
> s2Yh<-t(Xh)%*%solve(Xs)%*%Xh*mse;
> sYh<-sqrt(s2Yh);
> YhCI<-c(Yh-qt(0.995,n-p)*sYh,Yh+qt(0.995,n-p)*sYh);YhCI;
[1] 73.88111 80.66889

The 99% confidence interval for the mean response at Xh is [73.88111, 80.66889].

b.

> s2Ypred<-mse+s2Yh;
> sYpred<-sqrt(s2Ypred);
> YpredCI<-c(Yh-qt(0.995,n-p)*sYpred,Yh+qt(0.995,n-p)*sYpred);YpredCI;
[1] 68.48077 86.06923

So the 99% prediction interval for a new observation at Xh is [68.48077, 86.06923].

c.

> Xnew<-X[,1:2]
> Xsnew<-t(Xnew)%*%Xnew;
> Hnew<-Xnew%*%solve(Xsnew)%*%t(Xnew);
> bnew<-solve(Xsnew)%*%t(Xnew)%*%Y;bnew;
      [,1]
    50.775
X1   4.425
> sseX1<-t(Y)%*%(I-Hnew)%*%Y;sseX1;
       [,1]
[1,] 400.55
> sstoX1<-t(Y)%*%(I-J)%*%Y;sstoX1;
     [,1]
[1,] 1967
> ssrX1<-t(Y)%*%(Hnew-J)%*%Y; ssrX1;
        [,1]
[1,] 1566.45

SSR(X2|X1) = sseX1 - sse = 400.55 - 94.3 = 306.25

The ANOVA table is as follows:

Source        SS        Df   MS
Regression    1872.70    2    936.35
  X1          1566.45    1   1566.45
  X2|X1        306.25    1    306.25
Error           94.30   13      7.253846
Total         1967.00   15

d.

H0: β2 = 0
Ha: β2 ≠ 0

The test statistic is F* = MSR(X2|X1)/MSE. If F* > F(1-α; 1, n-p), we conclude Ha; otherwise, we conclude H0.

In this problem, F* = 306.25/7.253846 = 42.21898 > 9.073806 = F(0.99; 1, 13), so we reject H0 and conclude that β2 is not zero. The P-value is

> 1-pf(42.21898,1,13)
[1] 2.011048e-05
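As a cross-check, the matrix-algebra results in Problems 1 and 2 can also be reproduced with R's built-in lm() tools. This is a minimal sketch, assuming the same CH06PR05.txt file and working directory as above; the data-frame name (brand) is arbitrary, and the comments note which quantity from the solution each call should reproduce.

# Sketch: lm()-based check of the Problem 1 and Problem 2 results
# (assumes CH06PR05.txt is in the current working directory).
brand <- read.table("CH06PR05.txt", header = FALSE,
                    col.names = c("Y", "X1", "X2"))
fit <- lm(Y ~ X1 + X2, data = brand)

coef(fit)                 # b0 = 37.650, b1 = 4.425, b2 = 4.375 (Problem 1a)
anova(fit)                # sequential SS: SSR(X1), SSR(X2|X1), SSE (Problem 2c)
summary(fit)$fstatistic   # overall F* = 129.08 on 2 and 13 df (Problem 1c)
summary(fit)$r.squared    # R^2 = 0.952059 (Problem 1e)

# 99% interval estimates at Xh: X1 = 5, X2 = 4 (Problems 2a and 2b)
xh <- data.frame(X1 = 5, X2 = 4)
predict(fit, xh, interval = "confidence", level = 0.99)  # CI for the mean response (2a)
predict(fit, xh, interval = "prediction", level = 0.99)  # prediction interval (2b)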
> setwd("C:/spring 2009 TA/hw3"); > file<-"CH06PR09.txt"; > data<-read.table(file,header=FALSE, col.names=c("Y","X1","X2","X3")); > attach(data); > X<-as.matrix(cbind(1,X1,X2,X3)); > Y<-as.matrix(Y); > n<-length(Y);p<-dim(X)[2]; > Xs<-t(X)%*%X; > H<-X%*%solve(Xs)%*%t(X); > b<-solve(Xs)%*%t(X)%*%Y;b; [,1] 4.149887e+03 X1 7.870804e-04 X2 -1.316602e+01 X3 6.235545e+02 So the fitted regression model is Y=4149.887+0.000787X1-13.166X2+623.555X3. . The b1, b2, b3 here, separately, represents the effect of each predict variable on the dependent variable by changing 1 unit. b. > Yhat<-X%*%b; > resid<-Y-Yhat; > X1X2<-X1*X2 > > par(mar<-c(4,4,1,1)); NULL > plot(resid~Yhat,pch=20); > plot(resid~X1,pch=20);> plot(resid~X2,pch=20); > plot(resid~X3,pch=20); > plot(resid~X1X2,pch=20); >qqnorm(resid); From these plots, we can see the residual’s variation becomes smaller when the fitted value goes larger, which is similar to the situations of X1 and X3. Residuals only show variation constancy with respect to X2. And from the QQ plot, residuals are fairly normally distributed. c. >plot(resid,type='l');No systematic trend is observed. Error terms appear to be uncorrelated. Problem 4: a. > #constructs the ANOVA table > J<-matrix(1/n,n,n);I=diag(n); > sse<-t(Y)%*%(I-H)%*%Y;sse; [,1] [1,] 985530 > ssto<-t(Y)%*%(I-J)%*%Y;ssto; [,1] [1,] 3162136 > ssr<-t(Y)%*%(H-J)%*%Y;ssr; [,1] [1,] 2176606 > mse<- sse/(n-p); > msr<-ssr/(p-1); > mse<- sse/(n-p);mse; [,1] [1,] 20531.87 > msr<-ssr/(p-1);msr; [,1] [1,] 725535 > Xnew<-X[,1:2] > Xsnew<-t(Xnew)%*%Xnew; > Hnew<-Xnew%*%solve(Xsnew)%*%t(Xnew); > sseX1<-t(Y)%*%(I-Hnew)%*%Y;sseX1; [,1] [1,] 3025770 > ssrX1<-t(Y)%*%(Hnew-J)%*%Y; ssrX1;[,1] [1,] 136366 > X13<-cbind(Xnew,X3) > Xs13<-t(X13)%*%X13; > H13<-X13%*%solve(Xs13)%*%t(X13); > sseX13<-t(Y)%*%(I-H13)%*%Y;sseX13; [,1] [1,] 992204 > sseX13 - sse [,1] [1,] 6674 > sseX1-sseX13 [,1] [1,] 2033566 SSR(X2|X1, X3) = sseX13 - sse = 992204 - 985530= 6674 SSR(X3|X1) = sseX1-sseX13 = 3025770 - 992204= 2033566 Source SS DF MS Regression 2176606 3 725535.4 X1 136366 1 136366 X3|X1 2033566 1 2033566 X2|X1, X3 6674 1 6674 Error 985530 48 20531.87 Total 3162136 51 62002 b. H0: β2 =0 Ha: β2 ≠ 0 The test statistics is : F* = MSR(X2|X1)/MSE If F*>F(1-α, 1,n-p), we conclude Ha, otherwise, we conclude H0. In this problem, F* =6674/20531.87=0.3250 < 4.042652= F(0.95,1,48), So we do not reject H0 that β2 = 0. The P-value is > 1-pf(0.3250,1,48) [1] 0.5712771 c. Yes. > X12<-X[,1:3] > Xs12<-t(X12)%*%X12; > H12<-X12%*%solve(Xs12)%*%t(X12); > sseX12<-t(Y)%*%(I-H12)%*%Y;sseX12;[,1] [1,] 3020044 > Xnew2<-cbind(1,X2) > Xs2<-t(Xnew2)%*%Xnew2; > H2<-Xnew2%*%solve(Xs2)%*%t(Xnew2); > sseX2<-t(Y)%*%(I-H2)%*%Y;sseX2; [,1] [1,] 3150741 > ssrX2<-t(Y)%*%(H2-J)%*%Y; ssrX2; [,1] [1,] 11395 So here, SSR(X2|X1) = SSE (X1) – SSE (X1, X2) = 3025770 – 3020044 = 5726

