UIUC STAT 420 - hw1_sol

AMS 578 Homework 1 solution

Problem 1:

a.
> setwd("C:/spring 2009 TA/hw1");
> file<-"CH01PR19.txt";
> data<-read.table(file,header=FALSE, col.names=c("Y","X"));
> attach(data);
> muy<-mean(Y);
> mux<-mean(X);
> b1<-sum((X-mux)*(Y-muy))/sum((X-mux)^2);
> b0<-muy-b1*mux;
> b1
[1] 0.03882713
> b0
[1] 2.114049
So the least-squares estimates of β0 and β1 are 2.114049 and 0.03882713, respectively. The estimated regression function is: Y = 0.03882713*X + 2.114049

b.
> par(mar=c(4,4,1,1));
> plot(Y~X, pch=20);
> abline(b0,b1,col=2);
> legend(20,1, legend=c("Y= 0.03882713*X +2.114049"), col=2,lty=1,text.col=2,bty="n");

c. E(Y|X=30) = 0.03882713*30 + 2.114049 = 3.278863

d. The point estimate of the change in the mean response per one-unit increase in X is b1 = 0.03882713.

Problem 2:

a.
> resid<-Y-(0.03882713*X +2.114049);
> MSE<-sum(resid^2)/118;
> sb1<-sqrt(MSE/sum((X-mux)^2));
> sb1
[1] 0.01277302
> CI<-c(b1-qt(0.995,118)*sb1,b1+qt(0.995,118)*sb1)
> CI
[1] 0.005385614 0.072268640
So the 99 percent confidence interval for β1 is (0.005385614, 0.072268640), which does not include zero. That means there is a significant linear relationship between the ACT score and the GPA at the end of the freshman year at the 0.01 level.

b. H0: β1 = 0; Ha: β1 ≠ 0. The decision rule is: reject H0 if |t*| ≥ t(1-α/2; n-2). Here α = 0.01 and n = 120, so:
> ts=b1/sb1;
> abs(ts)-qt(0.995,118);
[1] 0.4216398
The difference is positive, so |t*| ≥ t(0.995; 118), and at significance level 0.01 we reject H0.

c.
> pvalue<-2*pt(-abs(ts),118);
> pvalue;
[1] 0.002916604
The P-value is 0.002916604, which is less than 0.01, so it leads to the same decision as in part (b).

Problem 3:

a.
> Yh=b0+b1*28;
> sYh=sqrt(MSE*(1/120+(28-mux)^2/sum((X-mux)^2)));
> YhCI<-c(Yh-qt(0.975,118)*sYh,Yh+qt(0.975,118)*sYh);
> YhCI
[1] 3.061384 3.341033
So the 95 percent confidence interval is [3.061384, 3.341033]: we are 95% confident that the mean freshman GPA of students whose ACT score is 28 lies between 3.061384 and 3.341033.

b.
> Ypred=b0+b1*28;
> sYpred=sqrt(MSE*(1+1/120+(28-mux)^2/sum((X-mux)^2)));
> YpredCI<-c(Ypred-qt(0.975,118)*sYpred,Ypred+qt(0.975,118)*sYpred);
> YpredCI
[1] 1.959355 4.443063
So the 95 percent prediction interval is [1.959355, 4.443063]: we are 95% confident that Mary's GPA will fall between 1.959355 and 4.443063.

c. Yes, the prediction interval in part (b) is wider than the confidence interval in part (a), and yes, it should be: it accounts for the variability of an individual observation in addition to the uncertainty in the estimated mean.

d.
> W<-sqrt(2*qf(0.95,2,118));
> YhCB<-c(Yh-W*sYh,Yh+W*sYh);
> YhCB
[1] 3.026159 3.376258
The 95 percent confidence band is [3.026159, 3.376258] when Xh = 28. It is wider than the confidence interval in part (a).

Problem 4:

a.
> file<-"CH01PR20.txt";
> data<-read.table(file,header=FALSE, col.names=c("Y","X"));
> attach(data);
> muy<-mean(Y);
> mux<-mean(X);
> b1<-sum((X-mux)*(Y-muy))/sum((X-mux)^2);
> b0<-muy-b1*mux;
> c(b0,b1);
[1] -0.5801567 15.0352480
The estimated regression function is: Y = 15.0352480*X - 0.5801567.

b.
> par(mar=c(4,4,1,1));
> plot(Y~X,pch=20);
> abline(b0,b1,col=2);
> legend(4,10, legend=c("Y= 15.0352480*X - 0.5801567"), col=2,lty=1,text.col=2,bty="n");

c. b0 is the value of the regression function at X = 0. But this regression model describes the number of minutes a service person spends on a call as a function of X, the number of copiers serviced, which can only be a positive integer. So b0 does not provide any meaningful information here.

d. E(Y|X=5) = 15.0352480*5 - 0.5801567 = 74.59608

Problem 5:

a. This amounts to estimating β1. From Problem 4, the point estimate of β1 is 15.0352480.
> resid<-Y-(15.0352480*X - 0.5801567);
> MSE<-sum(resid^2)/43;
> sb1<-sqrt(MSE/sum((X-mux)^2));
> CI<-c(b1-qt(0.95,43)*sb1,b1+qt(0.95,43)*sb1);
> CI
[1] 14.22314 15.8473
So the 90 percent confidence interval is [14.22314, 15.8473]: we are 90% confident that the change in the mean service time when the number of copiers serviced increases by one lies between 14.22314 and 15.8473 minutes.

b. H0: β1 = 0; Ha: β1 ≠ 0. The decision rule is: reject H0 if |t*| ≥ t(1-α/2; n-2).
Here α = 0.10 and n = 45, so:
> ts=b1/sb1;
> abs(ts)-qt(0.95,43);
[1] 29.44219
Here |t*| - t(0.95; 43) = 29.44219 > 0, so we reject H0 and conclude that there is a linear association between X and Y.
> pvalue<-2*pt(-abs(ts),43);
> pvalue;
[1] 4.009032e-31
The P-value is 4.009032e-31.

c. Yes. In part (a), the 90 percent confidence interval for β1 is [14.22314, 15.8473], which does not include zero and in fact lies far from it. That means Y does have a linear association with X, which is consistent with the conclusion in part (b).

d. This is a one-sided t test. H0: β1 ≤ 14; Ha: β1 > 14. The decision rule is: reject H0 if t* > t(1-α; n-2). Here α = 0.05 and n = 45, so:
> ts=(b1-14)/sb1;
> ts-qt(0.95,43);
[1] 0.461913
Since t* - t(0.95; 43) = 0.461913 > 0, we reject H0. Rejecting H0 supports Ha: β1 > 14, so, taking the standard to be β1 ≤ 14 as stated in H0, the data indicate that the standard is not being satisfied by Tri-City Company.
> pvalue<-pt(-ts,43);
> pvalue;
[1] 0.01890766
The P-value is 0.01890766.

e. We can test whether a start-up time exists: H0: β0 = 0; Ha: β0 > 0. The test statistic is b0/s{b0}:
> sb0<-sqrt(MSE*(1/45+mux^2/sum((X-mux)^2)));
> ts<-b0/sb0;
The P-value is:
> pvalue<- pt(-ts,43);
> pvalue
[1] 0.5814706
The P-value is 0.5814706, so at significance levels 0.01, 0.05, and 0.1 we cannot reject H0.

Problem 6:

a.
> Yh=b0+b1*6;
> sYh=sqrt(MSE*(1/45+(6-mux)^2/sum((X-mux)^2)));
> YhCI<-c(Yh-qt(0.95,43)*sYh,Yh+qt(0.95,43)*sYh);
> YhCI;
[1] 87.28387 91.97880
So the 90 percent confidence interval is [87.28387, 91.97880]: we are 90% confident that the mean service time on calls in which six copiers are serviced lies between 87.28387 and 91.97880 minutes.

b.
> Ypred=b0+b1*6;
> sYpred=sqrt(MSE*(1+1/45+(6-mux)^2/sum((X-mux)^2)));
> YpredCI<-c(Ypred-qt(0.95,43)*sYpred,Ypred+qt(0.95,43)*sYpred);
> YpredCI;
[1] 74.46413 104.79853
So the 90 percent prediction interval is [74.46413, 104.79853]: we are 90% confident that the service time on a new call in which six copiers are serviced will fall between 74.46413 and 104.79853 minutes.

c.
> YhCCI<-YhCI/6;
> YhCCI;
[1] 14.54731 15.32980
So the converted confidence interval is [14.54731, 15.32980]: we are 90% confident that the expected service time per copier on calls in which six copiers are serviced lies between 14.54731 and 15.32980 minutes.

d.
> W<-sqrt(2*qf(0.90,2,43));
> YhCB<-c(Yh-W*sYh,Yh+W*sYh);
> YhCB;
[1] 86.55263 92.71003
The 90 percent confidence band is [86.55263, 92.71003] when Xh = 6.
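
The hand computations in Problems 1 to 6 can be cross-checked against R's built-in fitting functions. The sketch below is only an illustration under stated assumptions: the X and Y values are a made-up toy data set (the CH01PR19.txt and CH01PR20.txt files are not reproduced here), and the names Sxx, tq, new_obs and the *_lm variables are introduced for this example. It recomputes the slope interval, the interval for the mean response, the prediction interval, and the Working-Hotelling band by the same formulas used above, then obtains the first three from lm(), confint(), and predict().

```r
# Hypothetical toy data, NOT the data files used in the solution above
X <- c(1, 2, 3, 4, 5, 6, 7, 8)
Y <- c(12, 31, 44, 62, 73, 91, 104, 122)
n <- length(X)

# Manual least-squares estimates, using the same formulas as the solution
mux <- mean(X)
muy <- mean(Y)
Sxx <- sum((X - mux)^2)
b1 <- sum((X - mux) * (Y - muy)) / Sxx
b0 <- muy - b1 * mux
MSE <- sum((Y - (b0 + b1 * X))^2) / (n - 2)
sb1 <- sqrt(MSE / Sxx)

# Built-in fit of the same simple linear regression
fit <- lm(Y ~ X)

# 90% confidence interval for the slope: manual vs confint()
tq <- qt(0.95, n - 2)
ci_manual <- c(b1 - tq * sb1, b1 + tq * sb1)
ci_lm <- confint(fit, "X", level = 0.90)

# 90% interval for the mean response and prediction interval at Xh = 6
Xh <- 6
Yh <- b0 + b1 * Xh
sYh <- sqrt(MSE * (1 / n + (Xh - mux)^2 / Sxx))
sYpred <- sqrt(MSE * (1 + 1 / n + (Xh - mux)^2 / Sxx))
ci_mean <- c(Yh - tq * sYh, Yh + tq * sYh)
pi_new <- c(Yh - tq * sYpred, Yh + tq * sYpred)

new_obs <- data.frame(X = Xh)
ci_mean_lm <- predict(fit, new_obs, interval = "confidence", level = 0.90)
pi_new_lm <- predict(fit, new_obs, interval = "prediction", level = 0.90)

# Working-Hotelling 90% confidence band at Xh (no direct predict() option)
W <- sqrt(2 * qf(0.90, 2, n - 2))
band <- c(Yh - W * sYh, Yh + W * sYh)

# Print manual and built-in results side by side
print(rbind(manual = ci_manual, lm = as.numeric(ci_lm)))
print(rbind(manual = ci_mean, lm = ci_mean_lm[1, c("lwr", "upr")]))
print(rbind(manual = pi_new, lm = pi_new_lm[1, c("lwr", "upr")]))
print(band)
```

With the real data read in via read.table() as in the solution, the same calls would be expected to reproduce the intervals reported above, up to the rounding of b0 and b1 used when the residuals were computed by hand.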

