STAT 333 Discussion 6

Problem 3 in homework 4

1. p-value is uniformly distributed on [0,1] under null hypothesis. (Why?)
2. Distribution of p-value in (a) and (b).

[Figure: histogram, "p-value in (a) - 1000 runs"; x-axis: p, y-axis: Frequency]
[Figure: histogram, "p-value in (b) - 1000 runs"; x-axis: p, y-axis: Frequency]

Practice Problems: one-way ANOVA

A researcher is studying the effectiveness of three methods of reducing smoking. He wants to determine whether the mean reduction in the number of cigarettes smoked daily differs from one method to another among male patients. 12 patients who smoked about 60 cigarettes per day before treatment are randomly assigned to use one of the methods, 4 for each method. The reductions in the number of cigarettes smoked daily are:

      Method
   A    B    C
  10   19   11
   9   20   13
   9   21   15
   8   20   13

(a) Consider the following 3 columns of data:

  X1   X2    Y
   1    0   10
   1    0    9
   1    0    9
   1    0    8
   0    1   19
   0    1   20
   0    1   21
   0    1   20
   0    0   11
   0    0   13
   0    0   15
   0    0   13

Relate the output to the scientific problem and interpret the coefficients (b0, b1, b2) in a meaningful way.

(b) Fit the regression of Y on X1 and X2.

> X1=c(1,1,1,1,0,0,0,0,0,0,0,0)
> X2=c(0,0,0,0,1,1,1,1,0,0,0,0)
> Y =c(10,9,9,8,19,20,21,20,11,13,15,13)
> out=lm(Y~X1+X2)
> summary(out)

Call:
lm(formula = Y ~ X1 + X2)

Residuals:
  Min    1Q Median    3Q   Max
-2.00 -0.25   0.00  0.25  2.00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.0000     0.5774  22.517 3.18e-09
X1           -4.0000     0.8165  -4.899 0.000849
X2            7.0000     0.8165   8.573 1.27e-05

Residual standard error: 1.155 on 9 degrees of freedom
Multiple R-squared: 0.9538, Adjusted R-squared: 0.9436
F-statistic: 93 on 2 and 9 DF, p-value: 9.748e-07

(c) Fit the model in R by using a factor (group) variable. Compare the output with (b) and interpret.

> method=c("A","A","A","A","B","B","B","B","C","C","C","C")
> method=factor(method)
> reduction=c(10,9,9,8,19,20,21,20,11,13,15,13)
> out2=lm(reduction~method)
> summary(out2)

Call:
lm(formula = reduction ~ method)

Residuals:
  Min    1Q Median    3Q   Max
-2.00 -0.25   0.00  0.25  2.00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   9.0000     0.5774  15.588 8.07e-08
methodB      11.0000     0.8165  13.472 2.86e-07
methodC       4.0000     0.8165   4.899 0.000849

Residual standard error: 1.155 on 9 degrees of freedom
Multiple R-squared: 0.9538, Adjusted R-squared: 0.9436
F-statistic: 93 on 2 and 9 DF, p-value: 9.748e-07

> model.matrix(out2) # the default model matrix generated by R.
   (Intercept) methodB methodC
1            1       0       0
2            1       0       0
3            1       0       0
4            1       0       0
5            1       1       0
6            1       1       0
7            1       1       0
8            1       1       0
9            1       0       1
10           1       0       1
11           1       0       1
12           1       0       1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$method
[1] "contr.treatment"

(d) Use the model in (b) to calculate a 95% confidence interval for the mean reduction for method A.

> predict(out, newdata=data.frame(X1=1, X2=0),interval="confidence")
  fit      lwr      upr
1   9 7.693943 10.30606

(e) Use the model in (c) to calculate a 95% confidence interval for the mean reduction for method A.

> confint(out2)
               2.5 %    97.5 %
(Intercept) 7.693943 10.306057
methodB     9.152956 12.847044
methodC     2.152956  5.847044

(f) Compare the ANOVA table of models in (b) and (c).

> anova(out) # matrix approach
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value    Pr(>F)
X1         1    150 150.000   112.5 2.187e-06
X2         1     98  98.000    73.5 1.268e-05
Residuals  9     12   1.333

> anova(out2) # factor variable approach
Analysis of Variance Table

Response: reduction
          Df Sum Sq Mean Sq F value    Pr(>F)
method     2    248 124.000      93 9.748e-07
Residuals  9     12   1.333

In addition, we can look at the summary of out:

> summary(out)

Call:
lm(formula = Y ~ X1 + X2)

Residuals:
  Min    1Q Median    3Q   Max
-2.00 -0.25   0.00  0.25  2.00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.0000     0.5774  22.517 3.18e-09
X1           -4.0000     0.8165  -4.899 0.000849
X2            7.0000     0.8165   8.573 1.27e-05

Residual standard error: 1.155 on 9 degrees of freedom
Multiple R-squared: 0.9538, Adjusted R-squared: 0.9436
F-statistic: 93 on 2 and 9 DF, p-value:
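
The fits in (b) and (c) describe the same three group means under different baselines (method C in (b), method A in (c)), and the two sequential sums of squares in (f) add up to the single method sum of squares. A minimal sketch of that check, reusing the data above; the names fit_b, fit_c, means_from_b, means_from_c, ss_b and ss_c are introduced here for illustration only and do not appear in the handout.

```r
# Data from the handout: 4 patients per method.
method    <- factor(rep(c("A", "B", "C"), each = 4))
reduction <- c(10, 9, 9, 8, 19, 20, 21, 20, 11, 13, 15, 13)

# Indicator coding as in (b): X1 marks method A, X2 marks method B,
# so method C is the baseline.
X1 <- as.numeric(method == "A")
X2 <- as.numeric(method == "B")

fit_b <- lm(reduction ~ X1 + X2)  # baseline: method C
fit_c <- lm(reduction ~ method)   # baseline: method A (default treatment contrasts)

# Sample mean of each group, for comparison.
grp_means <- tapply(reduction, method, mean)

b <- unname(coef(fit_b))  # (Intercept), X1, X2
g <- unname(coef(fit_c))  # (Intercept), methodB, methodC

# Rebuild the group means from each parametrization.
means_from_b <- c(A = b[1] + b[2], B = b[1] + b[3], C = b[1])
means_from_c <- c(A = g[1], B = g[1] + g[2], C = g[1] + g[3])

print(grp_means)
print(means_from_b)
print(means_from_c)

# Sequential sums of squares for X1 and X2 add up to the method sum of squares.
ss_b <- anova(fit_b)[["Sum Sq"]]  # X1, X2, Residuals
ss_c <- anova(fit_c)[["Sum Sq"]]  # method, Residuals
print(c(X1_plus_X2 = sum(ss_b[1:2]), method = ss_c[1]))
```

All three print-outs of the means should agree. As a further check under the same assumptions, refitting with relevel(method, ref = "C") makes C the baseline of the factor model, which should reproduce the coefficients of (b) up to their names.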