ISU STAT 511 - Homework # 6 - D1732564

School: Iowa State University

Course: Stat 511 - Statistical Methods

Stat 511 HW#6 Spring 2003

1. The data set sealstrength.txt located on the course "data sets" page (follow the link from the main course Web page) is a classic one taken from "Sealing Strength of Wax-Polyethylene Blends" by Brown, Turner and Smith (Tappi, 1958). Given are values of

   (Coded) Seal Temperature:          $x_1 = (t_1 - 225)/30$, where $t_1$ is in °F
   (Coded) Cooling Bar Temperature:   $x_2 = (t_2 - 55)/9$, where $t_2$ is in °F
   (Coded) Polyethylene Content:      $x_3 = (c - 1.1)/.6$, where $c$ is in %
   Bread Wrapper Seal Strength:       $y$, in g/in.

from an experiment run to find good (large $y$) settings for the process variables $x$. A "standard" "response surface" analysis of these data is based on a multivariate quadratic regression. Use R and appropriate matrix calculations to do the following.

a) Fit the (linear in the parameters and quadratic in the predictors) model

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2 + \beta_6 x_{3i}^2 + \beta_7 x_{1i}x_{2i} + \beta_8 x_{1i}x_{3i} + \beta_9 x_{2i}x_{3i} + \varepsilon_i$$

to these data. Then compute and normal-plot standardized residuals.

b) In the model from a), test $H_0: \beta_4 = \beta_5 = \cdots = \beta_9 = 0$. Report a p-value. Does quadratic curvature in response (as a function of the x's) appear to be statistically detectable? (If the null hypothesis is true, the response is "planar" as a function of the x's.)

c) Some multivariate calculus on the fitted quadratic equation can be used to establish that it has an absolute maximum at about the set of conditions

$$x_1 = -1.01, \quad x_2 = .26, \quad \text{and} \quad x_3 = .68$$

Use R matrix calculations to find 90% two-sided confidence limits for the mean response here. Then find 90% two-sided prediction limits for a new response from this set of conditions.

2. (Testing "Lack of Fit" … See Section 6.6 of Christensen) Suppose that in the usual linear model

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

$\mathbf{X}$ is of full rank ($k$). Suppose further that there are $m < n$ distinct rows in $\mathbf{X}$ and that $m > k$. One can then make up a "cell means" model for $\mathbf{Y}$ (where observations having the same corresponding row in $\mathbf{X}$ are given the same mean response), say

$$\mathbf{Y} = \mathbf{X}^{*}\boldsymbol{\mu} + \boldsymbol{\epsilon}$$

This model puts no restrictions on the means of the observations except that those with identical corresponding rows of $\mathbf{X}$ are equal. It is the case that $C(\mathbf{X}) \subset C(\mathbf{X}^{*})$ and it thus makes sense to test the hypothesis $H_0: \mathrm{E}\mathbf{Y} \in C(\mathbf{X})$ in the cell means model. This can be done using

$$F = \frac{\mathbf{Y}'\left(\mathbf{P}_{\mathbf{X}^{*}} - \mathbf{P}_{\mathbf{X}}\right)\mathbf{Y}/(m-k)}{\mathbf{Y}'\left(\mathbf{I} - \mathbf{P}_{\mathbf{X}^{*}}\right)\mathbf{Y}/(n-m)}$$

and this is usually known as testing for "lack of fit."

Use R and matrix calculations to find a p-value for testing lack of fit to the quadratic regression equation in problem 1.

3. (Adapted from Koehler's Spring 2002 HW7) In a study to examine the effects of $I = 4$ drugs on dogs under $J = 3$ disease conditions, increases in systolic blood pressure ($y$, in mm Hg) were observed after drug treatment for several dogs with experimentally induced cases of the diseases. The measured increases from Kutner (1974) are as below.

           Disease 1                 Disease 2                  Disease 3
Drug 1     42, 44, 36, 13, 19, 22    33, 26, 33, 21             31, -3, 25, 25, 24
Drug 2     28, 23, 24, 42, 13        34, 33, 31, 36             3, 26, 28, 32, 3, 16
Drug 3     1, 29, 19                 11, 9, 7, 1, -6            21, 1, 9, 3
Drug 4     24, 9, 22, -2, 15         27, 12, 12, -5, 16, 15     22, 7, 25, 5, 12

a) Create three vectors in R of length $n = 58$. The first should contain $y$ values, the second Drug ID Numbers (1-4), and the third Disease ID Numbers (1-3). Call these vectors respectively "y", "drug" and "disease". Then create and print out an R data frame using the commands

    > d<-data.frame(y,drug,disease)
    > d

b) Turn the numerical variables drug and disease into variables that R will recognize as levels of factors by issuing the commands

    > d$drug<-as.factor(d$drug)
    > d$disease<-as.factor(d$disease)

Then compute and print out the cell means by typing

    > means<-tapply(d$y,list(d$drug,d$disease),mean)
    > means

You may find out more about the function tapply by typing

    > ?tapply

c) Make a crude interaction plot by doing the following. First type

    > x.axis<-unique(d$drug)

to set up horizontal plotting positions for the sample means. Then make a "matrix plot" with lines connecting points by issuing the commands

    > matplot(c(1,4),c(-10,50),type="n",xlab="Drug",ylab="Mean Response",main="Change in Systolic Blood pressure")
    > matlines(x.axis,means,type='b',lty=c(1,3,7))

The first of these commands sets up the axes and makes a dummy plot with invisible points "plotted" at (1,-10) and (4,50). The second puts the lines and identifying disease numbers (as plotting symbols) on the plot.

d) Set the default for the restriction used to create a full rank model matrix, run the linear models routine and find both (sensible) sets of "Type I" sums of squares by issuing the following commands

    > options(contrasts=c("contr.sum","contr.sum"))
    > lm.out1<-lm(y~drug*disease,data=d)
    > summary.aov(lm.out1,ssType=1)
    > lm.out2<-lm(y~disease*drug,data=d)
    > summary.aov(lm.out2,ssType=1)

Then compute "Type III" sums of squares by issuing the command

    > summary.aov(lm.out1,ssType=3)

This is the question as assigned. As discussed via e-mail, it appears that R will not compute Type III sums of squares and one must use John Fox's unsupported "car" package to get this done in R. Splus DOES produce the Type III sums of squares if the above command is used.

(As Prof. Koehler points out about this data set, we have ignored a potentially important aspect of the original real problem here. There were actually originally 6 dogs assigned at random to each of the 12 treatment combinations. We have tacitly assumed that the data that are missing are "missing at random", i.e. that the "missingness" provides no information about the effects of the treatments. If that tacit assumption is wrong, none of what is done above is anything but a numerical exercise … it provides no serious scientific insight. For example, you might consider how differently you might think about the medical problem if you believed that in fact all missing data correspond to dead dogs, and deaths were fundamentally due to huge blood pressure increases that are not captured by the given values.)

4. Below is a small table of fake 2-way factorial data. Enter them into R in three vectors of length $n = 12$, much as was done in problem 3. Call these vectors "Y", "A", and "B".

               Level 1 of B    Level 2 of B    Level 3 of B
Level 1 of A   12              14              10, 12
Level 2 of A   9               11, 12          6, 7
Level 3 of A   10              11              7

a) Repeat parts a)-d) of Problem 3 on these data.

b) Create $12 \times 9$ full rank model matrices for both a cell means model and an effects model with the sum restriction for these data. Using R matrix calculations and the 2nd of these, compute "Type I" sums of squares corresponding to the linear model fit by

    > lm.out1<-lm(Y~A*B,data=d)

Then use the first of these model matrices and appropriate matrices $\mathbf{C}$ and compute sums of squares for
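The matrix calculations requested in Problem 1 might be organized roughly as follows. This is only a sketch under stated assumptions: the predictors and responses are simulated stand-ins (the contents of sealstrength.txt are not reproduced here), and names such as `quad.row`, `XtXinv` and `pred.lim` are illustrative choices rather than anything prescribed by the assignment.

```r
# Illustrative sketch: x1, x2, x3 and y are SIMULATED stand-ins,
# not the values in sealstrength.txt.
set.seed(511)
n  <- 30
x1 <- runif(n, -2, 2); x2 <- runif(n, -2, 2); x3 <- runif(n, -2, 2)
y  <- 10 + x1 - x2 + 0.5 * x3 - x1^2 - x2^2 - x3^2 + rnorm(n)

# Hypothetical helper: intercept, linear, pure quadratic and cross-product columns
quad.row <- function(u, v, w) cbind(1, u, v, w, u^2, v^2, w^2, u * v, u * w, v * w)
X  <- quad.row(x1, x2, x3)          # n x 10 full quadratic model matrix
X0 <- X[, 1:4]                      # reduced ("planar") model under H0

XtXinv <- solve(t(X) %*% X)
b      <- XtXinv %*% t(X) %*% y     # least squares estimates
H      <- X %*% XtXinv %*% t(X)     # projection ("hat") matrix onto C(X)
e      <- y - X %*% b               # residuals
df     <- n - ncol(X)
mse    <- sum(e^2) / df

# a) standardized residuals e_i / sqrt(MSE * (1 - h_ii)), then a normal plot
std.res <- as.vector(e) / sqrt(mse * (1 - diag(H)))
qqnorm(std.res); qqline(std.res)

# b) F test of H0: beta4 = ... = beta9 = 0 (full versus reduced model)
H0mat <- X0 %*% solve(t(X0) %*% X0) %*% t(X0)
q     <- ncol(X) - ncol(X0)
Fstat <- as.numeric(t(y) %*% (H - H0mat) %*% y / q) / mse
p.val <- 1 - pf(Fstat, q, df)

# c) 90% two-sided limits at x1 = -1.01, x2 = .26, x3 = .68
x0   <- t(quad.row(-1.01, 0.26, 0.68))               # 10 x 1 column
fit0 <- as.numeric(t(x0) %*% b)
v0   <- as.numeric(t(x0) %*% XtXinv %*% x0)
tq   <- qt(0.95, df)
ci       <- fit0 + c(-1, 1) * tq * sqrt(mse * v0)        # mean response
pred.lim <- fit0 + c(-1, 1) * tq * sqrt(mse * (1 + v0))  # new response
```

With the real data, `x1`, `x2`, `x3` and `y` would instead be read from the file; the remaining lines should carry over unchanged, and `lm()` can serve as an independent check on `b` and `std.res`.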
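The lack-of-fit statistic in Problem 2 can be computed directly from the two projection matrices. The sketch below assumes a small simulated data set with replicated design points and a straight-line model (so k = 2), purely to keep the example short; the helper `proj` and the way `Xstar` is built from the distinct rows of `X` are illustrative assumptions, not part of the assignment.

```r
# Illustrative sketch: SIMULATED data with replicated design points,
# not the seal strength data.
set.seed(6)
x <- rep(c(-1, 0, 1, 2), each = 3)        # m = 4 distinct rows, n = 12
y <- 2 + x - 0.5 * x^2 + rnorm(length(x))
X <- cbind(1, x)                          # straight-line model, k = 2

# Hypothetical helper: projection matrix onto the column space of M
proj <- function(M) M %*% solve(t(M) %*% M) %*% t(M)

# Cell means model matrix: one indicator column per distinct row of X
cell  <- factor(apply(X, 1, paste, collapse = ","))
Xstar <- model.matrix(~ cell - 1)
n <- nrow(X); k <- ncol(X); m <- ncol(Xstar)

PX  <- proj(X)
PXs <- proj(Xstar)

# F = [Y'(P_X* - P_X)Y / (m - k)] / [Y'(I - P_X*)Y / (n - m)]
num   <- as.numeric(t(y) %*% (PXs - PX) %*% y) / (m - k)
den   <- as.numeric(t(y) %*% (diag(n) - PXs) %*% y) / (n - m)
Fstat <- num / den
p.val <- 1 - pf(Fstat, m - k, n - m)
```

For the quadratic regression of Problem 1, `X` would be the 10-column quadratic model matrix and `Xstar` would be built the same way from its distinct rows, provided the design actually contains replicated rows.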
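For Problem 4 b), the "Type I" sums of squares can be read off as differences of projections onto nested sets of columns of the sum-restriction model matrix. The sketch below enters the data from the Problem 4 table; the helper names `proj` and `quad`, and the use of `model.matrix()` with `contrasts.arg` to build the 12 × 9 matrix, are one possible approach rather than the required one.

```r
# Data from the Problem 4 table, entered row by row (A level, then B level)
Y <- c(12, 14, 10, 12,   9, 11, 12, 6, 7,   10, 11, 7)
A <- c( 1,  1,  1,  1,   2,  2,  2, 2, 2,    3,  3, 3)
B <- c( 1,  2,  3,  3,   1,  2,  2, 3, 3,    1,  2, 3)
d <- data.frame(Y, A = as.factor(A), B = as.factor(B))
means <- tapply(d$Y, list(d$A, d$B), mean)   # 3 x 3 table of cell means

# 12 x 9 effects model matrix under the sum restriction; columns are
# (Intercept), A1, A2, B1, B2, then the four interaction columns
X <- model.matrix(~ A * B, data = d,
                  contrasts.arg = list(A = "contr.sum", B = "contr.sum"))

# Hypothetical helpers: projection onto C(M), and the quadratic form Y'PY
proj <- function(M) M %*% solve(t(M) %*% M) %*% t(M)
quad <- function(P) as.numeric(t(Y) %*% P %*% Y)

# Projections for the nested models 1, 1 + A, 1 + A + B, 1 + A + B + A:B
P1  <- proj(X[, 1, drop = FALSE])
PA  <- proj(X[, 1:3])
PAB <- proj(X[, 1:5])
PF  <- proj(X)

# Sequential ("Type I") sums of squares for the order A, B, A:B
ss <- c(A     = quad(PA - P1),
        B     = quad(PAB - PA),
        "A:B" = quad(PF - PAB),
        Error = quad(diag(length(Y)) - PF))
ss
```

Reordering the column blocks (B before A) gives the second "sensible" set of Type I sums of squares, matching what `lm(Y~B*A,data=d)` would report.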
