ISU STAT 401 - hw9ans

This preview shows page 1-2-3-4 out of 11 pages.

Stat 401 F / XW HW 9 answers

1) Bat echolocation

a) No answer needed, but the estimates are Intercept: -1.576, X2: 0.102, X3: 0.079, logmass: 0.815

b) 2 pts.

Group        Intercept   Slope
non-e bats   -1.58       0.815
birds        -1.47       0.815
echo bats    -1.50       0.815

c) No answer needed, but the estimates are Intercept: -1.498, X2: 0.024, X4: -0.079, logmass: 0.815

d) 1 pt.

Group        Intercept   Slope
non-e bats   -1.58       0.815
birds        -1.47       0.815
echo bats    -1.50       0.815

1 pt. These are the same values as in b.

Note: Notice that the estimates depend on which model is fit (with X3 or with X4). The intercepts for each group do not. In fact, there are an infinite number of potential sets of estimates. Each set will give exactly the same intercepts. The intercepts are an example of what are technically called estimable functions, as are group means. Their estimates do not depend on how the model is parameterized. Some of the regression coefficients do depend on how the model is parameterized. They are not estimable, which is why SAS labels the estimates with a B when you use class type. You would also get the B if you tried to include all three indicators X2, X3 and X4 in one model. Notice that the slope is estimable in this model, even if you use class type.

e) 2 pts. T = -0.39, p = 0.70. No evidence of a difference in intercepts.

Note: You could have gotten the same test by looking at the coefficient for X3 in the model for part a.

My SAS:

ods html close;
ods html;

data bats;
  infile 'case1002.csv' dsd firstobs=2;
  input Mass Type $ Energy;
  logenergy = log(energy);
  logmass = log(mass);
  if type = 'birds' then x2 = 1; else x2 = 0;
  if type = 'echoloca' then x3 = 1; else x3 = 0;
  if type = 'non-echo' then x4 = 1; else x4 = 0;
run;

proc print;
run;

/* part a and b, could also use proc reg; could also use proc glm with class type */
/* You probably did b by hand.
Can use an estimate statement, which gives se as well */
/* reference group here is non-echolocating bats (x2 = 0 and x3 = 0) */
proc glm;
  model logenergy = logmass x2 x3;
  estimate 'intercept for non-echolocating bats' intercept 1;
  estimate 'intercept for birds' intercept 1 x2 1;
  estimate 'intercept for echolocating bats' intercept 1 x3 1;
run;

/* part c and d, could also use proc reg; could also use proc glm with class type */
/* You probably did d by hand. Can use an estimate statement, which gives se as well */
/* reference group here is echolocating bats (x2 = 0 and x4 = 0) */
proc glm;
  model logenergy = logmass x2 x4;
  estimate 'intercept for non-echolocating bats' intercept 1 x4 1;
  estimate 'intercept for birds' intercept 1 x2 1;
  estimate 'intercept for echolocating bats' intercept 1;
run;

2) Zinc and Copper

2 pts. I don’t see very much difference between the two residual plots. The simpler model (for protein) is preferable.

Notes: The point with the raw residual above 40 looks unusual. If you look at the standardized residuals (in the default reg output), you see that this observation has a standardized residual > 3, which is moderately unusual. 95% of the standardized residuals should be between -2 and 2.

There is a potential gotcha in the SAS code. If you omit the data=minnows on the second proc reg, you don’t get the residuals from the regression of log protein. If you look in the log window, you see a warning note:

NOTE: Variable yhat already exists on file WORK.RESIDS, using yhat2 instead.
NOTE: Variable resid already exists on file WORK.RESIDS, using resid2 instead.

The new residuals are there, but they’re in the variables yhat2 and resid2. yhat and resid still have the old residuals. That’s because when SAS comes to the second proc reg, the last created data set is resids. Specifying data=minnows sends SAS back to the original data set, which does not contain yhat or resid.
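A minimal sketch of one way to sidestep this gotcha: name the input data set on every proc, and write each set of residuals to its own output data set. The names resids_raw and resids_log are made up for illustration; they do not appear in the original code, and this is only one possible arrangement.

```sas
/* Sketch only: resids_raw and resids_log are hypothetical data set names. */
/* Assumes the minnows data set (with copper2, zinc2, copperzinc, lprotein) already exists. */

proc reg data=minnows;
  model protein = copper zinc copper2 zinc2 copperzinc;
  output out=resids_raw r=resid p=yhat;   /* residuals for protein */
run;

proc reg data=minnows;
  model lprotein = copper zinc copper2 zinc2 copperzinc;
  output out=resids_log r=resid p=yhat;   /* residuals for log protein */
run;

/* Each plot names its data set, so it cannot silently pick up the wrong residuals */
proc sgplot data=resids_raw;
  scatter x=yhat y=resid;
run;

proc sgplot data=resids_log;
  scatter x=yhat y=resid;
run;
```

Because both regressions read from minnows and write to different data sets, neither output statement finds an existing yhat or resid, so SAS has no reason to rename them to yhat2 and resid2.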
My SAS code:

data minnows;
  infile 'zinc.txt' firstobs=2;
  input copper zinc protein;
  copper2 = copper*copper;
  zinc2 = zinc*zinc;
  copperzinc = zinc*copper;
  lprotein = log(protein);
run;

proc reg;
  model protein = copper zinc copper2 zinc2 copperzinc;
  output out=resids r=resid p=yhat;
  title 'Model for protein';
run;

proc sgplot;
  scatter x=yhat y=resid;
run;

proc reg data=minnows;
  model lprotein = copper zinc copper2 zinc2 copperzinc;
  output out=resids r=resid p=yhat;
  title 'Model for log protein';
run;

proc print data=resids;
run;

proc sgplot;
  scatter x=yhat y=resid;
run;

3) Kentucky Derby

I don’t see much difference between the time and the speed analyses. The residual plots look about the same. My answers are for time. Those for speed are similar, but the numbers are different. It is clear that a quadratic term for year needs to be in the model. If you look at a plot of time vs year, you see that time decreased steadily from 1896 to ca 1965, then stayed flat. The residual plots are a bit less different. But the one for the quadratic model looks flatter than that for the linear model and the test of quadratic coefficient = 0 has a very small p-value.

2 pts. For Y = time: F = 24.56, p < 0.0001, very strong evidence that at least one track condition differs from the others.

2 pts. Table or a plot of means with standard errors. I show both, but only one needed.

Track    Time LSMEAN   Standard Error   Pr > |t|
Dusty    122.563412    1.326433         <.0001
Fast     123.563085    0.140256         <.0001
Good     124.957869    0.429075         <.0001
Heavy    128.755375    0.558222         <.0001
Muddy    127.626959    0.523691         <.0001
Sloppy   125.829196    0.640155         <.0001
Slow     126.198558    0.578798         <.0001

2 pts. For speed, F = 23.33, p < 0.0001, very strong evidence that at least one track condition differs from the others.

2 pts. Again, either a table or plot of means needed, but not both.
My SAS code: (or half of it; omitting the analyses where time is replaced by speed)

data derby;
  infile 'ex0920.csv' dsd firstobs=2;
  input Year Winner $ Starters NetToWinner Time Speed Track $ Conditions $;
  year2 = year*year;
run;

proc glm;
  class track;
  model time = year track;
  output out=resids r=resid p=yhat;
  title 'Time, linear year';
run;

proc sgplot;
  scatter x=yhat y=resid;
run;

proc glm data=derby;
  class track;
  model time = year year2 track;
  output out=resids r=resid p=yhat stdp=semean;
  lsmeans track / stderr;
  ods output lsmeans = lsmeans;
  title 'Time, quadratic year';
run;

proc sgplot data=resids;
  scatter x=yhat y=resid;
run;

/* lsmean minus and plus one standard error */
data lsmeans2;
  set lsmeans;
  meanminus = lsmean - stderr;
  meanplus = lsmean + stderr;
run;

proc sgplot;
  scatter x = track y=lsmean

