STATS 210 SAS LAB FIVE, July 19, 2004

Lab Five: Linear regression; longitudinal data

Lab Objectives

After today’s lab you should be able to:
1. Use PROC GLM to generate least squares means and differences with confidence limits, and to make pairwise comparisons adjusted for multiple comparisons.
2. Use PROC GLM to run analysis of covariance.
3. Interpret output from PROC GLM.
4. Use PROC REG to run multiple linear regression.
5. Interpret output from PROC REG.
6. Output diagnostics (predicted values, residuals) from linear regression into a new dataset.
7. Use PROC GPLOT to create simple scatter plots and diagnostic plots for linear regression.
8. Begin to produce enhanced graphs using PROC GPLOT.

Recommended reading in Walker: Chapters 10-11

LAB EXERCISE STEPS: Follow along with the computer in front…

1. Double-click to open the SAS editor file “data creation code”, which should be saved in your stats210 folder from last week; run the libname statement:

libname stats210 'C:\Documents and Settings\mitl-pc.LANE-LIB\My Documents\Stats210';

2. Using the Explorer Browser on the left-hand side of your screen, double-check that a stats210 library has been properly created and that it contains the SAS dataset “runners”.

3. Try ANOVA for the outcome variable sumedi1 (though the outcome is not perfectly normally distributed…), and examine the output.

proc anova data=stats210.runners;
class mencat;
model sumedi1=mencat;
run;

4. To figure out which groups are different after adjusting the p-values post hoc for having done 3 pairwise comparisons (using a Scheffe adjustment):

proc glm data=stats210.runners;
class mencat;
model sumedi1=mencat;
lsmeans mencat/pdiff adjust=scheffe cl;
run;

Notes on the code:
PROC GLM = “General linear model”: more powerful than ANOVA; it does ANOVA “plus”. We are actually fitting a linear regression model, “model sumedi1=mencat”, with sumedi1 as the outcome and mencat as the categorical predictor.
The lsmeans options pdiff and adjust=scheffe mean “automatically adjust my p-values for all pairwise comparisons” using a Scheffe adjustment.

Generates the same ANOVA table as before, plus the following:

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Scheffe

            sumedi1        LSMEAN
mencat      LSMEAN         Number
1           16.0000000     1
2           20.6923077     2
3            7.3000000     3

(lsmeans) Mean sumedi1 score for each group.

Least Squares Means for effect mencat
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: sumedi1

i/j     1         2         3
1                 0.7912    0.4007
2       0.7912              0.0064
3       0.4007    0.0064

(pdiff) After adjusting for multiple comparisons, only groups 2 and 3 (oligomenorrheic and eumenorrheic) are significantly different at the p<.05 level.

            sumedi1
mencat      LSMEAN        95% Confidence Limits
1           16.000000      3.948831    28.051169
2           20.692308     14.007522    27.377094
3            7.300000      2.899535    11.700465

(cl) 95% confidence intervals for the mean sumedi1 scores. Note that the confidence interval for group 1 is wide because it is a very small group; use interactive data analysis → distribution → tables → frequency counts to find that n=4.

              Difference      Simultaneous 95%
              Between         Confidence Limits for
i     j       Means           LSMean(i)-LSMean(j)
1     2       -4.692308       -22.016236    12.631620
1     3        8.700000        -7.427699    24.827699
2     3       13.392308         3.331670    23.452945

95% confidence intervals for the difference in means between group i and group j. Note that the interval for 2 vs. 3 does not cross 0.

5. Run the following code; note the change in the p-values of the differences if we hadn’t adjusted for multiple comparisons; also note that SAS gives you a warning!

proc glm data=stats210.runners;
class mencat;
model sumedi1=mencat;
lsmeans mencat/pdiff cl;
run;

6. Controlling for confounders (ANCOVA)

Sometimes, you want to control for confounders. This requires ANCOVA (analysis of covariance). We’ll return to this when we talk about regression. For now, use PROC GLM again and add confounders to your model.

proc glm data=stats210.runners;
class mencat;
model neck1=mencat pounds1 age;
lsmeans mencat/pdiff adjust=tukey;
run;

Notes on the code:
Adding pounds1 and age to the model statement means “correct for group differences in age and weight”.
adjust=tukey means “use a Tukey’s adjustment for multiple comparisons”.

GIVES:

The GLM Procedure
Dependent Variable: neck1 neck1

                          Sum of
Source              DF    Squares       Mean Square    F Value    Pr > F
Model                4    0.25314374    0.06328593     5.28       0.0016
Error               42    0.50370337    0.01199294
Corrected Total     46    0.75684711

Model df = 5-1: the model has 5 regression coefficients (the intercept, age, weight, and two indicator terms for the three menstrual groups: amenorrheic, oligomenorrheic, eumenorrheic), and the Model df is the number of coefficients minus 1.

This is the overall ANOVA table. It says that at least some of the predictors in the model significantly explain differences (variation) in neck bone density (p=.0016).

R-Square    Coeff Var    Root MSE    neck1 Mean
0.334471    11.90846     0.109512    0.919617

$\text{Root MSE} = \sqrt{\text{MSE}} = \sqrt{0.0119929} = 0.1095$
Estimated standard deviation of neck bone density (average variability within groups).

$\text{Coeff Var} = \frac{\text{standard deviation}}{\text{mean}} \times 100 = \frac{0.109512}{0.919617} \times 100 = 11.9$

$R^2 = \frac{SS_{\text{Model}}}{TSS} = \frac{.25}{.75} = .33$
An R-square of .33 means that 33% of the total variance in neck bone density is explained by age, weight, and menstrual group.

Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

            neck1         LSMEAN
mencat      LSMEAN        Number
1           0.86742642    1
2           0.89075083    2
3           0.93908445    3

Least Squares Means for effect mencat
Pr > |t| for H0:
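
The fit statistics in the ANCOVA output for neck1 can be re-derived from the sums of squares and degrees of freedom in the overall ANOVA table. The DATA step below is a minimal sketch of that arithmetic, not part of the original lab: the variable names (ss_model, ss_error, and so on) are made up for illustration, and the input numbers are copied by hand from the output above. It writes the recomputed F value, p-value, R-square, Root MSE, and Coeff Var to the SAS log so they can be compared with what PROC GLM printed.

```sas
/* Sketch only: re-derive the PROC GLM fit statistics by hand.      */
/* All variable names here are hypothetical; the numeric inputs are */
/* copied from the ANOVA table for neck1 shown above.               */
data _null_;
  ss_model = 0.25314374;   /* Model sum of squares            */
  ss_error = 0.50370337;   /* Error sum of squares            */
  df_model = 4;            /* Model degrees of freedom        */
  df_error = 42;           /* Error degrees of freedom        */
  y_mean   = 0.919617;     /* neck1 Mean                      */

  ss_total  = ss_model + ss_error;        /* Corrected Total SS         */
  ms_model  = ss_model / df_model;        /* Model mean square          */
  mse       = ss_error / df_error;        /* Error mean square (MSE)    */
  f_value   = ms_model / mse;             /* overall F statistic        */
  p_value   = 1 - probf(f_value, df_model, df_error); /* upper-tail F p */
  r_square  = ss_model / ss_total;        /* share of variance explained */
  root_mse  = sqrt(mse);                  /* estimated within-group SD  */
  coeff_var = 100 * root_mse / y_mean;    /* SD as a percent of the mean */

  /* Write the results to the log for comparison with the output. */
  put f_value= 8.2 p_value= 8.4 r_square= 8.6 root_mse= 8.6 coeff_var= 8.5;
run;
```

Submitting this step and checking the log against the R-Square, Coeff Var, and Root MSE line of the output is a quick way to confirm how each statistic is built from the ANOVA table.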



Stanford STATS 210 - Lecture Notes
