Statistical Methods and Computing, 22S:30/105
Instructor: Cowles
Lab 2, Feb. 2, 2011

1 Downloading files and accessing SAS

We will be using the billion.dat dataset again today, as well as the OECD dataset on health care expenses. Read the file OECD.info to learn about the OECD dataset. Do this by left-clicking on the filename and then selecting "Open with" and "Wordpad." Also look at the OECD.data dataset. Then call up SAS.

2 Sorting, scatterplots, correlation, and regression

In the following SAS code, lines that begin with an asterisk are comments and do not need to be typed.

    * Setting the number of characters in output lines and pages ;
    options linesize = 79 pagesize = 60 ;

    * Reading the billionaire dataset into SAS ;
    * Use this version if you are running SAS on the computer you are on ;
    data billion ;
    infile 'c:\temp\billion.dat' ;
    input wlth age region $ ;
    run ;

    * Use this version if you are running SAS on the Virtual Desktop ;
    data billion ;
    input wlth age region $ ;
    datalines ;
    paste data in here
    ;
    run ;

    * Sorting a dataset ;
    * Note: If we want to produce separate output for different subsets
      of a dataset, we must first sort the dataset by the variable that
      defines those subsets ;
    proc sort data = billion ;
    by region ;
    run ;

    * Producing separate analysis for each region ;
    * Note: In addition to a complete univariate analysis within each
      region, this procedure produces side-by-side boxplots of wealth
      by region ;
    proc univariate plot data = billion ;
    var wlth ;
    by region ;
    run ;

    * Producing a scatterplot ;
    * Note: the following code plots wlth on the y axis and age on the
      x axis ;
    proc plot data = billion ;
    plot wlth * age ;
    run ;

    * Reading the OECD dataset into SAS ;
    * Note: the $13. in the input statement tells SAS the number of
      characters in the longest country name. Without this information
      SAS would truncate the country names to 8 letters each ;
    data OECD ;
    input country $13. pcgdp pch beds los docs infmort ;
    datalines ;
    paste data here
    ;
    run ;

    * Better text scatter plots ;
    proc plot data = OECD ;
    plot pch * pcgdp / vpos = 20 hpos = 40 ;
    run ;

    * Correlation ;
    proc corr data = OECD ;
    var pcgdp pch ;
    run ;

    * Regression ;
    proc reg data = OECD ;
    model pch = pcgdp ;   * model resp vbl = explanatory vbl ;
    id country ;          * identifies observations in list of
                            predicted values and residuals ;
    run ;

    * Predicted values and residuals ;
    * Note: the p option on the model statement gets list of predicted
      values and residuals ;
    proc reg data = OECD ;
    model pch = pcgdp / p ;
    id country ;
    run ;

    * Scatterplots and Residual plots ;
    * Note: the lp option on the proc reg statement makes any plots
      become text plots that appear in the output window. Without this
      option, you get prettier plots that are harder to print ;
    proc reg data = OECD lp ;
    model pch = pcgdp / p ;
    plot pch * pcgdp / symbol = '*' hplots = 2 vplots = 2 ;
    run ;
    plot residual. * predicted. / symbol = '*' hplots = 2 vplots = 2 ;
    run ;

3 Analyst for regression

Use the following steps to get into Analyst from the menu:

    Solutions -> Analysis -> Analyst

You must specify which dataset you wish to use. Do so by clicking:

    File -> Open by SAS name -> Work library (double click) -> OECD (double click)

To create a scatterplot, choose

    Graphs -> Scatterplot

Use the interactive window to specify the explanatory variable on the X axis and the response variable on the Y axis.

To do regression analysis, choose

    Statistics -> Regression -> Simple

Again, interactively specify the explanatory and response variables. Other choices in the window can be used to request predicted values and specific plots.

4 Insight for regression

Insight is another point-and-click facility built into SAS. We will be using its graphical features later on when we study multiple regression. In case you want to try it now, here are some instructions.

From the main pull-down window, select the following sequence of choices:

    Solutions -> Analysis -> Interactive data analysis

In the window that appears, you must specify which dataset you wish to use. Do so by clicking:

    Library: Work
    Dataset: OECD
    Open

To do regression in Insight, choose

    Analyze -> Fit

To identify the response variable, use your mouse to click PCH and then Y. Similarly, copy PCGDP into the X column. Click OK, and lots of regression output and plots will appear.

To get out of Insight and back into command mode, click in the window showing the data in spreadsheet form. Then pull down the File menu and choose End.

5 Remember to exit from SAS and log out of your hawkid

Output

The UNIVARIATE Procedure, Schematic Plots
[Side-by-side boxplots of wlth by region; vertical axis from 0 to 40; regions A, E, M, O, and U]

Plot of wlth*age. Legend: A = 1 obs, B = 2 obs, etc.
[Text scatterplot; wlth on the vertical axis from 0.0 to 38.4, age on the horizontal axis from 0 to 120]
NOTE: 8 obs had missing values.

Plot of PCH*PCGDP. Symbol used is '*'.
[Text scatterplot; PCH on the vertical axis from 0 to 4000, PCGDP on the horizontal axis from 0 to 40000]
NOTE: 4 obs hidden.

The CORR Procedure

2 Variables: pcgdp pch

Simple Statistics

Variable    N     Mean     Std Dev      Sum     Minimum    Maximum
pcgdp      29    20395        6871   591441        6720      34536
pch        29     1509   760.95177    43758   232.00000       3898

Pearson Correlation Coefficients, N = 29
Prob > |r| under H0: Rho=0

            pcgdp       pch
pcgdp     1.00000   0.87420
                     <.0001
pch       0.87420   1.00000
           <.0001

Model: MODEL1
Dependent Variable: PCH

Analysis of Variance

Source     DF   Sum of Squares    Mean Square   F Value   Prob>F
Model       1     12390694.827   12390694.827    87.518   0.0001
Error      27     3822637.8631   141579.18012
C Total    28      16213332.69

Root MSE     376.27009    R-square   0.7642
Dep Mean    1508.89655    Adj R-sq   0.7555
C.V.          24.93677

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|
INTERCEP     1          -465.663682     222.33243595                  -2.094       0.0457
PCGDP        1             0.096818       0.01034925                   9.355       0.0001

Model: MODEL2
Dependent Variable: PCH

MODEL2 prints the same Analysis of Variance and Parameter Estimates as MODEL1, followed by a listing for Obs 1 through 29 …
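As a cross-check on the regression output above, the summary statistics can be recomputed by hand from the sums of squares. What follows is only a worked sketch using the printed values and the standard simple-regression identities. The symbols SSM, SSE, SST, MSM, and MSE are shorthand introduced here for the Model, Error, and C Total rows (they are not names SAS prints), and n = 29 is the number of observations.

```latex
% Assumed shorthand (not SAS names): SSM, SSE, SST = Model, Error, C Total sums of squares;
% MSM, MSE = the corresponding mean squares; n = 29 observations.
\begin{align*}
R^2 &= \frac{SSM}{SST} = \frac{12390694.827}{16213332.69} \approx 0.7642,
  \qquad r^2 = (0.87420)^2 \approx 0.7642 \\[4pt]
R^2_{\mathrm{adj}} &= 1 - \frac{SSE}{SST}\cdot\frac{n-1}{n-2}
  = 1 - \frac{3822637.8631}{16213332.69}\cdot\frac{28}{27} \approx 0.7555 \\[4pt]
F &= \frac{MSM}{MSE} = \frac{12390694.827}{141579.18012} \approx 87.518,
  \qquad t_{\mathrm{PCGDP}}^2 = (9.355)^2 \approx 87.52 \\[4pt]
\text{Root MSE} &= \sqrt{MSE} = \sqrt{141579.18012} \approx 376.27 \\[4pt]
\text{C.V.} &= 100 \times \frac{\text{Root MSE}}{\text{Dep Mean}}
  = 100 \times \frac{376.27009}{1508.89655} \approx 24.94
\end{align*}
```

With a single explanatory variable, R-square equals the squared Pearson correlation and F equals the squared t statistic for the slope, so the PROC CORR and PROC REG results agree. The fitted line also passes through the point of means: 1508.90 - 0.096818 x 20395 is about -465.7, which is consistent with a negative intercept of magnitude about 465.7 (the small discrepancy comes from the rounded mean of pcgdp).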